Columbo - A Computer Forensic Analysis Tool Used To Simplify And Identify Specific Patterns In Compromised Datasets
Columbo is a computer forensic analysis tool used to simplify and identify specific patterns in compromised datasets. It breaks data down into small sections and uses pattern recognition and machine learning models to identify adversaries' behaviour and their possible locations on compromised Windows platforms, presenting its findings as suggestions. Columbo currently runs on the Windows platform.
Columbo depends on Volatility 3, autorunsc.exe and sigcheck.exe to extract data, so users must download these tools and place them under the \Columbo\bin folder. Please make sure you read and understand the license section (or License.txt file) before you download anything. The output generated by these tools is automatically piped to Columbo's main engine, which breaks it down into small sections, pre-processes it and applies machine learning models to classify the location of the compromised system, executable files and other behaviours.
Get started with Columbo
Videos
- Before you start Columbo Watch
- Memory forensics using Columbo Memory-forensics
Installation and Configuration
Executable (Binary)
- Download and install Python 3.7 or 3.8 (not tested with 3.9). Make sure you add python.exe to the PATH during installation.
- Download the latest binary release of Columbo from Releases.
- Download each of the following and place them under \Columbo\bin:
  - Volatility 3 source code. Columbo does not support Volatility 2. Make sure you also download the symbol table packs for Windows, unzip them and put them under \Columbo\bin\volatility3-master\volatility\symbols.
  - Both autorunsc.exe and sigcheck.exe.
NB: To avoid errors, the directory structure must be \Columbo\bin\volatility3-master, \Columbo\bin\autorunsc.exe and \Columbo\bin\sigcheck.exe.
Finally, double-click "main.exe" under \Columbo.
Source Code
- Download and install Python 3.7 or 3.8 (not tested with 3.9). Make sure you add python.exe to the PATH during installation.
- Download the latest release of the Columbo source code.
- Double-click install-prerequisites.bat to install all the required packages.
- Download each of the following and place them under \Columbo\bin:
  - Volatility 3 source code. Columbo does not support Volatility 2. Make sure you also download the symbol table packs for Windows, unzip them and put them under \Columbo\bin\volatility3-master\volatility\symbols.
  - Both autorunsc.exe and sigcheck.exe.
NB: To avoid errors, the directory structure must be \Columbo\bin\volatility3-master, \Columbo\bin\autorunsc.exe and \Columbo\bin\sigcheck.exe.
Finally, open cmd and run python.exe \Columbo\main.py
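Because a misplaced dependency is the error the NB warns about, it can help to check the layout before launching. The following is a minimal sketch, not part of Columbo itself: the function name `missing_dependencies` is hypothetical, and the paths are taken directly from the directory structure described above.

```python
from pathlib import Path

# Paths relative to the Columbo folder, as listed in the installation steps.
REQUIRED = [
    Path("bin") / "volatility3-master",
    Path("bin") / "autorunsc.exe",
    Path("bin") / "sigcheck.exe",
    Path("bin") / "volatility3-master" / "volatility" / "symbols",
]


def missing_dependencies(columbo_root):
    """Return the required relative paths that do not exist under columbo_root.

    Hypothetical helper for illustration; Columbo does not ship this function.
    """
    root = Path(columbo_root)
    return [rel for rel in REQUIRED if not (root / rel).exists()]
```

Calling `missing_dependencies(r"C:\Columbo")` returns an empty list when everything is in place; otherwise it returns the entries that still need to be downloaded or moved.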
Columbo and Machine Learning
Columbo uses data preprocessing to organise the data and machine learning models to identify suspicious behaviours. Its outputs are either 1 (suspicious) or 0 (genuine), offered as suggestions purely to assist digital forensic examiners in their decision making. We have trained the models with different examples to maximise accuracy and used different approaches to minimise false positives. However, false positives (false detections) still occur, and we are therefore committed to updating the models periodically.
False Positive
Reducing false positives (false detections) is not easy, especially with machine learning. The output of a machine learning model may be a false positive, depending on the quality of the data used to train it. To assist forensic examiners in their investigation, Columbo therefore generates a percentage score for each 1 (suspicious) and 0 (genuine) result. This helps examiners pick and choose the paths, commands or processes that Columbo classifies as suspicious.
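To illustrate how a label plus a percentage score can guide triage, here is a minimal sketch. It assumes each item arrives with a model probability of being suspicious; the function names, the 0.5 threshold and the way the percentage is derived are all assumptions for illustration, not Columbo's actual scoring method.

```python
def to_suggestion(path, p_suspicious, threshold=0.5):
    """Turn a probability into a 1/0 label with a percentage score.

    Hypothetical helper: the threshold and score formula are assumptions.
    """
    label = 1 if p_suspicious >= threshold else 0
    # Report the model's confidence in the label it actually chose.
    confidence = p_suspicious if label == 1 else 1.0 - p_suspicious
    return {"path": path, "label": label, "score_percent": round(confidence * 100, 1)}


def triage(scored_items, threshold=0.5):
    """Return only the suspicious suggestions, highest score first."""
    suggestions = [to_suggestion(p, s, threshold) for p, s in scored_items]
    suspicious = [s for s in suggestions if s["label"] == 1]
    return sorted(suspicious, key=lambda s: s["score_percent"], reverse=True)
```

Sorting by score lets an examiner start with the entries the model is most sure about and treat low-scoring 1s as the likeliest false positives.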
Options to Select
Option 2
Live analysis: files and process traceability. This option analyses running Windows processes to identify any malicious activity. Columbo uses autorunsc.exe to extract the data from the machine; the output is piped to machine learning models and pattern recognition engines to classify suspicious activities. The results are then saved under \Columbo\ML\Step-2-results as Excel files for further analysis. Users are also given the option to examine running processes. The result contains information such as process traceability, the commands associated with each process (if applicable) and whether or not the process is responsible for executing new processes.
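The extraction step of this option can be sketched as follows. This is an assumption-laden illustration rather than Columbo's code: the flags are documented autorunsc switches (`-a *` for all entries, `-c` for CSV, `-h` for hashes, `-s` for signature verification), but whether Columbo uses this exact combination is not stated here. The helper names and the "Image Path" column filter are likewise hypothetical.

```python
import csv
import io
import subprocess


def autorunsc_command(exe_path):
    """Build an autorunsc command line that emits CSV (flag choice is an assumption)."""
    return [exe_path, "-accepteula", "-nobanner", "-a", "*", "-c", "-h", "-s"]


def decode_output(raw):
    """Decode tool output; a UTF-16 BOM is handled in case the tool emits one."""
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16")
    return raw.decode("utf-8-sig", errors="replace")


def parse_autoruns_csv(text):
    """Parse CSV text into dict rows, keeping rows that name an image path.

    The "Image Path" column name is an assumption about the CSV header.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if (row.get("Image Path") or "").strip()]


def run_autorunsc(exe_path):
    """Run autorunsc and return parsed rows, ready for a downstream classifier."""
    completed = subprocess.run(autorunsc_command(exe_path), capture_output=True, check=True)
    return parse_autoruns_csv(decode_output(completed.stdout))
```

Keeping command construction, decoding and parsing in separate functions means the parsing can be exercised on saved CSV output without running the tool on a live machine.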
Option 3
Scan and analyse a hard disk image file (.vhdx): This option takes the path of a mounted Windows hard disk image. It uses sigcheck.exe to extract the data from the file system. The results are then piped into machine learning models to classify suspicious activities, and the output is saved under \Columbo\ML\Step-3-results as Excel files.
Option 4
Memory forensics: In this option, Columbo takes the path of the memory image and presents the following options for the user to select.
- Memory Information: Volatility 3 is used to extract information about the image.
- Processes Scan: Volatility 3 is used to extract the process, DLL and handle information of each process. Columbo then uses grouping and clustering mechanisms to group processes according to their mother processes. This output is later used by Process Traceability under the Anomaly Detection option.
- Process Tree: Volatility 3 is used to extract the process tree.
- Anomaly Detection and Process Traceability: Volatility 3 is used to extract a list of processes flagged by anomaly detection. Columbo also offers an option called Process Traceability, which examines each process separately and collectively produces the following information:
  - Paths of the executable files and associated commands.
  - The legitimacy of the identified processes, as determined by machine learning models.
  - A trace of each process all the way back to its root process (complete path), with execution dates and times.
  - Whether the process is responsible for executing other processes, i.e. whether it is the mother process of new processes.
  - The handle and DLL information of each process, presented with the rest of the information.
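The grouping and traceability ideas above can be illustrated with a short sketch. It assumes the process list has already been extracted into a dictionary keyed by PID, with each entry holding a parent PID under `"ppid"`; this data shape and the function names are hypothetical and do not reflect Columbo's internal representation.

```python
from collections import defaultdict


def group_by_parent(processes):
    """Map each parent PID to the list of its child PIDs."""
    children = defaultdict(list)
    for pid, info in processes.items():
        children[info["ppid"]].append(pid)
    return dict(children)


def trace_to_root(processes, pid):
    """Return the chain of PIDs from the earliest known ancestor down to pid.

    The walk stops when the parent is absent from the listing (for example,
    it has already exited) or when a PID repeats, which guards against loops.
    """
    chain, seen = [], set()
    while pid in processes and pid not in seen:
        seen.add(pid)
        chain.append(pid)
        pid = processes[pid]["ppid"]
    return list(reversed(chain))


def is_parent(processes, pid):
    """Report whether pid has spawned at least one listed process."""
    return pid in group_by_parent(processes)
```

For a hypothetical listing in which cmd.exe (PID 2400) was started by svchost.exe (PID 1200), itself started by services.exe (PID 612), `trace_to_root(processes, 2400)` yields `[612, 1200, 2400]`. PIDs can be reused, so a real implementation would likely also compare process creation times before trusting a parent link.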