Improved Artifact Scanner, Malware Analyst’s Cookbook

I like to take apart malware. One of my “go to” references for analyzing malware is the Malware Analyst’s Cookbook, written by Michael Hale Ligh, Steven Adair, Blake Hartstein, and Matthew Richard. One thing I found useful in this book was the Artifact Scanner. The Artifact Scanner lets you build your own database of malware that you can use in your investigations. I am going to explain how I modified the code and how I automated the updates to the Artifact Scanner database.

If you have not read the book, it is organized by “recipes”. Each recipe number gives the chapter and the subsection within that chapter. For instance, recipe 4-12 (Artifact Scanner) is in chapter 4, subsection 12. The Artifact Scanner is what I am going to talk about today. It is a Python program that downloads malware reports from ThreatExpert and inserts the information from each report into a database. The program gets the MD5 of the malware file, registry keys that have changed, files that have been modified, and mutexes related to the malware. This is very useful for creating your own database of malware. The authors also explain how to turn the Python script into an executable using py2exe. You can then use this executable on a machine that does not have Python installed to scan for matches to the artifacts in your database.
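To make the idea of an artifact database a little more concrete, here is a minimal sketch of how the four kinds of artifacts could be stored and how a file's MD5 could be checked against them. This is not the book's code. The table layout and the function names (create_db, add_artifact, md5_of_file, is_known_sample) are my own assumptions, chosen only for illustration.

```python
import hashlib
import sqlite3


def create_db(conn):
    # Hypothetical schema (not the book's): one row per artifact,
    # tagged with the MD5 of the sample it came from.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS artifacts ("
        " sample_md5 TEXT NOT NULL,"
        " kind TEXT NOT NULL,"      # e.g. 'registry', 'file', 'mutex'
        " value TEXT NOT NULL)"
    )


def add_artifact(conn, sample_md5, kind, value):
    # Store hashes in lowercase so lookups are case-insensitive.
    conn.execute(
        "INSERT INTO artifacts (sample_md5, kind, value) VALUES (?, ?, ?)",
        (sample_md5.lower(), kind, value),
    )


def md5_of_file(path):
    # Hash the file in chunks so large files do not need to fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_sample(conn, md5_hex):
    # True if any stored artifact belongs to a sample with this MD5.
    row = conn.execute(
        "SELECT 1 FROM artifacts WHERE sample_md5 = ? LIMIT 1",
        (md5_hex.lower(),),
    ).fetchone()
    return row is not None
```

With a connection from sqlite3.connect(), you would call create_db once, add rows as reports are parsed, and then call is_known_sample(conn, md5_of_file(path)) for each file you want to check.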

The first thing I noticed, and the authors pointed this out in the book, was that the Artifact Scanner is updated from ThreatExpert’s main page. This can lead to many false positive artifacts in the database. The issue is that not all files uploaded to ThreatExpert are malicious in nature. Someone might be checking if a legitimate file is infected, or trying to find out what an unknown file does.

Fortunately, ThreatExpert organizes uploaded files into two categories, “known bad” and “suspicious”. I was more interested in a database with “known bad” malware. I modified the Artifact Scanner to only pull reports from the “Known Bad” page. Below is the change that I made to accomplish this.

Artifact Scanner original file

First, open the script in your favorite text editor; I prefer gedit. Turn on line numbers in your editor so you can find the line to modify easily. In the bulkimport section, there is a line that has the GET request for the report page on ThreatExpert. See the sample below. In order to get only “known bad” samples, line 158 needs to be changed to look like the sample below. If you add %d&sl=1' after the equal sign, then the database will only be updated from the “Known Bad” page of ThreatExpert.

Artifact Scanner modified file
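The book's actual line is not reproduced here. As a rough illustration of what the change does, the sketch below builds the request path for a given page number with and without the extra sl=1 parameter. The path prefix "/reports.aspx?page=" and the names ALL_REPORTS, KNOWN_BAD, and report_path are my assumptions and may not match the script exactly; only the %d&sl=1 ending comes from the change described above.

```python
# Assumed path templates; check them against line 158 of your copy of the script.
ALL_REPORTS = "/reports.aspx?page=%d"        # every submitted file
KNOWN_BAD = "/reports.aspx?page=%d&sl=1"     # only the "Known Bad" listing


def report_path(page, known_bad_only=True):
    # %d is replaced by the page number passed in from the bulk import loop.
    template = KNOWN_BAD if known_bad_only else ALL_REPORTS
    return template % page


print(report_path(3))          # /reports.aspx?page=3&sl=1
print(report_path(3, False))   # /reports.aspx?page=3
```

The only difference between the two templates is the trailing &sl=1, which is why a one-line edit is enough.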


That is all that needs to be modified in the file. Be sure to save your changes. The next step is to automate the update process. I have my Artifact Scanner loaded on Ubuntu, so these directions are specific to my situation, but they should work on most Linux systems. If you want to port it to Windows, you will have to create a scheduled task to run the update.

Open your text editor and add the following lines:

/home/john/ --bulk=1
/home/john/ --bulk=2
/home/john/ --bulk=3
/home/john/ --bulk=4
/home/john/ --bulk=5
/home/john/ --bulk=6
/home/john/ --bulk=7
/home/john/ --bulk=8
/home/john/ --bulk=9
/home/john/ --bulk=10
echo "Database update successful: $(date)" >> /tmp/artifactdb.log

Be sure to change the path to where you have your files located. If you don’t know, type pwd at the command prompt. That will tell you what directory you are in.

Save the file as This will make a simple bash script to run the commands to update your database. It will also log the update to the artifactdb.log file. (The last line is optional and can be deleted.) This script needs to be in the same folder as the other files for the Artifact Scanner. I am sure there are more elegant ways to accomplish the same task, but this one works for me.
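One possible alternative to ten nearly identical lines is a short Python wrapper that loops over the page numbers. This is only a sketch under a few assumptions: the scanner path below is a hypothetical placeholder that you must replace with your own, the option is spelled --bulk=N, and the script is executable on its own. Unlike the bash version, this one stops at the first failed page and only logs success if every page ran.

```python
import subprocess
from datetime import datetime

# Hypothetical placeholder; replace with the real path to your scanner script.
SCANNER = "/path/to/your/scanner_script"
LOG_PATH = "/tmp/artifactdb.log"


def build_commands(scanner, first_page=1, last_page=10):
    # One command per report page, mirroring the ten lines of the bash script.
    return [[scanner, "--bulk=%d" % page]
            for page in range(first_page, last_page + 1)]


def run_update(scanner=SCANNER, log_path=LOG_PATH, runner=subprocess.run):
    # check=True raises CalledProcessError if a page import exits non-zero,
    # so the success line is only written when every page completed.
    for command in build_commands(scanner):
        runner(command, check=True)
    with open(log_path, "a") as log:
        log.write("Database update successful: %s\n" % datetime.now())
```

Calling run_update() once does the same work as the bash script. The runner parameter exists only so the loop can be tried out without actually launching the scanner.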

You will have to make the script executable by running the following command in the directory where you saved

chmod 755

You can then run the script by typing the following in the directory where the script is located.


After the update script is created and it runs correctly from the command line, it is a simple matter to create a cron job to run this script every 12 hours. I settled on 12 hours because if I updated the database only once a day I would miss some reports. To set up the cron job to run at 12-hour intervals, follow these steps.

Open a terminal and type the following:

crontab -e

If you get asked which editor to use, choose nano, which is the default on Ubuntu. If you are more familiar with another editor, choose that one. I added the below line to my crontab file. You will have to edit the path to reflect where your file is located. The Ubuntu documentation for adding cron jobs is here.

0 6,18 * * * /home/john/

Save your file and exit the editor. The script will now run every day at 6 AM and 6 PM your local time. Every time it runs, it will add an entry to the artifactdb.log file located in the /tmp folder.
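If the cron syntax is unfamiliar, the small sketch below shows what the schedule "0 6,18 * * *" works out to: minute 0 of hours 6 and 18, every day. It is not a cron parser. The function name next_runs is my own, and it hard-codes only this one schedule.

```python
from datetime import datetime, timedelta

RUN_HOURS = (6, 18)  # the hour field of the crontab entry "0 6,18 * * *"


def next_runs(start, count=3):
    # Step forward one hour at a time from the top of the next hour and
    # keep the times whose hour matches the schedule.
    candidate = start.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    runs = []
    while len(runs) < count:
        if candidate.hour in RUN_HOURS:
            runs.append(candidate)
        candidate += timedelta(hours=1)
    return runs


for run in next_runs(datetime(2024, 1, 1, 7, 30)):
    print(run)
# 2024-01-01 18:00:00
# 2024-01-02 06:00:00
# 2024-01-02 18:00:00
```

Each run is 12 hours after the previous one, which matches the interval I wanted.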

Now you have a script to get “Known Bad” malware artifacts into a database for your use. The authors of this script did a good job in making it customizable. You can add other sites or pages as you see fit. If you have any thoughts or comments, please let me know.

