MASTIFF Analysis of APT1

While there are many tools and resources for performing automated dynamic analysis on malware samples, there are few that focus on automating static analysis. At Shmoocon this year we were pleased to find a project focused specifically on this, called MASTIFF.

Created by Tyler Hudak (@SecShoggoth), MASTIFF aims to be the framework for automating static analysis. To learn more about this project, please visit the SourceForge page and their blog. Over at TekDefense I demoed using MASTIFF with Maltrieve in a video. If interested, please check out TekTip ep 23.

Of course the largest news story of last week was Mandiant’s report on APT1. As Mandiant released IOCs along with this report, users quickly discovered that many of the hashes mentioned in the report were samples available on VirusShare. VirusShare was nice enough to put out a torrent that has 281 samples matching APT1 hashes.

A better use case could not present itself. Let’s show how efficient MASTIFF is at performing static analysis on a large number of samples. If you would like to play along at home, download the samples from VirusShare. If you don’t have an account, email VirusShare to request one.

To appreciate what MASTIFF does for us, you must first understand the manual process of basic static analysis. An analyst will typically hash a sample, then run a file identification tool against it, and then run tools specific to that file type. As you can probably imagine, this takes time. For 200 samples this could take hours, perhaps days, if done manually.
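The first two manual steps above (hash, then identify) can be sketched in a few lines of Python. This is only an illustration of the manual process, not how MASTIFF works internally: the function names `hash_sample`, `identify_type`, and `triage` are made up for this example, and the magic-byte table covers just a handful of formats.

```python
import hashlib

# Small, illustrative magic-byte table; real identification tools know far more formats.
MAGIC = [
    (b"MZ", "EXE"),
    (b"PK\x03\x04", "ZIP"),
    (b"%PDF", "PDF"),
]

def hash_sample(data):
    """Return the MD5 and SHA-256 hex digests of the raw sample bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

def identify_type(data):
    """Guess the file type from the leading bytes; 'unknown' if nothing matches."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return "unknown"

def triage(path):
    """Hash a file and identify its type, the first two manual steps."""
    with open(path, "rb") as handle:
        data = handle.read()
    result = hash_sample(data)
    result["type"] = identify_type(data)
    return result
```

Calling `triage()` on a sample path returns a dictionary with the hashes and the guessed type. The third step, running type-specific tools, is where most of the manual time goes.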

I am not going to get into how to install and configure MASTIFF as there are other documents and articles that cover that. I will instead show you how to run MASTIFF against the APT1 samples and review the results. With all the APT1 samples downloaded and extracted to a directory (I used /opt/malware/), you can simply run:

sudo filename

This will run MASTIFF against a single sample. MASTIFF currently (v0.5) does not natively run against more than one sample at a time. So while you could run MASTIFF individually for each of the almost 300 samples, I wouldn’t recommend it. What I did was create a quick Python script that runs against every file in a specific directory, in my case /opt/malware/.

Here is the script:

import os
# MASTIFF Autorun
# @TekDefense
# Quick script to autorun samples from maltrieve to MASTIFF
malwarePath = '/opt/malware/'
for r, d, f in os.walk(malwarePath):
   for files in f:
      # Join with the directory being walked so files in subdirectories resolve correctly
      malware = os.path.join(r, files)
      print(malware)
      os.system('' + ' ' + malware)

In order for this to run, you must have this script in the same directory as MASTIFF’s main script. It will then run MASTIFF against all of the files in that directory. For me it took MASTIFF 4 minutes and 50 seconds to churn through all 281 samples.

Pro-Tip: In your MASTIFF config, set the zip password to infected to have MASTIFF auto-extract most shared malware samples. Also, when working on a project like this, give yourself a work log directory separate from the one you normally use, to keep yourself more organized.

With all of these samples run through MASTIFF, get yourself acquainted with them by looking through the SQLite database (mastiff.db by default) with your favorite SQLite manager. I prefer to use the Firefox plugin “SQLite Manager.” In this database there are two main tables, “files” and “mastiff”. The files table has information about the path, size, and frequency of the samples. The mastiff table shows you the hashes, file type, and fuzzy hash. Looking through the file types, you can start to get a feel for how this group (APT1) packages its malware.

Running the following SQL Query will show a count of the file types used:

SELECT type, count(*) AS total
FROM mastiff
GROUP BY type
ORDER BY total DESC

The results of this query show that the majority of the samples are simply standard PE32 executables, with a small number being archives or containing archives.

"['Generic', 'EXE']","274"
"['Generic', 'EXE', 'ZIP']","7"
"['Generic', 'ZIP']","1"
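If you would rather not use a GUI manager, the same count can be pulled with Python’s standard sqlite3 module. This is a minimal sketch that assumes the `mastiff` table and `type` column described above; the function name `count_file_types` is made up for this example. The query groups by type so that each file type gets its own row.

```python
import sqlite3

def count_file_types(conn):
    """Return (type, total) rows, most common file type first."""
    query = (
        "SELECT type, count(*) AS total "
        "FROM mastiff GROUP BY type ORDER BY total DESC"
    )
    return conn.execute(query).fetchall()
```

To use it, open the database with `sqlite3.connect()`, passing the path to your own mastiff.db, and hand the connection to `count_file_types()`.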

With a better understanding of what we are looking at, let’s take a look at the analysis results for one of the samples. For those still playing at home, I will be looking at eef80511aa490b2168ed4c9fa5eafef0. The results directory for this sample contains the following files:

  • fuzzy.txt: This will show us the fuzzy hash of the sample as well as tell us how close of a match it is to other samples we have scanned.
  • mastiff.log: This will show a log of mastiff running. Any errors that occurred during the scan will show here.
  • mastiff-run.config: This is a copy of the config file from mastiff that it used to scan the sample.
  • peinfo-full.txt: Will show the full PE details.
  • peinfo-quick.txt: Will show a condensed version of the PE details.
  • strings.txt: This is a dump of the strings command against the file.
  • VirusShare_eef80511aa490b2168ed4c9fa5eafef0.VIR: This is a copy of the actual sample that was scanned.

If this file were a PDF, there would be a different set of artifacts to work with, as MASTIFF would have analyzed the sample with different tools. For other files you may sometimes see a sig.txt, which is a copy of the certificate details if any were found. You may also see a resources folder if MASTIFF pulled out any resources such as icons or cursors.

Running cat against fuzzy.txt shows us that this particular sample is an 85% match for another APT1 sample we scanned fb671e6de6e301c892d2fdaa58f9cd9a:

/opt/work/apt1/log/eef80511aa490b2168ed4c9fa5eafef0$ cat fuzzy.txt
Fuzzy Hash:
This file is similar to the following files:
MD5                              Percent
fb671e6de6e301c892d2fdaa58f9cd9a 85
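With a few hundred result directories, reading each fuzzy.txt by hand gets tedious. Here is a rough sketch of pulling the matches out programmatically. It assumes every fuzzy.txt follows the layout shown above (an "MD5 Percent" header followed by one match per line); the function name `parse_fuzzy_matches` and the `threshold` parameter are made up for this example.

```python
def parse_fuzzy_matches(text, threshold=0):
    """Return (md5, percent) pairs listed under the 'MD5 Percent' header."""
    matches = []
    in_table = False
    for line in text.splitlines():
        parts = line.split()
        if parts[:2] == ["MD5", "Percent"]:
            # Everything after this header line is treated as a match row.
            in_table = True
            continue
        if in_table and len(parts) == 2 and parts[1].isdigit():
            percent = int(parts[1])
            if percent >= threshold:
                matches.append((parts[0], percent))
    return matches
```

Passing a threshold such as 80 keeps only the closer matches, which is handy for grouping samples into likely families.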

A cat on peinfo-quick.txt shows us a lot about the DLLs that this malware leverages and the particular functions it is likely to call. The first few are very telling as to what we can expect this file to do:

KERNEL32.dll   Sleep               0x406010
KERNEL32.dll   GetTempPathA        0x406014
KERNEL32.dll   GetTempFileNameA    0x406018
KERNEL32.dll   CreateProcessA      0x40601c
ADVAPI32.dll   RegOpenKeyExA       0x406000
ADVAPI32.dll   RegSetValueExA      0x406004
ADVAPI32.dll   RegCloseKey         0x406008
WININET.dll    InternetOpenA       0x40607c
WININET.dll    InternetOpenUrlA    0x406080
WININET.dll    InternetReadFile    0x406084
WININET.dll    InternetCloseHandle 0x406088
urlmon.dll     URLDownloadToFileA  0x406090

This is great info to have when we want to set up an environment for dynamic analysis later. For instance, we know that the sample will attempt to create files, create processes, manipulate the registry, and try to connect to a URL.
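That mapping from imports to likely behavior can be roughed out in code. The sketch below assumes import lines in the three-column form shown above (DLL, function, address). The prefix table and the name `classify_imports` are my own illustration, not part of MASTIFF, and an import only hints at behavior; it does not prove the function is ever called.

```python
# Hypothetical mapping from API-name prefixes to behaviors worth watching for.
BEHAVIORS = {
    "GetTemp": "file creation in temp directory",
    "CreateProcess": "process creation",
    "Reg": "registry access",
    "Internet": "network access",
    "URLDownload": "network download",
}

def classify_imports(lines):
    """Map each behavior to the imported functions that suggest it."""
    found = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        function = parts[1]  # second column is the imported function name
        for prefix, behavior in BEHAVIORS.items():
            if function.startswith(prefix):
                found.setdefault(behavior, []).append(function)
    return found
```

Feeding it the lines from peinfo-quick.txt gives a quick checklist of what to monitor during dynamic analysis.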

Reviewing strings.txt we are able to pick out some more clues as to what this sample will do. Some examples are:

5728 A http://www.rbaparts(.)com/images/li.gif : This will be a URL of interest
56dc A IMSCMig.exe : Perhaps this is the process it will create.
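Picking these out by eye works for one sample, but a couple of regular expressions can do a first pass across many. This is a simple sketch, assuming strings.txt contains plain text lines like the ones above; the patterns and the name `find_indicators` are illustrative and will miss plenty of cases.

```python
import re

# URLs, including the defanged "(.)" form used in this post.
URL_RE = re.compile(r"https?://[^\s]+")
# Bare executable names such as IMSCMig.exe.
EXE_RE = re.compile(r"\b[\w.-]+\.exe\b", re.IGNORECASE)

def find_indicators(text):
    """Return URLs and executable names found in strings output."""
    return {
        "urls": URL_RE.findall(text),
        "executables": EXE_RE.findall(text),
    }
```

Run it over the contents of each strings.txt and you have a starting list of network and process indicators to watch for in dynamic analysis.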

While basic static analysis techniques are not going to give us the full story, we are able to learn much that can be applied to our basic dynamic analysis as well as advanced static and dynamic analysis.

For your convenience I have made the APT1 MASTIFF results available over at

