Wednesday, October 8, 2014

Match The Hash

                          // Match The Hash//


        Matches The Hash values to files on a machine.

    ~ What is Match The Hash?

        Python script that can help find malware on your machine.
        Sample data is created as: <name>,<size>,<MD5>
        That data is then used to search other machines.

        The search runs much faster if both file size and hash value are included.
        Name is ignored. File size is matched first,
        then only the files with a matching size are hashed and compared to the hash value.
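
        The size-first idea can be sketched in a few lines of Python 3. This is only an illustration and not the actual Match The Hash source; the function names find_matches and md5_of_file are made up for this example.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(root, target_md5, target_size=None):
    """Walk root and return paths whose MD5 equals target_md5.

    If target_size is given, files of any other size are skipped
    before hashing, which is where the speed-up comes from.
    """
    target_md5 = target_md5.lower()
    matches = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                if target_size is not None and os.path.getsize(path) != target_size:
                    continue  # cheap size check failed, no need to hash
                if md5_of_file(path) == target_md5:
                    matches.append(path)
            except OSError:
                continue  # unreadable file, skip it
    return matches
```

        Checking a file size is a quick metadata lookup, while hashing reads every byte of the file, so most files are ruled out without being opened.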

    ~ Examples

        Search your whole hard drive with hash only.
        -md5 d378bffb70923139d6a4f546864aa61c

        It will go much faster if you include file size.
        -size 179712 -md5 d378bffb70923139d6a4f546864aa61c

        Name can be included but it is ignored in the search.
        -name malware1.exe -size 179712 -md5 d378bffb70923139d6a4f546864aa61c

        Create a large sample of data from all files in a folder (-hf = hash folder).
        -hf <Path_to_folder> -o samples.csv
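
        As a rough idea of what hashing a folder into sample data involves, here is a Python 3 sketch that writes one <name>,<size>,<MD5> row per file. The function name hash_folder is hypothetical, and walking subfolders is an assumption; the real tool may behave differently.

```python
import csv
import hashlib
import os


def hash_folder(folder, out_csv):
    """Write one name,size,md5 row per file under folder; return the row count.

    Keep out_csv outside of folder so the output file is not hashed itself.
    """
    rows = 0
    with open(out_csv, "w", newline="") as out:
        writer = csv.writer(out)
        for dirpath, _dirs, files in os.walk(folder):
            for name in sorted(files):
                path = os.path.join(dirpath, name)
                try:
                    size = os.path.getsize(path)
                    # Reads the whole file at once, fine for small samples.
                    with open(path, "rb") as handle:
                        md5 = hashlib.md5(handle.read()).hexdigest()
                except OSError:
                    continue  # unreadable file, skip it
                writer.writerow([name, size, md5])
                rows += 1
    return rows
```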

        Then use this file as your input with -i.
        -i samples.csv -o results.csv

    ~ Notes

        Compiled using pyinstaller.

        If no output file is assigned using -o,
        output file Match_The_Hash.csv will be created in the current working directory.

        Python script = Works on Windows, Linux, Mac
        Match_The_Hash32.exe = 32bit Windows
        Match_The_Hash64.exe = 64bit Windows

        The example is notepad.exe, to test that it's working on a Windows 7 machine.
        -size 179712 -md5 d378bffb70923139d6a4f546864aa61c

    ~ Usage

usage: [-h] [-name FILENAME] [-size FILESIZE]
                           [-md5 MD5HASH] [-sha1 SHA1HASH] [-hf HASH_FOLDER]
                           [-hhd] [-hhdsha1] [-p PATH] [-i INFILE]
                           [-o OUTFILE]

Matches The Hash values to files on a machine.

optional arguments:
 -h, --help       show this help message and exit
 -name FILENAME   Name of single file to search
 -size FILESIZE   Size of file to search. Faster than hash only search.
 -md5 MD5HASH     MD5 hash value to search.
 -sha1 SHA1HASH   SHA1 hash value to search.
 -hf HASH_FOLDER  Hash contents of a folder.
 -hhd             Hash contents of an entire hard drive: MD5.
 -hhdsha1         Hash contents of an entire hard drive: SHA1.
 -p PATH          Path to start in, else search whole disk.
 -i INFILE        CSV or text file in this format 'malwarename1.exe,size,MD5' or SHA1 hash value.'
 -o OUTFILE       Output file.

MatchTheHash goes much faster if file size is included.
 File sizes are matched first and then hash values.

 -size 179712 -md5 d378bffb70923139d6a4f546864aa61c
 -hf folder_path -o samples.csv (Hash folder that contains malware samples.)
  -i samples.csv -o results.csv (Sample data is input to search for matches.)



Monday, June 30, 2014

That Bytes!

Large amounts of code can do a lot of things.
Small amounts of code can do less.
Common sense, right?

Take for example the size of a biological virus and computer virus.
The chart at the bottom of this wiki shows the sizes of biological viruses.

            1,759 bytes = Small Biological Virus
            9,000 bytes = Small Computer Virus
           23,000 bytes = Errors found in DNA of Lung Cancer
          226,000 bytes = Zeus Computer Virus
        2,470,000 bytes = Larger Biological Virus
       20,758,528 bytes = Chinese APT Computer Virus looked at earlier

    3,200,000,000 bytes = RAM size a 32-bit Operating System can use efficiently
    3,200,000,000 bytes = Human code (DNA is 3.2 billion base pairs, counted here as one byte each)

This 3.2 Gigs is kept in each cell of your 100 Trillion cell body.
I'm assuming only humans are reading this.
Cancer is basically damaged code.

A small virus might just be a downloader with the only purpose of downloading larger files.
A large virus could be filled with many different capabilities.

Capabilities include:
  • multiple exploits
  • multiple tools for stealing information
  • multiple ways of controlling the host
  • multiple ways of getting around defenses and evading detection
  • multiple ways of preventing its removal

When deciding to look across an entire enterprise by hash value such as MD5,
it would go much faster if you search by byte size first.

Then hash only the files that match the given file size.
This saves time by not hashing every file on the network.
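
To get a feel for how much hashing the size check avoids, here is a small Python 3 sketch that looks for many known samples at once. The names build_index and scan are made up for illustration, and the samples are assumed to arrive as (size, md5) pairs.

```python
import hashlib
import os
from collections import defaultdict


def build_index(samples):
    """Group known-bad MD5 values by file size: {size: {md5, ...}}."""
    index = defaultdict(set)
    for size, md5 in samples:
        index[int(size)].add(md5.lower())
    return index


def scan(root, index):
    """Return (hits, hashed, seen): matching paths, files hashed, files seen."""
    hits, hashed, seen = [], 0, 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            seen += 1
            try:
                wanted = index.get(os.path.getsize(path))
                if not wanted:
                    continue  # no sample has this size, so skip the hash
                with open(path, "rb") as handle:
                    digest = hashlib.md5(handle.read()).hexdigest()
            except OSError:
                continue  # unreadable file, skip it
            hashed += 1
            if digest in wanted:
                hits.append(path)
    return hits, hashed, seen
```

Comparing hashed to seen shows how many files were ruled out by size alone.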

It would be interesting if we could look across our entire body by byte size and then look for cancer in only the cells that have abnormalities.

We need a lot more people to be literate in computer and DNA code if we are to solve our most difficult problems.

Sunday, June 15, 2014

Mandiant APT1 Import Hash

Mandiant released an article on the importance of Import Hashing (Imphash).
The article listed hash samples reported as APT1.

Google has hits for these hashes on Malwr.
Using these URL hits, MalwareViz created the graphs below.

The graphs look similar, with only one callback.
All are currently detected by an AntiVirus.
Some show one dropped file.

Imphash: 2c26ec4a570a502ed3e8484295581989
Note: This file crashed during execution, so no callback.

Imphash: b722c33458882a1ab65a13e99efe357e

Imphash: 2d24325daea16e770eb82fa6774d70f1

Imphash: 0d72b49ed68430225595cc1efb43ced9

Imphash: 959711e93a68941639fd8b7fba3ca28f

Imphash: 4cec0085b43f40b4743dc218c585f2ec

Imphash: 3b10d6b16f135c366fc8e88cba49bc6c

Imphash: 4f0aca83dfe82b02bbecce448ce8be00

Imphash: ee22b62aa3a63b7c17316d219d555891

Imphash: a1a42f57ff30983efda08b68fedd3cfc

Imphash: 7276a74b59de5761801b35c672c9ccb4

Wednesday, June 4, 2014

Zeus and CryptoLocker creator on FBI Wanted List.

USA TODAY article.

FBI states Evgeniy Mikhailovich Bogachev, aka "Slavik", created Gameover Zeus and CryptoLocker.

Gameover Zeus -

Malware is commonly distributed through mass e-mailing targets. Someone in the organization will open the attachment or click the link and infect their machine with a virus.

This particular malware will watch and record every keystroke.  It will watch for and steal banking credentials and send that information to a remote server.

Here we see the remote servers in Blue.
The virus found on the computer is in Orange at the bottom.

Gameover Zeus Bogachev "Slavik" - Malware Visualizer

Zeus -

Some malware will try to hide its malicious Internet Traffic among regular looking traffic. Some will check to see if they have Internet access before unpacking and sending traffic to their real locations. This graph shows Internet Traffic to legitimate Google sites. There is also malicious Internet Traffic to an IP address and URL.

The ".tmp" file is usually a temporary holding place for the ".exe" file and is deleted afterward. A ".bat" file can be many things, but it is often included in malware that is coded to delete the original file after the original file has been renamed and copied to a hidden directory location.

Zeus Malware Visualizer

CryptoLocker -

Notice the large amount of Internet Traffic.
Most of it is no longer associated with an IP address, which is why it's pointing to an empty node.

The group at the bottom is still communicating with a Command and Control Server IP address.
Of that group, half have the word Sinkhole.

CryptoLocker Malware Visualizer

Monday, June 2, 2014

Who's on First? Understanding Bases.

There was a question about Base64, so let's talk about bases.
A "base" is how many "things" you have to communicate with.

In English you have 26 letters.
English is Base26, if you only use lower case "abcdefghijklmnopqrstuvwxyz".

If you include upper case "ABCDEFGHIJKLMNOPQRSTUVWXYZ" then you just added 26 more characters, which makes it Base52.

Many humans like English. (Base26)
Computers like to read Binary using 1 or 0. (Base2)

Some humans read Japanese.
It is said that you need to know at least 3000 Japanese characters (Kanji, Katakana, Hiragana) to read a Japanese newspaper.
That's Base3000 and that's not even all of the Japanese characters!

To communicate successfully, we have to change (convert, encode, translate, whatever) the message from one base to another base.

Sometimes the sender is hoping that someone who reads the message will NOT be able to understand it.

For example the APT Malware had the call back - ''
The creator could have easily left the callback in English.

But instead, it was converted into Base64 to hide from those who easily read English.
It becomes necessary for us to recognize the code we are seeing and convert it into something we are better at reading.

    Base2  (Binary)      = 1 or 0
    Base4  (DNA)         = ATCG
    Base10 (Decimal)     = 0123456789
    Base16 (Hexadecimal) = 0123456789abcdef
    Base64               = ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/ (plus = for padding)

Here we convert the word "MalwareViz" into other base options.

Base2 (Binary) = 1 or 0
    M        a        l        w        a        r        e        V        i        z
    01001101 01100001 01101100 01110111 01100001 01110010 01100101 01010110 01101001 01111010

Base4 (DNA)  = ATCG
    M                    a                     l                     w                     a                       r                            

    e                       V                      i                   z

Base16 (hex) = 0123456789abcdef
    M  a  l  w  a  r  e  V  i  z
    4d 61 6c 77 61 72 65 56 69 7a

Base64 = ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=
    TWFsd2FyZVZpeg==

Base Japanese

Using Python (2.7):
  Base16 (hex)
    >>> 'MalwareViz'.encode('hex')
    '4d616c7761726556697a'
    >>> '4d616c7761726556697a'.decode('hex')
    'MalwareViz'

  Base64
    >>> 'MalwareViz'.encode('base64')
    'TWFsd2FyZVZpeg==\n'
    >>> 'TWFsd2FyZVZpeg==\n'.decode('base64')
    'MalwareViz'
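
The string codecs above only exist in Python 2. A rough Python 3 version of the same conversions, using the standard binascii and base64 modules, might look like this. The Base4 digit order (A=0, T=1, C=2, G=3) is an assumption made for this sketch, and to_binary and to_dna are made-up helper names.

```python
import base64
import binascii

DNA = "ATCG"  # assumed digit order: A=0, T=1, C=2, G=3


def to_binary(data):
    """Base2: eight bits per byte, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in data)


def to_dna(data):
    """Base4: four letters per byte, two bits per letter."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(DNA[(byte >> shift) & 0b11])
    return "".join(letters)


word = b"MalwareViz"
print(to_binary(word))                            # Base2
print(to_dna(word))                               # Base4
print(binascii.hexlify(word).decode("ascii"))     # Base16
print(base64.b64encode(word).decode("ascii"))     # Base64

# And back again.
print(binascii.unhexlify("4d616c7761726556697a"))
print(base64.b64decode("TWFsd2FyZVZpeg=="))
```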

Monday, May 26, 2014

What's Under the Hood

[Graph: MalwareViz_0735b7781096c9de80ee1bd4619e5bbf. Start: FlashUpdate.exe -> VirusTotal (Alerts=31)]

This file looks interesting as somebody tagged it as Chinese APT.
The file did not run correctly as there was no network traffic and no created files.

The goal here is to find Network Traffic or Created Files.
Let's take a look "under the hood".

Start gathering information by just looking around.
Look at the "String" tab under "Static Analysis".

Doing searches on the strings for "http" or "connect" can help, but does not help for this malware.
Let's drag and drop it into Immunity Debugger and look for some more English.
This is not debugging; we are just using a Debugger to look around.

Chinese APT

"English-ish" strings that stick out are in the far right column.
These can be Google searched to find their meaning.

Here is the part where we sound like Shawn Spencer from "Psych".
We are not really sure what's going on, but we "see" things.

We see something about Registry Query Values, Draw Icon, string lengths.
Wait I'm sensing something...

The string that sticks out is not the String Length "lstrlen" but the fact that it contains a Base64 string.
What? Why?
How do we know it is Base64 and what is Base64?

It is the equal signs at the end that give it away.
The "=" or "==" is added to the end of a Base64 string when the number of bytes being encoded is not a multiple of 3, to pad the output out to a multiple of 4 characters.
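
A small Python 3 sketch shows the padding rule and one rough way to test whether a string could be Base64. The helper name looks_like_base64 is made up, and the check is only a heuristic: short ordinary words with a length that is a multiple of 4 will also pass, and Base64 strings whose input was a multiple of 3 bytes carry no "=" at all.

```python
import base64
import re

# Base64 alphabet, optionally followed by one or two "=" padding characters.
BASE64_RE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")


def looks_like_base64(text):
    """Heuristic: right alphabet, length a multiple of 4, and decodes cleanly."""
    if len(text) % 4 != 0 or not BASE64_RE.match(text):
        return False
    try:
        base64.b64decode(text, validate=True)
    except ValueError:
        return False
    return True


# 3 bytes need no padding, 2 bytes get "=", 1 byte gets "==".
for data in (b"Mal", b"Ma", b"M"):
    print(len(data), base64.b64encode(data).decode("ascii"))
```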

Decoding this string using Base64 and Python we see:
Python 2.7
>>> "d3d3Lmdvb2dsZS1ibG9nc3BvdC5jb206ODg4OA==".decode('base64')
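
On Python 3, where str.decode('base64') no longer exists, a rough equivalent uses the base64 module. The helper name decode_callback is made up, and splitting on the last ":" assumes the decoded text has the form host:port.

```python
import base64


def decode_callback(encoded):
    """Decode a Base64 string and split it into (host, port).

    Assumes the decoded text looks like "host:port".
    """
    decoded = base64.b64decode(encoded).decode("ascii")
    host, _, port = decoded.rpartition(":")
    return host, int(port)


host, port = decode_callback("d3d3Lmdvb2dsZS1ibG9nc3BvdC5jb206ODg4OA==")
print(host, port)
```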

This looks like Internet traffic!
But it looks kind of legitimate?
We recognize the words google and blogspot.

Just like biological viruses will try to trick your immune system into thinking they belong there,
computer viruses will also try to hide by looking legitimate.

Red flags:
    1) Port 8888.
            This is not a common Internet port.
            Why isn't it using port 80 (http) or port 443 (https)?
    2) Creation date of this URL is 10-jan-2014.
            Why so recent?
            The real Google Blogspot was created 22-jun-1999.
    3) Notice the Name Servers.
            An interesting side note is that Xiaozhai_Tiankeng is apparently the name of the world's deepest sinkhole.
            It's found in China.

This looks like at least one callback.
We should not assume there is only one but it gives us something to go on.
We can use this to look through our Internet traffic to see how many machines in our network have tried to go to this site.

What about Created files?
We will take a look at that in another blog.

The graph can be updated to look like this:

[Graph: MalwareViz_0735b7781096c9de80ee1bd4619e5bbf. Start: FlashUpdate.exe -> VirusTotal (Alerts=31) -> Internet Traffic]

Tuesday, May 20, 2014

Seeing Malware is believing Malware.

It's in everything.
It is the instructions of life.
It runs our bodies and our machines.

Yet most of us can't see it or read it.
Code is created to make our lives automated that we may be free.

There is also another code.
This code goes by many names.
This code is created to steal, destroy and cause pain.

It is malicious and it is everywhere.
Constantly testing our defenses to find where we are most vulnerable.

MalwareViz was created to see this code:
To visualize malicious software called "malware" to understand it.
To encourage all to open their eyes to see and their mind to read.