Static Malware Analysis — Part 4.1.1

Fingerprinting Malware: VirusTotal

CyberNotes
5 min read · Feb 5, 2025

Disclaimer: Before conducting any malware analysis, ensure you have set up a safe analysis environment.

What is Static Malware Analysis?

Static malware analysis is the process of analysing malware without executing it. It is usually the first step, giving you a basic understanding of what the malware does before you decide whether it is worth exploring further.

Some of the steps you will carry out include:

  1. Hashing/Fingerprinting the malware
  2. Identifying the target architecture/file type
  3. Extracting strings from the malware

For the purposes of this blog, I will dive into each process in as much detail as I can, drawing on the knowledge I’ve gained through my study and research.

Fingerprinting Malware

Fingerprints have long been used as a reliable method to uniquely identify individuals. Since no two people share the same fingerprint pattern, this distinct feature has been crucial in linking suspects to crimes and identifying otherwise unrecognizable individuals.

The same rationale applies to fingerprinting malware. A malware sample has unique characteristics derived from the contents of the file. Threat actors might change the name of the file, but as long as the contents are the same, the “malware fingerprint” remains the same.

P.S. Fingerprinting malware is also called hashing malware.
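To make the point about file names concrete, here is a minimal Python sketch. It writes some harmless stand-in bytes to a temporary file, renames the file, and hashes it before and after. The file names and the helper functions are made up for illustration; only the standard hashlib, os and tempfile modules are used.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path):
    """Return the SHA-256 hex digest of the file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def hash_before_and_after_rename(contents):
    """Write contents to a temp file, rename it, and return both digests."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "invoice.pdf.exe")    # hypothetical name
        renamed = os.path.join(workdir, "holiday_photos.exe")  # hypothetical name
        with open(original, "wb") as handle:
            handle.write(contents)
        before = sha256_of_file(original)
        os.rename(original, renamed)
        after = sha256_of_file(renamed)
    return before, after


before, after = hash_before_and_after_rename(b"harmless stand-in bytes, not real malware")
print(before)
print(after)
print("same fingerprint:", before == after)
```

The file name is never fed into the hash function, only the bytes inside the file, which is why the two digests match.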

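So, how is this done?

This process uses a hash function (algorithm) to generate a “fingerprint”, or hash value, from the contents of a file. For practical purposes the value is unique to that file, although older algorithms such as MD5 and SHA1 have known collision weaknesses, which is one reason SHA256 is generally preferred. The hash lets antivirus and detection systems quickly identify and classify malicious files.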

The most common hashing algorithms used include:

  • SHA256
  • MD5
  • SHA1

Now, let’s look at how you calculate the hash of a malware file. In the terminal, run:

sha256sum <insert file name here>

Say I have a malware sample named malware.exe:

# sha256 hash value
sha256sum malware.exe

# output: 5be1737591232907690af4c206a5f16fe8a5157d49494d1ee6c73c30df8ecdcc malware.exe

# sha1 hash value
sha1sum malware.exe

# output: cae3f91de0ea80af2090e5ec260901f1574fbcbb malware.exe

# md5 hash value
md5sum malware.exe

# output: 3367ba3dd3b74c902e3c47b0688b9750 malware.exe
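If you would rather script this than run three commands, the sketch below computes the same three digests in one pass using Python's standard hashlib module. The function name `fingerprint` and the chunk size are my own choices, not part of any tool mentioned here.

```python
import hashlib
import sys


def fingerprint(path, chunk_size=65536):
    """Compute MD5, SHA-1 and SHA-256 of a file in a single pass."""
    digests = {
        "md5": hashlib.md5(),
        "sha1": hashlib.sha1(),
        "sha256": hashlib.sha256(),
    }
    with open(path, "rb") as handle:
        # Read in chunks so large samples are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


if __name__ == "__main__":
    # Usage (assumed file name): python fingerprint.py malware.exe
    for sample_path in sys.argv[1:]:
        for name, value in fingerprint(sample_path).items():
            print(f"{name}: {value}  {sample_path}")
```

For the same file, the values should match what sha256sum, sha1sum and md5sum print.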

Each hashing algorithm gives us a hash value. Once you have this value, you can use a tool to check whether the file has been seen and detected before.

One such tool is VirusTotal.

When you search for a hash, VirusTotal looks it up in its database of previously submitted files and shows you whether any of the security tools it aggregates flagged that file as malicious.

Alternatively, you can upload the malware file itself or paste a URL you are suspicious about.

Once you have the hash value, paste it into the VirusTotal search box and press Enter. Let’s try that.

[Screenshot: the current VirusTotal interface]

For this file, it shows that there are no matches found:

[Screenshot: VirusTotal scan results]

Now, this can mean one of two things. Either this is new malware that does not yet exist in the VirusTotal database (this is especially true of zero-day attacks), or the malware authors have changed the file so that its hash no longer matches any known sample, for example by modifying a few bytes or by using polymorphic malware that rewrites itself.

As a result, it is important to note that getting a “no match” from VirusTotal does not mean that the file is not malicious. It just means your work got harder. You have to work with the mindset that malware authors are very smart, know the tools and techniques used to detect and analyze malware, and will do what they can to evade them!

To understand why that is the case, let’s see how VirusTotal works under the hood.

VirusTotal: Under the Hood

VirusTotal is essentially crowdsourced intelligence. It aggregates data from millions of scanned files, hash values and URLs to help users understand how a file has been flagged over time, and it reports how many antivirus tools flagged it as suspicious or malicious.

The results are based on historical data about the malware and on heuristics (analyzing its behaviour). The heuristics only work, however, if you actually upload the file itself.
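The hash lookup can also be scripted instead of pasted into the website. The sketch below assumes the public VirusTotal API v3, where a file report is fetched from /api/v3/files/<hash> with the API key sent in an x-apikey header, and where engine counts sit under last_analysis_stats in the response. Those details reflect my understanding of the API, so check the official documentation before relying on them. The helper names are my own, and you need your own API key.

```python
import json
import urllib.error
import urllib.request

# Assumed endpoint of the VirusTotal API v3 file report.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{file_hash}"


def build_lookup_request(file_hash, api_key):
    """Build a GET request for the report of a file identified by its hash."""
    return urllib.request.Request(
        VT_FILE_URL.format(file_hash=file_hash),
        headers={"x-apikey": api_key},
    )


def summarize_report(report):
    """Return (engines that flagged the file, total engines listed)."""
    stats = report["data"]["attributes"]["last_analysis_stats"]
    flagged = stats.get("malicious", 0) + stats.get("suspicious", 0)
    return flagged, sum(stats.values())


def lookup_hash(file_hash, api_key):
    """Return a summary tuple, or None when VirusTotal has no record."""
    request = build_lookup_request(file_hash, api_key)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return summarize_report(json.load(response))
    except urllib.error.HTTPError as error:
        if error.code == 404:  # hash not in the database: not proof of safety
            return None
        raise
```

Calling lookup_hash with a SHA256, SHA1 or MD5 value sends only the hash, never the file. A return value of None corresponds to the "no matches found" page on the website.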

Privacy Concerns with VirusTotal

VirusTotal is a great tool for basic analysis. However, submitting your malware sample to VirusTotal means that it may be shared with third parties.

This is a problem especially for bespoke malware that targets a specific organization. This type of malware may contain sensitive or proprietary information that you might not want out in the wild.

In this case, stick to looking up hash values rather than uploading the file itself.

Better Malware Fingerprinting Options

Traditional hashing algorithms like SHA256, SHA1, and MD5 are extremely sensitive to even the slightest change in a file. While they’re great for verifying file integrity, they fall short when it comes to malware fingerprinting. Even a minor modification to a malware sample can produce a completely different hash, making it difficult to track and classify related threats.
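A quick way to see this sensitivity is to hash two inputs that differ in a single byte and count how many bits of the digests differ. The sketch below uses placeholder bytes rather than a real sample, and `bit_difference` is a helper I made up for the demonstration.

```python
import hashlib


def bit_difference(first, second):
    """Count differing bits between the SHA-256 digests of two byte strings."""
    a = int.from_bytes(hashlib.sha256(first).digest(), "big")
    b = int.from_bytes(hashlib.sha256(second).digest(), "big")
    return bin(a ^ b).count("1")


sample = b"MZ" + b"\x90" * 64 + b"stand-in payload"  # placeholder bytes
variant = sample[:-1] + b"!"                          # change only the last byte

print(hashlib.sha256(sample).hexdigest())
print(hashlib.sha256(variant).hexdigest())
print(bit_difference(sample, variant), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the 256 bits flip, so the two digests look completely unrelated even though the inputs are nearly identical.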

So, how can we better fingerprint malware?

The key lies in understanding that most malware samples are not entirely unique. They often share common characteristics and can be grouped into broader categories such as Trojans, Rootkits, Worms, and more.

Instead of relying on traditional hashing, it makes more sense to use robust hashing methods that can identify malware belonging to the same group or originating from the same threat actor.
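To give a feel for what "robust" means here, the sketch below hashes a file in fixed-size pieces and compares the sets of piece hashes. This is a deliberately simplified illustration of the partial-match idea and not how any real tool works: it breaks as soon as bytes are inserted or removed, which is exactly the weakness proper similarity hashes are designed to handle. The function names, chunk size and test data are all my own assumptions.

```python
import hashlib


def chunk_hashes(data, chunk_size=64):
    """Hash each fixed-size chunk of data separately."""
    return [
        hashlib.sha256(data[start:start + chunk_size]).hexdigest()
        for start in range(0, len(data), chunk_size)
    ]


def similarity(first, second, chunk_size=64):
    """Fraction of distinct chunk hashes shared by both inputs (0.0 to 1.0)."""
    a = set(chunk_hashes(first, chunk_size))
    b = set(chunk_hashes(second, chunk_size))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# 1024 placeholder bytes made of 16 distinct 64-byte chunks.
original = b"".join(i.to_bytes(4, "big") for i in range(256))
variant = original[:-1] + b"\x00"  # flip only the final byte

print("whole-file hashes equal:",
      hashlib.sha256(original).digest() == hashlib.sha256(variant).digest())
print("chunk similarity:", round(similarity(original, variant), 3))
```

The whole-file hashes no longer match, but most of the chunk hashes still do, so the two inputs can be recognised as closely related.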

Some of the more advanced and effective options include:

  • Fuzzy Hashing: Helps detect similar files by identifying partial matches between files rather than requiring a perfect match.
  • Import Hashing (ImpHash): Hashes the names of the libraries and functions an executable imports, in the order they appear, which is often consistent across malware variants.
  • Section Hashing: Targets specific sections of a file, such as the code or data sections in PE files, rather than hashing the entire file.
  • Control Flow Graph Hashing: Analyzes the control flow of a program to create a hash that reflects its logical structure rather than its raw content.
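As a small taste of one of these, here is a rough sketch of the idea behind import hashing: normalise the imported library and function names, join them in order, and hash the resulting string with MD5. The import lists below are hypothetical and hard-coded. Real implementations, for example the get_imphash() method in the pefile library, read the import table from the PE file and also handle details such as ordinal imports, which this sketch skips.

```python
import hashlib


def import_hash(imports):
    """MD5 over the ordered, normalised 'library.function' import names."""
    entries = []
    for library, function in imports:
        name = library.lower()
        # Drop common library extensions so 'KERNEL32.dll' becomes 'kernel32'.
        for extension in (".dll", ".ocx", ".sys"):
            if name.endswith(extension):
                name = name[: -len(extension)]
                break
        entries.append(f"{name}.{function.lower()}")
    return hashlib.md5(",".join(entries).encode()).hexdigest()


# Hypothetical import tables for two variants of the same sample.
variant_a = [
    ("KERNEL32.dll", "CreateFileA"),
    ("KERNEL32.dll", "WriteFile"),
    ("WS2_32.dll", "connect"),
]
variant_b = [
    ("kernel32.DLL", "CreateFileA"),
    ("kernel32.DLL", "WriteFile"),
    ("ws2_32.dll", "connect"),
]
reordered = [variant_a[2], variant_a[0], variant_a[1]]

print(import_hash(variant_a) == import_hash(variant_b))  # same imports, same order
print(import_hash(variant_a) == import_hash(reordered))  # same imports, new order
```

Two variants that import the same functions in the same order get the same import hash even if the rest of the file differs, while changing the import order changes the hash.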

In my next blog post, I’ll dive deeper into these advanced techniques:

  • Fuzzy Hashing — How it works and why it’s useful in detecting related malware samples.
  • Import Hashing (ImpHash) — A powerful tool for identifying malware families.
  • Section Hashing — Focusing on the most relevant parts of a file to improve detection accuracy.
  • Control Flow Graph Hashing — A robust method for understanding the structure and logic of malicious code.

Stay tuned for a detailed breakdown of how these techniques can help you strengthen your malware analysis and detection capabilities.

Thanks for reading!! 😃

Written by CyberNotes

Data Science/Cyber - Student at Michigan State University.
