I Hash You: A Simple But Effective Trick to Evade Dynamic Analysis

Authored by: Alexander Sevtsov
Edited by: Stefano Ortolani

Apparently, “every new day is a new evasion trick day” is a valid motto for many malware authors nowadays. The latest sample we are adding to our collection is a piece of banking malware that tries to evade analysis by carefully checking its own filename. While our backend strictly preserves the original name, knowing the tricks employed by this malware can be essential during threat hunting or an IR investigation.

The sample in question (sha1: 4793f245ee6f04f836071528f9f66d3a9a678341) is a variant of the highly evasive banker called Gootkit, and it is the subject of this blog post.

Delivery Vector – Social Engineering

Based on our internal telemetry data, the main delivery vectors for this malware are document files, mainly macro-based downloaders and documents with embedded PE files. No known exploits are involved in this attack. The screenshots below (Figures 1 and 2) show two examples of malicious documents, along with the usual lure encouraging victims to either enable macro code execution or double-click the embedded executable file.

Figure 1. An example of malicious macro-based document downloader

Figure 2. An example of a malicious document with an embedded executable file

Evasion Tricks – Checking Own Filename

Once the user falls for either suggestion, the malware starts executing. The first evasion trick is done by checking its own filename: to retrieve it, the malware calls the PathFindFileName function. It then computes the hash of the filename by using the following algorithm:

Figure 3. Decompiled hash algorithm

The constant highlighted in Figure 3 quickly gives away that we are dealing with a standard CRC32 hashing algorithm with no table lookup (the variant commonly known as crc32b). The resulting hash is then matched against the entries of a blacklist (see Table 1). This is clearly meant to frustrate analysts: since there is no actual string comparison, the blacklisted names cannot simply be extracted from the binary. The only option is to re-implement the hash algorithm and either brute-force it or rely on a dictionary.

Blacklisted Hash    Filename
0xBC136B46          SAMPLE.EXE
0xEED889C4          MALWARE.EXE
0x58636143          TEST.EXE
0xD84A20AC          SANDBOX.EXE
0xC0F26006          BOT.EXE
0xE8CBAB78          MYAPP.EXE
0x8606BEDD          KLAVME.EXE
0x2AB6E04A          TESTAPP.EXE
0x31E6D1EA          ?

Table 1. Blacklisted CRC32 hashes and their input strings (the inputs are not unique, since different strings can collide on the same hash)
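The table-less CRC32 described above can be sketched in a few lines of Python. This is only an illustration, not the decompiled routine from Figure 3: it assumes the highlighted constant is 0xEDB88320 (the reflected CRC-32 polynomial), and that the filename is upper-cased before hashing, as the entries in Table 1 suggest. The helper names `crc32_bitwise` and `filename_hash` are ours.

```python
import zlib

# Assumption: this reflected CRC-32 polynomial is the constant seen in the sample.
CRC32_POLY = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """CRC-32 without a lookup table: each byte is processed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right by one bit; fold in the polynomial if the low bit was set.
            if crc & 1:
                crc = (crc >> 1) ^ CRC32_POLY
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def filename_hash(name: str) -> int:
    # Assumption: the name is upper-cased first, matching the casing in Table 1.
    return crc32_bitwise(name.upper().encode("ascii"))


if __name__ == "__main__":
    # The bitwise version should agree with the table-driven one in zlib.
    print(hex(filename_hash("sample.exe")), hex(zlib.crc32(b"SAMPLE.EXE")))
```

Because the result is identical to the standard CRC32, an analyst can use any stock implementation (such as `zlib.crc32`) when testing candidate names.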

We tried both approaches and managed to recover the original filenames. This helped us get an idea of the analysis systems targeted by this evasion trick. As it turns out, not only sandboxes are in the crosshairs of this sample, but also manual analysis sessions: renaming the analyzed file is quite a common practice among researchers in training who have just started working with instrumented environments. To them we say: filenames matter, and should at least be randomized.
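Both recovery approaches can be sketched as follows. This is a rough illustration under a few assumptions: candidate names are upper-cased and carry an .EXE extension (as in Table 1), the word list is supplied by the analyst, and the function names `dictionary_attack` and `brute_force` are hypothetical. It is not the tooling we actually used.

```python
import itertools
import string
import zlib

# Hashes taken from Table 1.
BLACKLIST = {
    0xBC136B46, 0xEED889C4, 0x58636143, 0xD84A20AC, 0xC0F26006,
    0xE8CBAB78, 0x8606BEDD, 0x2AB6E04A, 0x31E6D1EA,
}


def crc32_name(name: str) -> int:
    # Assumption: names are upper-cased before hashing.
    return zlib.crc32(name.upper().encode("ascii")) & 0xFFFFFFFF


def dictionary_attack(words, blacklist=BLACKLIST, ext=".EXE"):
    """Map each blacklisted hash to the dictionary names that produce it."""
    hits = {}
    for word in words:
        name = word.upper() + ext
        h = crc32_name(name)
        if h in blacklist:
            hits.setdefault(h, []).append(name)
    return hits


def brute_force(blacklist, max_len, alphabet=string.ascii_uppercase, ext=".EXE"):
    """Yield (hash, name) for every stem of up to max_len letters that matches."""
    for length in range(1, max_len + 1):
        for stem in itertools.product(alphabet, repeat=length):
            name = "".join(stem) + ext
            h = crc32_name(name)
            if h in blacklist:
                yield h, name


if __name__ == "__main__":
    words = ["sample", "malware", "test", "sandbox", "virus", "myapp"]
    for h, names in dictionary_attack(words).items():
        print(f"0x{h:08X} -> {names}")
```

A dictionary is fast but only finds names you thought of. Brute force finds every match, including meaningless collisions, and in pure Python it becomes impractically slow beyond stems of five or six letters.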

Note that this behavior is not unique. In fact, we found several samples in our intelligence backend that make use of similar names as part of their anti-analysis tricks (see Figure 4).

Figure 4. Another sample (sha1: b8ced67968a10c931aa7da5630baaf1497a7ceb6) that checks weak filenames by calling the GetModuleFileName function

Unfortunately, none of them helped us figure out the last blacklisted entry. Brute forcing only produced collisions, namely DNYYQXU.EXE and BLRARSHA.EXE, and both look too random to be the actual blacklisted file name.

Sub-optimal Approaches

Analysts might be tempted to think that randomizing file names would do the trick. That would normally work, but remember that you would still be exposed to the tiny chance of guessing a collision (for example, MYAPP.EXE and BBNURKU.EXE have the same hash, 0xE8CBAB78). This is hardly common for hash algorithms, but in the case of error-detecting codes it happens a tiny bit more often.
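If you do randomize, the residual collision risk against a known blacklist can be removed by rejecting any candidate whose hash matches. The sketch below assumes the nine hashes from Table 1 and the upper-casing behavior discussed earlier; `random_sample_name` is a hypothetical helper. With nine 32-bit values, a uniformly distributed hash would hit the list with probability 9/2^32, roughly two in a billion per name, so the loop almost never iterates twice.

```python
import secrets
import string
import zlib

# Hashes taken from Table 1.
KNOWN_BLACKLIST = {
    0xBC136B46, 0xEED889C4, 0x58636143, 0xD84A20AC, 0xC0F26006,
    0xE8CBAB78, 0x8606BEDD, 0x2AB6E04A, 0x31E6D1EA,
}


def random_sample_name(length=8, blacklist=KNOWN_BLACKLIST, ext=".exe"):
    """Return a random lowercase file name whose CRC32 is not blacklisted."""
    alphabet = string.ascii_lowercase
    while True:
        stem = "".join(secrets.choice(alphabet) for _ in range(length))
        name = stem + ext
        # Assumption: the malware upper-cases the name before hashing it.
        if zlib.crc32(name.upper().encode("ascii")) not in blacklist:
            return name


if __name__ == "__main__":
    print(random_sample_name())
```

This only protects against blacklists you already know about, which is one more reason randomization remains a sub-optimal approach.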

Another common approach is to rename a file to its md5 or sha1 value (such as 02d41d2a7b50b7ee561eef220a7b57df.exe). This is definitely a bad practice, and much worse than a fully randomized name, since there is nothing stopping the malware from doing the very same computation and simply evading analysis in case of a match. Interestingly, this is also the default filename when downloading artifacts from platforms like VirusTotal, isn’t it?

Going back to the Gootkit sample, in our case the malware is a bit more aggressive and just terminates execution if the file name is at least 32 digits long (which is incidentally the length of an md5 digest in hexadecimal). For this reason, our recommendation to customers manually submitting artifacts is to always avoid long (possibly hash-based) file names, as these have the potential to hinder a correct dynamic analysis.
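A simple pre-submission check along these lines can flag risky names before a file is handed to a sandbox. This is a sketch under stated assumptions: we treat a stem of 32 or more characters as "long", we consider md5, sha1, and sha256 as the likely naming schemes, and `risky_name_reasons` is a hypothetical helper rather than part of any product.

```python
import hashlib
import re
from pathlib import Path

# Stems that look like an md5 (32), sha1 (40), or sha256 (64) hex digest.
HEX_DIGEST = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def risky_name_reasons(path, max_stem_len=31):
    """Return a list of reasons why this file name may trigger evasion checks."""
    p = Path(path)
    stem = p.stem
    reasons = []
    # Assumption: the length check applies to the name without its extension.
    if len(stem) > max_stem_len:
        reasons.append(f"stem is {len(stem)} characters long")
    if HEX_DIGEST.match(stem):
        reasons.append("stem looks like a hex digest")
    if p.is_file():
        data = p.read_bytes()
        for algo in ("md5", "sha1", "sha256"):
            # The malware could run the same computation on itself.
            if hashlib.new(algo, data).hexdigest() == stem.lower():
                reasons.append(f"stem equals the file's own {algo}")
    return reasons


if __name__ == "__main__":
    print(risky_name_reasons("02d41d2a7b50b7ee561eef220a7b57df.exe"))
```

An empty list does not mean the name is safe; it only means none of these specific patterns matched.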

Lastline Approach

The best way to mitigate these evasion attempts is to rely on the origin of the sample and, if possible, retrieve the original filename. We understand this might not always be easy when done manually. For example, the file name might originate from a malicious downloader or from an embedded resource.

In our sandbox, we automate this step: we monitor the real network traffic and execute the artifacts as they are downloaded, which means that we always preserve the original name (see Figure 5).

Figure 5. The Lastline analysis overview of the Gootkit malware

Conclusion

Sometimes the simplest tricks are the best, and can even frustrate analysts (imagine if the sample executed only when the original name was preserved, for example). In this article, we went through some simple examples and explained how to avoid the most common pitfalls. In conclusion, just be careful when submitting a file manually to a dynamic analysis system and, if in doubt, let the system unpack or download the artifact you want to analyze.

Alexander Sevtsov

Alexander Sevtsov is a Malware Reverse Engineer at Lastline. Prior to joining Lastline, he worked for Kaspersky Lab, Avira and Huawei, focusing on different methods of automatic malware detection. His research interests are modern evasion techniques and deep document analysis.