When Malware is Packing Heat
Executable compression, aka “packing,” is a means of compressing an executable file and combining the compressed data with decompression code into a single executable.
Over the years, anti-malware vendors have educated their users about polymorphic malware. This kind of malware has mechanisms to “repackage” itself frequently (ideally every time it is distributed to a victim) so that anti-malware solutions based on static signatures become ineffective.
Fundamentally, a packed program is a program that follows this pseudo-code:
E = [… encoded malware …]
K = extract_key()
M = decode_malware(E, K)
address = load_in_memory(M)
jump_to(address)
As is clear from the pseudo-code above, an encoded version of the malware is stored in a variable, E, possibly encoded with a key, K. At execution time, the program generates the key (if necessary) and then decodes the malware. The malware, M, is then loaded into memory at a specific address. Finally, the unpacker program jumps to that address and executes the malicious payload.
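The pseudo-code can be mirrored in a few lines of Python. This is only a toy sketch with a harmless payload: the XOR-plus-zlib encoding, the hard-coded key, and every function name are assumptions made for illustration, and "loading into memory" is approximated by compiling source code into a namespace.

```python
import zlib


def encode_payload(source: str, key: bytes) -> bytes:
    """Packer side: compress the payload, then XOR it with a repeating key."""
    compressed = zlib.compress(source.encode("utf-8"))
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(compressed))


def extract_key() -> bytes:
    """Stub side: recover the key (hard-coded here; an assumption of this sketch)."""
    return b"\x5a\xa5"


def decode_payload(encoded: bytes, key: bytes) -> str:
    """Stub side: undo the XOR, then decompress to recover the original payload."""
    compressed = bytes(b ^ key[i % len(key)] for i, b in enumerate(encoded))
    return zlib.decompress(compressed).decode("utf-8")


def load_in_memory(source: str) -> dict:
    """Stand-in for mapping code into memory: compile it into a fresh namespace."""
    namespace: dict = {}
    exec(compile(source, "<unpacked>", "exec"), namespace)
    return namespace


# A harmless payload standing in for M.
PAYLOAD = "def main():\n    return 'payload executed'\n"

E = encode_payload(PAYLOAD, extract_key())  # what the packed file would store
K = extract_key()
M = decode_payload(E, K)
entry = load_in_memory(M)["main"]           # plays the role of "address"
result = entry()                            # plays the role of the jump
print(result)
```

The point of the sketch is that the stored bytes E look nothing like the payload, so a static signature written for M does not match the packed file.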
Note that this process can be repeated multiple times, by extracting additional portions of packed code during the lifetime of a process, sometimes with nested packing (i.e., unpacked code that unpacks more code).
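One simplified way to reason about nested packing is to watch for memory that is written and later executed. The sketch below counts such transitions in a recorded trace. The event format and the function name are hypothetical, and real analysis systems track far more detail (memory regions, processes, and which layer wrote what), so treat this as an illustration of the idea rather than a working unpacker.

```python
from typing import Iterable, Tuple

# Hypothetical trace format: ("write" | "exec", address)
Event = Tuple[str, int]


def count_unpacking_layers(trace: Iterable[Event]) -> int:
    """Count how many times execution moves into freshly written memory."""
    written = set()  # addresses written since the current layer started running
    layers = 0
    for kind, address in trace:
        if kind == "write":
            written.add(address)
        elif kind == "exec" and address in written:
            # Code that was just written is now running: a new layer begins.
            layers += 1
            written.clear()
    return layers


nested_trace = [
    ("exec", 0x1000),   # original stub runs
    ("write", 0x2000),  # stub unpacks a first layer
    ("write", 0x2001),
    ("exec", 0x2000),   # first layer runs
    ("exec", 0x2001),
    ("write", 0x3000),  # first layer unpacks a second layer
    ("exec", 0x3000),   # second layer runs
]
print(count_unpacking_layers(nested_trace))
```

A program that never executes memory it has written yields zero layers in this model, which is the sense in which an unpacked program shows no packing behavior.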
This type of behavior has been very common in malware for a number of years. For this reason, anti-virus vendors introduced unpacking emulators. These emulators perform the initial operations required to unpack the actual program code and then apply static analysis to the unpacked code.
Cybercriminals soon took notice of unpacking emulators and started introducing anti-emulator mechanisms, such as the use of multiple processes during unpacking. These approaches made full-blown sandboxes necessary for the analysis: only by running the actual program in a realistic environment was it possible to observe the actual behavior of the code.
In another step in the never-ending war between good and evil, cybercriminals have started introducing anti-sandbox mechanisms into their packers. For example, there are packers that look at the values returned by the process affinity API (which indicates the processor cores available to a process) in order to determine whether they are running in a sandbox or on a real, bare-metal system; a sandbox virtual machine is often configured with fewer cores than a typical user machine.
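To make the core-count check concrete from an analyst's point of view, here is a minimal sketch of the kind of test a packer might apply. The Python calls stand in for whatever affinity API a given packer actually uses, and the threshold of two cores is a hypothetical value chosen for illustration.

```python
import os


def available_cores() -> int:
    """Number of cores the current process may run on."""
    if hasattr(os, "sched_getaffinity"):      # available on Linux
        return len(os.sched_getaffinity(0))   # 0 means the calling process
    return os.cpu_count() or 1                # fallback: cores in the system


def looks_like_sandbox(cores: int, min_expected: int = 2) -> bool:
    """Hypothetical heuristic: very few cores suggests an analysis environment."""
    return cores < min_expected


cores = available_cores()
print(f"cores available: {cores}, sandbox suspected: {looks_like_sandbox(cores)}")
```

The practical consequence for defenders is that an analysis environment needs to present realistic hardware characteristics, otherwise a check this simple is enough to keep the payload from ever being unpacked.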
The increasing use of sophisticated anti-analysis techniques in packers suggests a natural question: why not detect malware by detecting packers?
One could decide to simply block executables that appear to be packed, forcing the malware writers to resort to more subtle (and expensive) mechanisms to avoid detection.
Well, the problem is that a substantial portion of benign software is packed as well.
We ran an experiment over a dataset of recently observed binaries, and we found that ~37% (742/1990) of malware had some form of packing (that is, it had a mechanism similar to the one shown in the pseudo-code above), and ~6% (373/6122) of benign software had packing behavior.
Note that the packing behavior was observed during execution, and, therefore, is independent of specific packers or other techniques.
This shows that rejecting a program just because it’s packed is not an effective malware defense strategy.
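The arithmetic behind this conclusion can be worked through directly from the counts reported above. The sketch below evaluates a hypothetical "block everything that is packed" policy on that dataset; note that the mix of malicious and benign files in the dataset is not necessarily the mix seen in real traffic, so the numbers describe this experiment only.

```python
# Counts reported in the experiment above.
packed_malware, total_malware = 742, 1990
packed_benign, total_benign = 373, 6122

malware_packed_rate = packed_malware / total_malware
benign_packed_rate = packed_benign / total_benign

# Hypothetical policy: block every file that shows packing behavior.
missed_malware = total_malware - packed_malware   # unpacked malware gets through
blocked_benign = packed_benign                    # packed benign files are blocked
detection_rate = packed_malware / total_malware
precision = packed_malware / (packed_malware + packed_benign)

print(f"packed malware:  {malware_packed_rate:.1%}")
print(f"packed benign:   {benign_packed_rate:.1%}")
print(f"malware missed:  {missed_malware} of {total_malware}")
print(f"benign blocked:  {blocked_benign} of {total_benign}")
print(f"blocked files that are actually malware: {precision:.1%}")
```

On these counts, the policy would miss well over half of the malware while still blocking several hundred benign programs.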
So what next?
Digital Signatures – An invalid or missing signature combined with unpacking behavior seems promising, given that 97 percent of our malicious samples shared this characteristic. However, many benign samples (40 percent) have it as well. Therefore, using this as the only signal would result in a large number of false positives.
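A short calculation shows why a signal present in 97 percent of malicious samples can still be weak. The sketch assumes that the 97 and 40 percent figures are the rates at which the signal appears among malicious and benign samples respectively; the prior probabilities passed to the function are hypothetical values, not measurements.

```python
P_SIGNAL_GIVEN_MALWARE = 0.97  # from the text above
P_SIGNAL_GIVEN_BENIGN = 0.40   # from the text above

# How much more likely the signal is for malware than for benign software.
likelihood_ratio = P_SIGNAL_GIVEN_MALWARE / P_SIGNAL_GIVEN_BENIGN


def posterior_malicious(prior: float) -> float:
    """Bayes' rule: probability a file is malicious given that the signal fired."""
    hit = prior * P_SIGNAL_GIVEN_MALWARE
    false_alarm = (1.0 - prior) * P_SIGNAL_GIVEN_BENIGN
    return hit / (hit + false_alarm)


print(f"likelihood ratio: {likelihood_ratio:.3f}")
for prior in (0.01, 0.10, 0.50):  # hypothetical shares of malware in the input
    print(f"prior {prior:.2f} -> posterior {posterior_malicious(prior):.3f}")
```

Under these assumptions the signal only makes a file about 2.4 times more likely to be malicious, so when most files are benign, most alerts raised by this signal alone would be false positives.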
How Executables are Packed – The other route to discriminating between malicious and benign samples is to look at how the executables are packed. In fact, not all packers are created equal. Some packers, such as UPX, are commonly used by both benign and malicious binaries; these packers cannot be used to determine if an executable is malicious or not. However, there are other packers (usually ad hoc programs) that use a number of techniques to prevent reverse engineering. For example, they use multiple levels of packing – that is, the unpacked executable is actually another packed program, a sort of “turducken of malware” – or they employ sophisticated anti-debugging techniques. An interesting and very complete analysis of packing behavior was published in 2015 in the IEEE Symposium on Security and Privacy: SoK: Deep packer inspection: A longitudinal study of the complexity of run-time packers.
However, complex packing cannot be used as the sole signal for malware detection, because there are many situations in which developers of benign software want to prevent reverse engineering of their programs and therefore resort to several levels of packing.
Compressing Packers vs Encrypting Packers
Another interesting differentiation is between compressing packers and encrypting packers.
Compressing packers try to reduce the size of the original program using compression techniques. As a result, the compressed data can still retain some of the statistical properties of the original program. Encrypting packers, instead, perform full encryption of the program, and consequently, the encrypted data tends to be more “random” (more formally, it has a higher entropy). Unfortunately, once again, one cannot use this information to detect if a packed executable is malicious or not, as encrypting packers are used by developers of benign applications on a regular basis.
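The notion of entropy used here can be measured directly. The sketch below computes the Shannon entropy of a byte string in bits per byte; the sample inputs are made up for illustration, and random bytes from the operating system are used as a stand-in for well-encrypted data.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up sample data: repetitive text standing in for an unpacked program.
plain = b"push ebp; mov ebp, esp; sub esp, 0x40; call helper; ret; " * 200
compressed = zlib.compress(plain)
random_like = os.urandom(len(plain))  # stand-in for encrypted content

for label, blob in (("plain", plain), ("compressed", compressed), ("random", random_like)):
    print(f"{label:>10}: {len(blob):6d} bytes, {shannon_entropy(blob):.2f} bits/byte")
```

Keep in mind that entropy estimates on short inputs are noisy, and that high entropy by itself says nothing about intent, which is the limitation the paragraph above describes.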
Even though we have claimed so far that using information about packing is not a suitable approach for effective detection, a question remains: Is the industry using packing as a signal?
A study conducted in 2013 by researchers at the University of California, Santa Barbara took almost 8,000 system files from various versions of the Windows operating system and uploaded them to VirusTotal, obtaining an unsurprising “all OK” from the anti-malware tools included on the web site.
Then, the same files were packed using four packers (UPX, Upack, NsPack, and BEP), resulting in 16K verified samples (some of the packed files did not appear to be functional and had to be eliminated from the data set). These samples were then submitted to VirusTotal again, and the results, this time, were surprising: while the samples packed with UPX were not flagged as malicious, 96.7% of the samples packed with the remaining three packers were labeled as malicious by more than 10 anti-virus products.
The results clearly show that many antivirus tools use the identification of packing behavior as a signal for classification as malware, but this was four years ago.
To verify the state of the art today, we reproduced the 2013 experiment on a smaller scale. We took 10 benign samples, packed them with Obsidium, a commercial packer, and then submitted the packed samples to VirusTotal.
First of all, an important disclaimer: the engines on VirusTotal are not configured in the most effective way, and, therefore, the results must be taken with a grain of salt. For this reason, we do not single out any specific vendor, and instead, we show only the aggregate results.
The results clearly show that packing is still used as a signal, as many vendors (including top players in the AV industry) have identified benign programs as malicious, only because they were packed.
In addition, the results show that packing can be an effective way to avoid detection, as some of the vendors are unable to identify the malware samples as such.
The lesson learned is that the presence of a packer is not a reliable way to determine the nature of an executable. Instead, it is necessary to run the sample, trigger the unpacking, observe how the unpacking is performed, and combine this information with the actual behavior of the program.
Of course, this requires more resources than simple static analysis, but nowadays it’s either that or risk missing malware or inundating security teams with false positives.