Reverse Engineering Malware — A Look at How the Process Has Evolved

In 1971, the first known instance of a self-replicating “computer worm” was recorded. Creeper didn’t harm computers, but it did propagate from machine to machine over ARPANET, the predecessor of the internet. By investigating how Creeper worked, engineers were able to create a contrasting program, Reaper, to stop its spread. Reverse engineering has long been the leading method for understanding how malicious programs operate and what they’re engineered to do. As a process, it has evolved as malware has become more sophisticated and detection tools have improved, but it remains critical.

Overview of the Reverse Engineering Process

Reverse engineering malware involves disassembling (and sometimes decompiling) a software program. Through this process, binary instructions are converted to assembly mnemonics (or higher-level constructs) so that engineers can see what the program does and which systems it affects. Only once they know these details can engineers create solutions that mitigate the program’s intended malicious effects. A reverse engineer (aka “reverser”) uses a range of tools to find out how a program propagates through a system and what it is engineered to do. In doing so, the reverser also learns which vulnerabilities the program was intended to exploit.
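To make the step from binary instructions to mnemonics concrete, here is a minimal sketch of a toy disassembler. It assumes 32-bit x86 code and recognizes only a handful of opcodes; the function name disassemble and the "db" fallback for unknown bytes are choices made for this illustration. A real disassembler decodes the full instruction set and is far more involved.

```python
import struct

# Register order used by the x86 "opcode + register" encodings.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# A few single-byte opcodes that take no operands.
SINGLE_BYTE = {0x90: "nop", 0xC3: "ret", 0xC9: "leave", 0xCC: "int3"}


def disassemble(code: bytes) -> list[str]:
    """Linear sweep over `code`, turning known opcodes into mnemonics."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op in SINGLE_BYTE:
            out.append(SINGLE_BYTE[op])
            i += 1
        elif 0x50 <= op <= 0x57:  # push r32
            out.append(f"push {REGS[op - 0x50]}")
            i += 1
        elif 0x58 <= op <= 0x5F:  # pop r32
            out.append(f"pop {REGS[op - 0x58]}")
            i += 1
        elif 0xB8 <= op <= 0xBF and i + 5 <= len(code):  # mov r32, imm32
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov {REGS[op - 0xB8]}, {imm:#x}")
            i += 5
        else:
            # Unknown or truncated instruction: emit the raw byte as data.
            out.append(f"db {op:#04x}")
            i += 1
    return out


if __name__ == "__main__":
    for line in disassemble(bytes.fromhex("55b82a0000005dc3")):
        print(line)
```

Run on the eight bytes 55 b8 2a 00 00 00 5d c3, the sketch prints push ebp, mov eax, 0x2a, pop ebp, and ret. Bytes it does not recognize are kept as data rather than guessed at, which hints at why obfuscated or unusual code can derail automated disassembly.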

Reverse engineers are able to extract hints revealing when a program was created (although malware authors are known to leave behind fake trails), what embedded resources it may be using, encryption keys, and other file, header, and metadata details. When WannaCry was reverse engineered, attempts to find a way to track its spread led to the discovery of what is known today as its “kill switch”, a finding that proved incredibly important in stopping its spread.
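One such hint is the build timestamp stored in the header of a Windows executable. The sketch below reads the TimeDateStamp field from the COFF header of a PE file using only the Python standard library. The function name pe_compile_time is chosen for this illustration, and the value it returns should be treated as a hint only, since the field is trivial for an author to forge.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # After the 4-byte signature: Machine (2 bytes), NumberOfSections (2 bytes),
    # then the 4-byte TimeDateStamp (seconds since the Unix epoch).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

Calling pe_compile_time(open(path, "rb").read()) on a sample, where path is whatever file is being examined, gives the claimed build time, which can then be compared against other evidence before it is trusted.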

In order to reverse malware code, engineers often use many tools. Below is a small selection of the most important ones:

  • Disassemblers (e.g. IDA Pro). A disassembler takes apart an application to produce assembly code. Decompilers are also available for converting binary code into higher-level source code, although they’re not available for all architectures.
  • Debuggers (e.g. x64dbg, WinDbg, GDB). Reversers use debuggers to manipulate the execution of a program in order to gain insights into what it is doing while it is running. They also let the engineer control certain aspects of the running program, such as areas of its memory. This allows for more insight into what the program is doing and how it is impacting a system or network.
  • PE Viewers (e.g. CFF Explorer, PE Explorer). PE viewers (PE stands for Portable Executable, the Windows executable file format) extract important information from executables, such as the libraries a program depends on.
  • Network Analyzers (e.g. Wireshark). Network analyzers tell an engineer how a program is interacting with other machines, including what connections the program is making and what data it is attempting to send.
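As a rough illustration of the kind of information a PE viewer surfaces, the following sketch lists the section table of a PE file. It is not how CFF Explorer or PE Explorer are implemented. The function name pe_sections and the choice of returned fields are assumptions made for this example; the offsets follow the published PE/COFF layout.

```python
import struct


def pe_sections(data: bytes) -> list[tuple[str, int, int]]:
    """Return (name, virtual_size, raw_size) for each section of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4
    # 20-byte COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    _, count, _, _, _, opt_size, _ = struct.unpack_from("<HHIIIHH", data, coff)
    table = coff + 20 + opt_size  # section headers follow the optional header
    sections = []
    for i in range(count):
        # Each section header is 40 bytes; only the first 24 are read here:
        # Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData.
        name, vsize, _, raw_size, _ = struct.unpack_from(
            "<8sIIII", data, table + 40 * i
        )
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vsize, raw_size))
    return sections
```

Unusual section names, or a section whose virtual size is much larger than its size on disk, are commonly treated as signs that a sample may be packed, though neither is conclusive on its own.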

The Challenge with Reverse Engineering Malware Today

As malicious programs become more complex, it becomes increasingly likely that the disassembler will fail in some way or that the decompiler will produce obfuscated code. Reversers therefore need more time to understand the disassembled or decompiled code, and this is time during which the malware may be wreaking havoc on a network. Because of this, there has been an increasing focus on dynamic malware analysis, which relies on a closed system (known as a sandbox) to launch the malicious program in a secure environment and simply watch what it does.

There are a lot of benefits to using a sandbox for dynamic analysis, but some downsides as well. For example, many of the more sophisticated malicious programs use evasion techniques to detect that they are in a sandbox.

When a sandbox is detected, the malware will refrain from demonstrating its true malicious nature. Advanced malware programs have a suite of tools they use to outsmart sandboxes and evade detection: they can delay their malicious activities, act only when a user is active, hide malicious code in areas where it will not be detected, or apply a variety of other evasion techniques.
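From the defender’s side, some of these behaviors leave traces that an analysis system can look for. The sketch below scans a log of API calls for two of the patterns mentioned above: long delays and checks for user activity. The ApiCall record, the one-minute threshold, and the function name evasion_hints are all assumptions made for this illustration; only the Windows API names (Sleep, GetCursorPos, GetLastInputInfo, GetForegroundWindow) are real.

```python
from dataclasses import dataclass


@dataclass
class ApiCall:
    """Hypothetical record for one API call logged during a sandbox run."""
    name: str
    arg: int = 0  # for Sleep: the requested delay in milliseconds


STALL_MS = 60_000  # assumed threshold: a minute or more of sleeping looks like stalling
USER_ACTIVITY_APIS = {"GetCursorPos", "GetLastInputInfo", "GetForegroundWindow"}


def evasion_hints(trace: list[ApiCall]) -> set[str]:
    """Return coarse hints that a sample may be trying to evade analysis."""
    hints = set()
    total_sleep = sum(call.arg for call in trace if call.name == "Sleep")
    if total_sleep >= STALL_MS:
        hints.add("delayed execution")
    if any(call.name in USER_ACTIVITY_APIS for call in trace):
        hints.add("user-activity check")
    return hints
```

Benign software calls these APIs too, so the result is a set of hints that something deserves a closer look, not a verdict.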

This means that reverse engineers cannot rely solely on dynamic techniques. At the same time, reverse engineering every new malware threat is unrealistic.

The Changing Role of Reverse Engineers

By using dynamic analysis to automate as much of the malware analysis as possible, cybersecurity experts can mitigate advanced malware faster and more effectively, freeing their time for the really difficult work, such as understanding new encryption schemes, reversing communication protocols, or working on attribution. The more advanced the automated solution is, the less likely it is that a reverser will have to go back to the initial (and time-consuming) phase of the process: unpacking, deobfuscating, and understanding the malware’s main behaviors.

Cybersecurity teams in fact need to implement a two-pronged approach in which sandbox technologies automatically analyze the vast majority of threats, while reversers dedicate their time to surgically analyzing the internals of the most sophisticated ones when further threat intelligence is sought.
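The two-pronged approach can be pictured as a simple routing rule. The sketch below is only an illustration under stated assumptions: the SandboxReport fields, the 0 to 100 score, and both thresholds are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class SandboxReport:
    """Hypothetical summary of one automated sandbox run."""
    sample_id: str
    score: int  # assumed maliciousness score from 0 (benign) to 100
    evasion_hints: set = field(default_factory=set)


def triage(report: SandboxReport, benign_below: int = 30, malicious_at: int = 70) -> str:
    """Decide whether a sample is handled automatically or sent to a reverser."""
    if report.evasion_hints:
        # The sandbox may not have seen the real behavior, so a human should look.
        return "manual"
    if report.score >= malicious_at:
        return "auto-block"
    if report.score < benign_below:
        return "auto-allow"
    return "manual"  # ambiguous score: worth a reverser's time
```

With these assumed thresholds, a sample scoring 85 with no evasion hints is handled automatically, while the same score with a “delayed execution” hint is routed to a reverser.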

Stefano Ortolani

Stefano Ortolani is Head of Threat Intelligence at Lastline. Prior to that, he was part of the research team at Kaspersky Lab in charge of fostering operations with CERTs, governments, universities, and law enforcement agencies. Before that, he earned his Ph.D. in Computer Science from the VU University Amsterdam.