Detecting Keyloggers on Dynamic Analysis Systems
One notorious capability of many of today’s advanced malware variants is stealing sensitive user information. Once in control of a targeted machine, an adversary can secretly monitor nearly everything an unsuspecting victim does on it. The data stored on a typical machine, and to which the attacker has access, ranges from user account credentials (such as usernames and passwords), to financial data (such as credit card numbers or transaction secrets), and even personal data (such as social security numbers).
Very often, malware specializes in capturing and identifying particular types of information typed by the victim, so that the attacker collects only the data of interest. Typical examples are information entered in a login form for a specific URL, or values that resemble social security numbers (SSNs) or credit card numbers.
Automatically identifying this type of attack with traditional dynamic analysis (inside a sandbox) is tricky, because the malicious behavior usually shows only after a user triggers it. Such a trigger could be authenticating to a service with a username and password, or entering sensitive information into an application or website.
In this post, we describe some of the ways attackers manage to collect sensitive information on an infected machine and how Lastline’s high-resolution dynamic analysis engine is able to trigger and detect this type of malware.
How Keyloggers Capture Data
Keyloggers typically fall into three basic types: user-mode keyloggers, kernel-mode keyloggers, and hardware-based keyloggers. In this post, we focus on the first two types, which are software-based. Hardware-based keyloggers intercept data sent from external devices (such as a keyboard or mouse) to the computer hardware. They require physical access to the machine and are thus outside the reach of most remote attackers.
Overview of keylogger methods
In user-mode keyloggers, a very common way to steal information typed by the user is the Windows API SetWindowsHookEx. This API can be used to intercept system events such as keyboard and mouse activity. When the intercepted event occurs, a function of the attacker’s choosing is executed. Another user-mode method for capturing keystrokes, which we found in many malware variants, consists of continuously checking the system’s keyboard state using the GetAsyncKeyState or GetKeyState API functions. Unlike the first method, in which the system notifies the attacker’s code of every keyboard event, this method requires the attacker to actively poll for the keys that are pressed.
Kernel-mode keyloggers are more powerful than their user-mode counterparts, as they run with higher privileges, but they are inherently more complex to implement. This type of keylogger uses filter drivers to intercept keystrokes received from the keyboard, or modifies internal Windows kernel structures in order to capture input data. Because kernel code is complex and mostly undocumented, a sample executed on an unsupported system can cause that system to malfunction, which makes user-mode keyloggers the more common approach.
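To make the difference between the two user-mode methods concrete, here is a minimal analysis-side sketch in Python. It assumes a hypothetical API trace format of (timestamp in seconds, API name, argument dictionary) tuples, as a sandbox might record them. The thresholds and the function name are illustrative assumptions, not part of any particular product.

```python
# Hypothetical trace record: (timestamp_seconds, api_name, args_dict).
HOOK_APIS = {"SetWindowsHookExA", "SetWindowsHookExW"}
POLL_APIS = {"GetAsyncKeyState", "GetKeyState"}
KEYBOARD_HOOK_IDS = {2, 13}  # WH_KEYBOARD, WH_KEYBOARD_LL


def classify_keyboard_capture(trace, min_calls=20, min_rate=50.0):
    """Return the capture techniques ("hook", "polling") a trace suggests."""
    findings = set()
    poll_times = []
    for timestamp, api, args in trace:
        if api in HOOK_APIS and args.get("idHook") in KEYBOARD_HOOK_IDS:
            # Event-driven capture: the system calls back on each key event.
            findings.add("hook")
        elif api in POLL_APIS:
            poll_times.append(timestamp)

    # Polling capture shows up as many key-state queries in a short time.
    if len(poll_times) >= min_calls:
        duration = max(poll_times) - min(poll_times)
        rate = len(poll_times) / duration if duration > 0 else float("inf")
        if rate >= min_rate:
            findings.add("polling")
    return findings
```

A single hook installation is enough to flag the first method, while the second only becomes visible as a sustained call rate. An occasional GetKeyState call, which many benign programs make, stays below the assumed thresholds.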
Analyzing Keylogger Software
As mentioned earlier, when executing malware with keylogging abilities on a traditional analysis system, this functionality typically remains unnoticed. This is because, unless a user triggers specific actions, the malware will not be able to capture anything during the analysis.
To reveal this type of behavior during the automated analysis of a sample, the system needs to mimic user interactions such as keyboard typing and mouse movement. This is far from simple, since some malware families only capture data when a key is pressed within a specific application or window, such as Internet Explorer or Firefox.
Further, the analysis framework must be able to identify the type of application and/or information that is relevant for the malware under analysis. For instance, if a keylogger only targets information on financial transactions (such as in a breach of Point-of-Sale terminals late last year), injecting other user information would not reveal any interesting behavior. If the sandbox fails to identify the data in which the attacker is interested, the keylogger remains inactive and its behavior stays hidden from the analysis system.
High-Resolution Analysis of Keyloggers
The Lastline high-resolution dynamic analysis engine can simulate specific user interactions during the analysis of a sample. Additionally, the system identifies the type of data that the malware sample is searching for and simulates user behavior accordingly. For example, the system instruments a virtual user to perform financial transactions with fake credit card information, post user credentials to various services, or use email addresses in a way that might attract the attacker’s attention.
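As an illustration of what such injected data could look like, the following Python sketch generates decoy values. It is not Lastline’s implementation: the function names, the "4" prefix, and the profile fields are assumptions made for this example. The card number is made to pass the Luhn checksum, so that a sample validating its captures would still accept it, and each value carries a random token so it can be recognized later.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit to append to a string of digits."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` is doubled,
    # because the check digit will take the final (undoubled) position.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def make_decoy_card(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Build a random, Luhn-valid number with the given (assumed) prefix."""
    body_len = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randrange(10)) for _ in range(body_len))
    return body + luhn_check_digit(body)


def make_decoy_profile(rng: random.Random) -> dict:
    """Bundle decoy values a simulated user could type during analysis."""
    token = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
    return {
        "card": make_decoy_card(rng),
        "username": f"user_{token}",
        "password": f"Pw!{token}",
        "email": f"{token}@example.com",
    }
```

Seeding the generator, for example with `random.Random(1)`, keeps the decoys reproducible across runs, which helps when comparing the reports of two analyses.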
Once the keylogger starts collecting the injected data, the system tracks how the malware under analysis uses the stolen information, which gives vital information on the attacker’s goals. For example, the system tracks if the stolen data is written to the file-system or is sent to the attacker using a command-and-control channel.
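A very simplified way to picture this tracking step is a search for the injected values in the artifacts an analysis produces. The Python sketch below assumes the decoys and artifacts are available as dictionaries. The artifact names and the choice of encodings are hypothetical, and plain substring matching is a stand-in for the data tracking a real engine performs.

```python
def encodings_of(value: str) -> dict:
    """Return a few byte representations a stolen string might take."""
    raw = value.encode("utf-8")
    return {
        "utf-8": raw,
        "utf-16le": value.encode("utf-16-le"),
        "hex": raw.hex().encode("ascii"),
    }


def find_leaks(decoys: dict, artifacts: dict) -> list:
    """List (artifact, decoy label, encoding) for every decoy found.

    `decoys` maps a label to the injected string. `artifacts` maps a
    hypothetical artifact name (a written file, a network payload) to
    the raw bytes observed during the analysis.
    """
    hits = []
    for artifact_name, data in artifacts.items():
        for label, value in decoys.items():
            for enc_name, needle in encodings_of(value).items():
                if needle in data:
                    hits.append((artifact_name, label, enc_name))
    return hits
```

This approach misses data that the malware encrypts, compresses, or interleaves with other records before writing or sending it, which is one reason to follow the data through the system rather than rely on matching output alone.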
Chewbacca analysis report
Chasing Chewbacca with Lastline
Chewbacca is a trojan used in a Point-of-Sale malware operation to log keyboard data on certain systems. Besides capturing keyboard activity, the malware can also scan the memory of other processes for credit card numbers using regular expressions. All of this happens without any sign of infection visible to the user.
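From the analyst’s side, the same idea can be used to check whether a captured log or memory dump contains card-like data. The Python sketch below pairs a regular expression with a Luhn checksum to filter out random digit runs. The pattern is an illustrative assumption and is not the expression Chewbacca uses.

```python
import re

# Assumed pattern: a run of 13 to 19 digits not embedded in a longer run.
CARD_CANDIDATE = re.compile(rb"(?<!\d)\d{13,19}(?!\d)")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    # Walk right to left and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(buffer: bytes) -> list:
    """Return (offset, digits) for each Luhn-valid candidate in a buffer."""
    results = []
    for match in CARD_CANDIDATE.finditer(buffer):
        digits = match.group().decode("ascii")
        if passes_luhn(digits):
            results.append((match.start(), digits))
    return results
```

The checksum step matters because arbitrary binary data contains many digit runs of the right length, and roughly nine out of ten random candidates fail the Luhn test.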
Windows desktop showing no signs of the keylogger
When executed, the malware creates a copy of itself named spoolsv.exe in the Windows Startup folder. It then starts capturing all keyboard events on the system, logging them to a file named system.log.
Chewbacca data collection hook installation
The above figure shows the code Chewbacca uses to install its hook procedure after execution. The malware calls the Windows API SetWindowsHookEx with hook type WH_KEYBOARD_LL, which is used to capture low-level keyboard input. To capture all keystrokes on the system, the malware sets the parameter dwThreadId=0, which associates the hook with all existing threads running in the same desktop as the calling thread.
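The two parameters discussed above are also what an analyst would look at in a recorded call. The Python sketch below interprets them. The numeric hook identifiers are the values defined in the Windows headers, while the function name and the returned fields are assumptions made for this example.

```python
# Hook identifiers as defined in the Windows headers (WinUser.h).
HOOK_TYPES = {
    2: "WH_KEYBOARD",
    7: "WH_MOUSE",
    13: "WH_KEYBOARD_LL",
    14: "WH_MOUSE_LL",
}
KEYBOARD_HOOKS = {"WH_KEYBOARD", "WH_KEYBOARD_LL"}


def describe_hook_call(id_hook: int, thread_id: int) -> dict:
    """Interpret the idHook and dwThreadId arguments of SetWindowsHookEx."""
    name = HOOK_TYPES.get(id_hook, f"unknown({id_hook})")
    captures_keyboard = name in KEYBOARD_HOOKS
    # dwThreadId == 0 means the hook is not tied to a single thread.
    global_scope = thread_id == 0
    return {
        "hook_type": name,
        "captures_keyboard": captures_keyboard,
        "global_scope": global_scope,
        "worth_a_closer_look": captures_keyboard and global_scope,
    }
```

For the call described above, `describe_hook_call(13, 0)` reports a global low-level keyboard hook. Legitimate software such as hotkey managers and accessibility tools installs the same kind of hook, so this is one signal to combine with others, such as what the callback does with the captured keys.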
During its execution, whenever a keypress event happens, the callback KEYLOG_$$_HOOKPROC$LONGINT$LONGINT$LONGINT$$LONGINT will record the keys pressed along with the window title on which the keypress was recorded. Additionally, the software records when the window focus changes, logging the captured information to system.log.
The Lastline analysis engine gives access to files generated and network traffic captured during the analysis
Captured data shown in the analysis report
When analyzing Chewbacca inside Lastline’s high-resolution analysis engine, the system identifies the keylogging functionality, clearly marking it in the report overview. Additionally, the report contains detailed examples of data the malware sample extracted, as can be seen in the figure above.
Text content of the data captured by the keylogger into file C:\Users\…\Temp\system.log
Raw file content of the data captured by the keylogger into file C:\Users\…\Temp\system.log
As one can see, the report even shows examples of the sensitive data that was targeted by the attacker, which, in this case, includes credit card information and user passwords.
Keylogging functionality in advanced malware is a severe threat to user data, but traditional analysis systems are often blind to this vector of attack. Lastline’s high-resolution analysis engine tackles this threat in two ways: First, the analysis engine identifies that a keylogger was installed on the victim machine and identifies the type of information the attacker is targeting. Second, the system uses the collected information to instrument a virtual user to inject specific data to trigger the keylogger’s functionality. The injected data is then tracked throughout the analysis system to monitor how the malware processes (and potentially leaks) the stolen information.