Party like it’s 1999: Comeback of VBA Malware Downloaders [Part 3]


Authored by: Clemens Kolbitsch, Alexander Sevtsov, and Arunpreet Singh

Find more details on this series in Part 1 and Part 2.

Evasive scripts continue to be on the rise. Whether in the form of VBA macros in Microsoft Office documents or in the form of JScript scripts, malware authors are equipping their campaigns with a wide arsenal of tricks to avoid detection by security solutions.

In the first two parts of this blog series, we described how, in a recent wave of VBA-downloader based attacks, malware authors statically obfuscate their code to hinder detection and use fingerprinting to bypass dynamic analysis. In this third part, we focus on a somewhat different attack to circumvent detection: limited coverage of security solutions on the different parts of the attack chain.

As we will show in this post, attackers abuse the fact that a security solution may only see part of the attack. For example, the malware may arrive as an email attachment, but a complete analysis requires the entire email, including the email body and how humans behave when receiving it. Another problem is that a user may have applications installed that handle certain file types differently from the way a security solution handles them.

Social Engineering and Abusing Password-Protected Files

An interesting part of this series of attacks is the social engineering component: since Microsoft protects users by disabling the execution of macros in Microsoft Office by default, the attacker has to trick the victim into enabling macros.

Typically, the attacker does this by claiming that the file cannot be viewed without enabling macros, or that the encoding needs to be fixed (by running a macro) to properly display the content.

Recently, the attackers have taken the social engineering attack a step further: they use the password-protection feature of Microsoft Office documents and embed the required password in the body of the email to which the file is attached. When opening such a document, the user is prompted to enter the password, and Office will not launch any macro code until the user has done so.

Because of this, a security solution that does not know the password (for example, because it has limited visibility into the original email containing the password) may not be able to trigger the execution of the macro and may thus incorrectly assume that the file is harmless.
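To illustrate why visibility into the email body matters, here is a minimal sketch of how password candidates could be pulled from the message text before a protected attachment is opened. The function name `candidate_passwords` and the cue words in the regular expression are assumptions made for this illustration; they do not describe how any particular product handles such emails.

```python
import re

# Hypothetical heuristic: take the token that follows a "password" cue word.
# Real campaigns may phrase this differently or use other languages.
_PASSWORD_CUE = re.compile(
    r"\b(?:password|passcode|pwd)\b\s*(?:is|:|=)?\s*[\"']?([^\s\"']{3,32})",
    re.IGNORECASE,
)


def candidate_passwords(email_body: str) -> list[str]:
    """Return de-duplicated password candidates in order of appearance."""
    candidates = []
    for match in _PASSWORD_CUE.finditer(email_body):
        # Strip punctuation that belongs to the sentence, not the password.
        token = match.group(1).rstrip(".,;:!?)")
        if token and token not in candidates:
            candidates.append(token)
    return candidates
```

Each candidate could then be tried when opening the attachment. A purely textual heuristic like this will miss passwords that are shown in images or split across sentences.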

Targeting Specific Software Versions: Microsoft Office Publisher

One wave of attacks we have observed used malicious macros embedded in Microsoft Office Publisher (PUB) files. At first, this came as a surprise, because we did not expect this software to be widespread enough to be of interest to attackers.

Interestingly however, it turns out that specific versions of Microsoft Office are able to accept this file format. More specifically, starting with Small Office (and all the way to the Enterprise version) Microsoft Office successfully opens and runs any macro code embedded in PUB files. As a result, it is important that a sandbox is able to support even these less-frequently used file formats:

File Type Confusion Attacks

A key part of providing proper dynamic analysis of files is correctly identifying the type of a file. When the file type is not detected correctly, the file may be launched in a way that does not trigger proper display or execution of embedded code. We have seen various campaigns in which malware samples are crafted to mislead the type detection logic in different ways.

Document Templates with Embedded Macros (DOTM)

For some files, we saw the RTF extension being used in the wild, but these files were actually macro-enabled document templates (DOTM). Technically, the DOTM file format is much closer to DOCM than to RTF: both are ZIP containers that store settings in XML files, and they differ only slightly in their ContentType strings. It is therefore very easy to confuse the two, and a security solution may treat them identically:

DOTM
ContentType="application/vnd.ms-word.template.macroEnabledTemplate.main+xml"

DOCM
ContentType="application/vnd.ms-word.document.macroEnabled.main+xml"

This kind of file type confusion may break analysis, because the embedded VBA code is not launched if the file extension is incorrect: the macro is only executed properly if the file has one of the doc, rtf, dot, or dotm extensions.
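The difference between the two ContentType strings can be checked directly from the file content, independent of the extension. The following is a minimal sketch using only the Python standard library. The function name `ooxml_word_kind` is hypothetical, and the sketch assumes the container carries a `[Content_Types].xml` entry with double-quoted attributes.

```python
import io
import re
import zipfile

DOTM_MAIN = "application/vnd.ms-word.template.macroEnabledTemplate.main+xml"
DOCM_MAIN = "application/vnd.ms-word.document.macroEnabled.main+xml"


def ooxml_word_kind(data: bytes):
    """Return 'dotm', 'docm', or None based on the declared content types."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return None
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        try:
            raw = archive.read("[Content_Types].xml")
        except KeyError:
            # ZIP container without a content-type manifest: not OOXML.
            return None
    declared = set(re.findall(r'ContentType="([^"]+)"', raw.decode("utf-8", "replace")))
    if DOTM_MAIN in declared:
        return "dotm"
    if DOCM_MAIN in declared:
        return "docm"
    return None
```

With the result in hand, an analysis pipeline could give the sample an extension that matches its real type before opening it, rather than trusting the name it arrived with.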

The analysis overview in the Lastline sandbox shows that the file type was identified properly. Thus, the VBA code is executed, which, in turn, allows one to see an HTTP request to the C&C server, followed by the download (and subsequent execution) of the payload:

At the time of this specific analysis, the C&C server was not available, so the analysis overview shows that the link is unreachable (which happens frequently for this kind of downloader):

Web Page Archive Format (MHT)

Another trick, discovered recently, was the usage of the web archive (.mht) file format, which is a container combining images, text, and objects from the original Word document into a single file. Malware authors create a document with macros, export it as MHTML, and rename it from its default extension .mht to .doc.

Since the file format is quite different from a traditional Microsoft Office document, a security solution may misidentify the file type and perform the incorrect type of analysis. Office, on the other hand, opens and executes the embedded macro regardless of the modified file extension.
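One way to avoid being misled by a renamed extension is to decide the container type from the leading bytes of the file. The sketch below is an illustration under stated assumptions, not a description of any product's type detection: the function name `sniff_container` is hypothetical, and it assumes that a web archive exported from Word begins with a `MIME-Version` header and a multipart `Content-Type`.

```python
import email

OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (legacy .doc)
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP container (docm, dotm, ...)
RTF_MAGIC = b"{\\rtf"                              # Rich Text Format


def sniff_container(data: bytes) -> str:
    """Guess the container type from content, ignoring the file extension."""
    if data.startswith(OLE_MAGIC):
        return "ole"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(RTF_MAGIC):
        return "rtf"
    # Only the first few KB are needed to read the top-level MIME headers.
    message = email.message_from_bytes(data[:4096].lstrip())
    if message.get("MIME-Version") and message.get_content_maintype() == "multipart":
        return "mhtml"
    return "unknown"
```

A file named `invoice.doc` that this function reports as `mhtml` is a case of the renaming trick described above. A crafted sample may still deviate from these assumptions, so a result of `unknown` should not be read as benign.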

The following screenshot shows the analysis of an example of such a malicious document file:

The analysis overview shows that a new batch script is dropped and executed, which, in turn, downloads and launches a PE file:

Readers with access to the detailed analysis report in the Lastline portal may notice another interesting point: since the PE to download and execute is unavailable, the attacker’s server returns an HTML page instead of an executable. As a result, the script tries to launch a non-executable file with .exe extension, which Windows tries to emulate via ntvdm.exe (which inevitably fails):

Pushing Parser Limits via MIME-Encoded Documents

Another file format used in many attacks earlier this year is the MIME-encoded document type. Similar to the web page archive format described above, this file type embeds multiple parts of a document in a single text file (and uses MIME-encoding, as the name suggests).

To identify potentially malicious content in such a file, a security solution must be able to correctly recognize each individual embedded file stream (for example, to decide whether detailed analysis in the sandbox is required). Unfortunately, the parsers used in Microsoft Office allow attackers a wide degree of freedom to diverge from the standard for MIME encoding without losing the ability to open the document.

Thus, in this attack scenario, a security solution needs to not only mimic a realistic user environment and simulate user behavior, but also replicate the quirks of Microsoft Office to ignore any noise the attacker may have included in the MIME stream. The following example shows such noise:

The malicious part of the document is the ActiveMime stream (named xdxFCGVHBJ.mso). If this part is extracted according to the standard, the ActiveMime header contains binary garbage, meaning that the file type cannot be detected correctly:

Note that there are two types of noise in this example: binary content hides the actual ActiveMime header, and the stream is missing the double newline (\r\n\r\n) that most parsers expect to mark the start of the stream content. Only by detecting and skipping this noise (as Microsoft Office seems to do) can the stream be extracted correctly.
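A tolerant extractor for the two kinds of noise described above could look like the following sketch. It is not a reconstruction of what Microsoft Office does internally. The function names are hypothetical, and the sketch assumes that the part is base64-encoded, that header lines have the usual `Name: value` shape, and that the binary garbage ends up as leading bytes in front of the `ActiveMime` magic string once the body is decoded.

```python
import base64
import re

ACTIVEMIME_MAGIC = b"ActiveMime"
_HEADER_LINE = re.compile(r"^[A-Za-z0-9-]+:")


def split_headers_tolerantly(part: str):
    """Split a MIME part into (header_lines, body) without requiring a blank line."""
    lines = part.splitlines()
    index = 0
    while index < len(lines):
        line = lines[index]
        if not line.strip():
            break  # a regular blank separator line
        is_continuation = index > 0 and line[:1] in (" ", "\t")
        if not (_HEADER_LINE.match(line) or is_continuation):
            break  # first line that cannot be a header starts the body
        index += 1
    body_lines = lines[index:]
    if body_lines and not body_lines[0].strip():
        body_lines = body_lines[1:]  # drop the separator if it was present
    return lines[:index], "\n".join(body_lines)


def extract_activemime(part: str):
    """Return the decoded stream starting at the ActiveMime magic, or None."""
    _, body = split_headers_tolerantly(part)
    try:
        decoded = base64.b64decode(body, validate=False)
    except ValueError:
        return None  # body is not decodable base64
    offset = decoded.find(ACTIVEMIME_MAGIC)
    return decoded[offset:] if offset >= 0 else None
```

Searching for the magic string instead of expecting it at offset zero is what lets the sketch skip leading garbage. The header scan replaces the usual split on the first blank line, so a missing separator no longer hides the body.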

Summary

Mimicking a real user environment as realistically as possible is becoming a key necessity to correctly analyze, and in turn detect, the waves of VBA downloaders we have seen in recent months. This has many facets and ranges from providing the same software in the analysis sandbox as users have installed in their environment, all the way to having insights into the entire attack chain.

As we have shown in this post and its two previous parts in the blog series, Lastline is able to provide a realistic environment in which the malicious documents are opened, and it is able to react to social engineering attacks against the user in a way that tricks the malicious code into revealing its true nature. As a result, the Lastline sandbox is able to recognize the malicious behavior and classify these attacks successfully.


Clemens Kolbitsch


Clemens is a Security Researcher and engine developer at Lastline. As a lead developer of Anubis, he has gained profound expertise in analyzing current malicious code found in the wild. He has observed various trends in the malware community and successfully published peer-reviewed research papers. In the past, he also investigated offensive technologies, presenting results at conferences such as BlackHat.