Detecting the Increased Threat of Android-based Malware

With the significant growth of the Android operating system, cybercriminals are increasingly using the platform for malicious purposes, and organizations can no longer ignore these threats. This post describes effective techniques for detecting and thwarting Android-based malware.

In 2017, Android overtook Microsoft Windows as the most popular operating system for getting online. According to Google, there are now more than 2 billion Android devices in the world. The tremendous growth, and now dominance, of Android hasn’t escaped the attention of malware authors. After several years of relatively minimal activity, we are beginning to see more and more Android-based malware in the wild, both in our own labs and in numerous recent publications by Forbes, SC Magazine, ZDNet, and others. In the past, most criminals wrote malware for Windows environments. While we can expect Windows-based malware to remain dominant for some time, we can no longer ignore malware on other platforms, Android in particular.

Advanced Static Analysis Efficiently Identifies Android-Based Malware

In Windows and MacOS X environments, most applications are delivered in binary format. Unfortunately, it’s quite difficult to identify malicious capabilities in binary code just by examining the code itself, especially if it’s been packed or obfuscated. It’s often necessary to evaluate Windows and MacOS X applications using dynamic analysis. This requires executing each program in some sort of virtual machine or sandbox and detecting malicious behaviors such as connecting to a suspicious site on the Internet or modifying the system’s boot-up process. This dynamic analysis approach, although effective, introduces latency and complexity.

Most Android applications are delivered in bytecode instead of binary format. From a security and malware detection viewpoint, this is good news. It’s much harder for malware authors to obfuscate bytecode, and it’s a lot easier for security tools to interpret it. For most Android apps, malware detection technologies can use static analysis to identify harmful code. This means that security tools can examine the bytecode directly to locate malicious capabilities. Static analysis does not require the code to actually run. So, in those situations where malware detection tools can use static analysis, the detection process is faster.

Although security tools can rapidly perform static analysis, to be effective, the system must use advanced technologies that are constantly updated. Apps can’t be evaluated in isolation because they might use other malicious components. The system needs to identify and track data flows, and the tools must carefully appraise triggers to prevent a high number of false positives. Let’s take a closer look at some of these important characteristics.

Tracking Data Flow

To understand what Android apps are doing and effectively detect malicious capabilities, it’s important to track the flow of data. Efficiently doing so is a core part of Android analysis and malware detection. Tracking data flow means understanding how and where data, let’s say an address book, is copied, stored, and transmitted. We want to know how the app accesses and processes the data, and where that data ultimately ends up. Is the address book sent to a known malicious site on the internet? Is the picture you just took with your camera accessed in a suspicious manner? Is data stored by your Amazon or other ecommerce apps accessed or transmitted? Tracking data flow is very important.

Advanced static analysis is very good at tracking data flows. A good static analysis system can analyze the bytecode and determine all of the data sets the app can access. It can also tell whether the app stores the data on the platform’s file system, transmits it over the network, or simply presents it to the user on the display. If the system identifies suspicious data access, alteration, or transmission, it can take appropriate action such as performing additional tests or raising an alert.
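To make the idea of data flow tracking concrete, here is a minimal sketch of taint propagation over a toy, straight-line instruction list. It is only an illustration, not Lastline’s implementation: the instruction format, the register names, and the source and sink names (read_contacts, send_network, and so on) are all hypothetical, and real Android bytecode analysis also has to handle branches, method calls, fields, and more.

```python
from dataclasses import dataclass

# Hypothetical source and sink names, chosen for illustration only.
SOURCES = {"read_contacts", "read_photos"}
SINKS = {"send_network", "write_file", "show_on_display"}


@dataclass(frozen=True)
class Instr:
    op: str          # "call" or "move"
    dst: str = ""    # register that receives a result, if any
    src: str = ""    # register that is read, if any
    name: str = ""   # name of the called function for "call"


def find_flows(program):
    """Return sorted (source, sink) pairs found in straight-line code."""
    taint = {}     # register -> set of source names its value derives from
    flows = set()
    for ins in program:
        if ins.op == "call" and ins.name in SOURCES:
            # The result of a source call is tainted by that source.
            taint[ins.dst] = {ins.name}
        elif ins.op == "move":
            # Copying a value copies its taint; copying a clean value clears it.
            taint[ins.dst] = set(taint.get(ins.src, set()))
        elif ins.op == "call" and ins.name in SINKS:
            # Tainted data reaching a sink is recorded as a flow.
            for source in taint.get(ins.src, set()):
                flows.add((source, ins.name))
    return sorted(flows)


program = [
    Instr("call", dst="v0", name="read_contacts"),
    Instr("move", dst="v1", src="v0"),
    Instr("call", src="v1", name="send_network"),
    Instr("call", dst="v2", name="read_photos"),
    Instr("call", src="v2", name="show_on_display"),
]
flows = find_flows(program)
for source, sink in flows:
    print(f"{source} -> {sink}")
```

In this toy program the address book reaches the network, while the photo only reaches the display. A detection system would treat the first flow as worth a closer look and the second as routine.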

Identifying Suspicious Data Values

Tracking data flow is critical. But it’s also necessary to identify, understand, and test specific data values. For instance, if data is transmitted over the Internet, it’s important to know the destination domains. We also want to know if the attacker can select the domain dynamically via a configuration file or some other method. By evaluating the program’s data values, and how it uses those values in the app’s decision-making process, the malware detection engine will learn the app’s capabilities and potential behaviors.

Of course, it’s not just malicious apps that transmit data over the network. Many apps will send their version data to an upgrade server, or transmit user input to a web application. To effectively detect and distinguish malicious activity in an Android environment, the security system must gather a comprehensive picture of all the data flows and values, and test that information against known malicious types of behavior.
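As a rough illustration of evaluating destination values, the sketch below labels network destinations that an earlier analysis pass is assumed to have extracted from an app. Everything specific here is an assumption made for the example: the blocklist contents, the (value, origin) input format, and the three labels. A destination loaded from configuration is labeled separately because its value cannot be read from the code alone.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real system would use maintained threat intelligence.
KNOWN_BAD_DOMAINS = {"evil.com"}


def host_of(url):
    """Extract the lower-cased host name from a URL string."""
    return (urlsplit(url).hostname or "").lower()


def on_blocklist(host, blocklist):
    """Match a listed domain itself or any of its subdomains."""
    return any(host == bad or host.endswith("." + bad) for bad in blocklist)


def assess(destinations, blocklist=KNOWN_BAD_DOMAINS):
    """Label each (value, origin) destination.

    origin is "constant" for a URL hard-coded in the app, or "config" when the
    URL is loaded at run time, in which case value is None.
    """
    labels = []
    for value, origin in destinations:
        if origin == "config":
            labels.append("dynamic")      # destination is not fixed in the code
        elif on_blocklist(host_of(value), blocklist):
            labels.append("known-bad")
        else:
            labels.append("unlisted")
    return labels


destinations = [
    ("https://update.example.org/version", "constant"),
    ("http://cdn.evil.com/upload", "constant"),
    (None, "config"),
]
labels = assess(destinations)
print(labels)
```

None of these labels is a verdict on its own. An upgrade check to an unlisted server is ordinary, and a configurable destination is common in legitimate apps, which is why the labels feed into a broader picture of the app’s data flows.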

Android Malware Triggers

Tracking data values and flows will capture many important factors when it comes to detecting malware, but it’s also very important to find and analyze what we call malware triggers. Many types of malware will wait for some event to occur, and when it does, it triggers the malicious behavior. Some malware will wait for a specific date or time, and some will trigger when the device is in a certain location. More commonly, communication with a command and control server will trigger the malicious app. Sometimes malware will wait until a specific domain goes live. Occasionally the receipt of a specific SMS message triggers the malware.

Finding and understanding triggers within the app’s bytecode is key in determining if an app is malicious or not. But it’s necessary to distinguish between legitimate triggers and malicious ones. For example, some legitimate apps take action based on the content of an SMS message. Android phone owners have all experienced a popup telling them that they just received a text message. However, if the malware detection system discovers bytecode that attempts to root a phone (obtain privileged access to the operating system code) when it receives a specific SMS text message, the app is clearly malicious. Unfortunately, the real world is not so cut and dried, and security tools require a great deal of intelligence and advanced analytics to accurately distinguish between legitimate and malicious triggers.

Accurately Detecting Malicious Triggers

Over the course of many years, Lastline created an advanced set of heuristics that we use to test both legitimate and malicious apps. The heuristics analyze bytecode and the characteristics associated with the triggers we find. One of the key things we’ve learned is that malicious triggers tend to use data values that are very narrow or specific. Examples include an app that performs an action when a very specific domain like “evil.com” suddenly appears, or executes an action when the current date and time is greater than “9:00am EST on Sept 11 2018”. We also discovered that the more complex a trigger is, the more likely it is that the app will be malicious.

But even very specific values and complex decision making don’t necessarily indicate malicious intent. For example, a large number of legitimate apps use time, location, and SMS text messages as triggers, and some of the values are quite specific. Malware detection tools that focus on these characteristics alone will generate an unacceptable number of false positives. However, by carefully evaluating the triggers found in tens of thousands of Android apps, Lastline has developed algorithms that very accurately score the risks, detecting virtually all malicious apps while producing very few false positives.
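The reasoning above can be sketched as a simple scoring function. This is not Lastline’s algorithm: the fields, weights, and thresholds are invented for illustration. The point it shows is that a narrow value or a complex check raises the score only slightly, and a trigger becomes suspicious mainly when it also guards a sensitive action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerCheck:
    kind: str                       # "time", "location", "sms", or "domain"
    compares_to_constant: bool      # tests against one hard-coded value
    num_conditions: int             # conditions combined in the check
    guards_sensitive_action: bool   # guarded code does something risky


def trigger_score(check):
    """Combine the signals into a score; all weights are arbitrary examples."""
    score = 0
    if check.compares_to_constant:
        score += 1                  # narrow, specific value
    if check.num_conditions >= 3:
        score += 1                  # unusually complex check
    if check.guards_sensitive_action:
        score += 2                  # what the trigger unlocks matters most
    return score


def is_suspicious(check, threshold=3):
    """Flag a trigger whose score reaches the (assumed) threshold."""
    return trigger_score(check) >= threshold


# A messaging app showing a popup for any incoming SMS.
sms_popup = TriggerCheck("sms", False, 1, False)
# A reminder app firing at one specific date and time.
reminder = TriggerCheck("time", True, 1, False)
# Code that attempts to root the phone when one specific SMS arrives.
sms_rooter = TriggerCheck("sms", True, 3, True)

for check in (sms_popup, reminder, sms_rooter):
    print(check.kind, trigger_score(check), is_suspicious(check))
```

With these example weights, the reminder app’s very specific time check is not flagged by itself, which is the false positive a tool focused only on narrow values would produce.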

Lastline evaluated 10,000 legitimate Google marketplace apps that contained triggers that other malware tools might incorrectly classify as malicious. Among this potentially suspicious looking set of apps, Lastline’s analytics generated very few false positives. We discovered that around 5,000 apps use time, 3,400 use location, and a little over 1,000 use SMS. So there definitely are a lot of apps that use these types of information. However, as shown in the chart below, a much smaller number of these apps use this information to perform checks of any kind, and an even smaller number perform checks that are overly complex or based on very narrow, specific values. In the end, only a few apps have suspicious looking checks that guard potentially malicious behavior. Only these few would potentially be misclassified as malicious when in fact they were not.

So, out of 10,000 legitimate apps, our tests incorrectly identified as malicious only 10 apps that use time, 8 that use location, and 17 that use SMS messages. That’s an extremely low false positive rate: 0.35% to be precise. This data shows that given the right static analysis and heuristics, normal apps will very rarely produce a false positive.
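The 0.35% figure follows directly from the counts above. The short calculation below reproduces it, assuming the three groups do not overlap (no single app is counted under more than one trigger type), which is what adding the counts implies.

```python
# Misclassified legitimate apps per trigger type, as reported above.
false_positives = {"time": 10, "location": 8, "sms": 17}
total_apps = 10_000

# Assumes the three groups are disjoint, so the counts can simply be added.
fp_total = sum(false_positives.values())
fp_rate_percent = 100 * fp_total / total_apps

print(f"{fp_total} of {total_apps} apps flagged: {fp_rate_percent:.2f}%")
# prints "35 of 10000 apps flagged: 0.35%"
```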

[Chart: Static Analysis of 10,000 Google apps]

Lastline performed even better at detecting apps that actually are malicious. When our software was tested against known malicious apps, the system identified every one of them as malware, a perfect score of 100 percent. See the NSS Labs Breach Detection Group Test for additional information about how effectively Lastline detects malware, as measured by an independent testing organization.

Organizations Must Begin Now to Earnestly Combat Android-Based Malware

During the last several years, the dramatic increase in cybercrime has kept nearly all organizations busy adding various security tools. Until now, Android-based malware wasn’t very high on the list, and most organizations frankly ignored it completely. But its recent growth and prevalence, especially in light of the number of personal devices being used in corporate environments, have reached the point where organizations can no longer afford to ignore it.

Fortunately, the latest properly designed malware detection systems are very effective at detecting and thwarting Android-based malware. Those enterprises that begin now to deploy these tools will be better prepared to detect and defend against the attacks that are sure to come.

Dr. Christopher Kruegel

Currently on leave from his position as Professor of Computer Science at UC Santa Barbara, Christopher Kruegel’s research interests focus on computer and communications security, with an emphasis on malware analysis and detection, web security, and intrusion detection. Christopher previously served on the faculty of the Technical University Vienna, Austria. He has published more than 100 peer-reviewed papers in top computer security conferences and has been the recipient of the NSF CAREER Award, MIT Technology Review TR35 Award for young innovators, IBM Faculty Award, and several best paper awards. He regularly serves on program committees of leading computer security conferences. Christopher was the Program Committee Chair of the Usenix Workshop on Large Scale Exploits and Emergent Threats (LEET, 2011), the International Symposium on Recent Advances in Intrusion Detection (RAID, 2007), and the ACM Workshop on Recurring Malcode (WORM, 2007). He was also the head of a working group that advised the European Commission (EC) on defenses to mitigate future threats against the Internet and Europe's cyber-infrastructure.