Machine Learning Is Transforming Malware Detection

Machine Learning is an important component in detecting advanced malware, but to be effective it must be well-grounded with known threat intelligence.

Dr. Giovanni Vigna, Co-founder and CTO of Lastline, presented his thoughts regarding advanced malware protection at this year’s RSA conference in San Francisco. He spoke about how machine learning is transforming malware detection, especially when it is well-grounded with known threat intelligence.

During his presentation, Dr. Vigna outlined malware detection’s evolution from technologies based on models of known malware to more recent methodologies that model good network behavior.

Technologies that use signatures or models of previously recognized malware are quite effective in detecting well-known malicious objects. However, they have two serious shortcomings. First, they require an enormous amount of time from skilled security experts to develop, and second, they can’t detect new strains of malware. To address these inadequacies, there’s a new breed of malware detection tools based on modeling good or normal network behavior.
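
The signature approach and its blind spot can be illustrated with a minimal sketch. It assumes the simplest possible form of signature, an exact SHA-256 digest of a previously analyzed sample; real products use far richer models, and the sample bytes and names below (`KNOWN_BAD_SAMPLES`, `matches_signature`) are hypothetical.

```python
import hashlib

# Hypothetical stand-ins for samples that analysts have already examined.
KNOWN_BAD_SAMPLES = [b"malicious payload v1", b"malicious payload v2"]

# The "signature database": one SHA-256 digest per known sample.
SIGNATURES = {hashlib.sha256(sample).hexdigest() for sample in KNOWN_BAD_SAMPLES}


def matches_signature(content: bytes) -> bool:
    """Return True if the content's digest is in the signature set."""
    return hashlib.sha256(content).hexdigest() in SIGNATURES


# A previously seen sample is caught.
known = matches_signature(b"malicious payload v1")

# A new strain, even a slightly modified one, has a different digest and is missed.
variant = matches_signature(b"malicious payload v3")

print(known, variant)
```

The second lookup fails because nobody has yet written a signature for the new variant, which is the second shortcoming described above.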

The idea is that by developing models of normal network behavior, it becomes easy to spot abnormal behaviors. The assumption is that normal events are good; therefore, abnormal events must be bad, or at least suspicious.

Machine Learning Can Spot Deviations

Machine learning and artificial intelligence are very good at processing massive amounts of data and building models of normal behavior, and they can spot deviations from those models with relative ease. In theory, this solves the two major inadequacies of tools that rely on models of previously known malware. First, it’s no longer necessary to develop signatures or complex models of existing malicious objects, which frees precious security staff for other tasks. Second, the technology also detects new malware strains.
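
As a rough illustration of modeling normal behavior, the sketch below learns a baseline from past observations and flags values that deviate strongly from it. It is not the method from the presentation: it assumes a single hypothetical feature (megabytes transferred per hour by one host) and a simple z-score test, where production systems would use many features and far more capable models.

```python
from statistics import mean, stdev


def fit_baseline(values):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(values), stdev(values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # Degenerate baseline: any departure from the constant value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical history: MB transferred per hour by one host on ordinary days.
history_mb = [120, 135, 110, 128, 140, 125, 131, 118]
baseline = fit_baseline(history_mb)

typical = is_anomalous(130, baseline)  # close to the learned baseline
spike = is_anomalous(900, baseline)    # far outside anything seen before

print(typical, spike)
```

No signature was written for the 900 MB spike; it stands out only because it differs from what the model learned as normal.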

Good Behavior Versus Bad Behavior

On the surface, it appears that modeling normal network behavior is the holy grail of malware detection. Unfortunately, it’s not that simple. Normal network events are not always good, and abnormal events are not always bad. For example, the first time an employee works at two o’clock in the morning is an anomaly, but it’s not necessarily a threat. It might simply be a legitimate employee working in the middle of the night for the first time. Likewise, a normal-looking data transfer might actually be insider theft. The basic assumptions that normal is good and abnormal is bad don’t hold up, at least not all the time.

So, unless enhanced, a machine learning tool will, by its very nature, generate many false positives and false negatives. The false positives flood the security staff with events to investigate. Tuning the tool to generate fewer alerts helps, but it comes at the risk of missing actual malicious events.
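
The trade-off can be made concrete with a small sketch. The anomaly scores and labels below are synthetic and invented purely for illustration; `count_errors` is a hypothetical helper that counts both kinds of mistake at a given alert threshold.

```python
# Synthetic events: (anomaly_score, is_actually_malicious).
EVENTS = [
    (0.95, True), (0.90, False), (0.80, True), (0.70, False),
    (0.60, False), (0.55, True), (0.40, False), (0.20, False),
]


def count_errors(events, threshold):
    """Return (false_positives, false_negatives) when alerting on score >= threshold."""
    false_pos = sum(1 for score, bad in events if score >= threshold and not bad)
    false_neg = sum(1 for score, bad in events if score < threshold and bad)
    return false_pos, false_neg


low = count_errors(EVENTS, 0.50)   # sensitive setting
high = count_errors(EVENTS, 0.85)  # quieter setting

print(low)   # (3, 0): three benign events raise alerts, nothing malicious is missed
print(high)  # (1, 2): fewer alerts, but two malicious events slip through
```

With this data, raising the threshold cuts the false alarms from three to one while the missed malicious events rise from zero to two.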

A Well-grounded System Is Crucial

To address these issues, machine learning algorithms must be well-grounded. Dr. Vigna defined this concept as machine learning algorithms that detect abnormal behaviors but also use models of known threats to refine the resulting actions or alerts. In a well-grounded system, an anomaly detection engine identifies all unusual behaviors, and those behaviors are then tested against known malicious characteristics and behaviors. A large file transfer at two o’clock in the morning might be an anomaly, but it’s probably benign unless it’s also associated with something that’s known to be malicious.

To be effective at detecting advanced malware, anomalous events require further tests for associations with known malicious entities or capabilities like:

  • Known compromised hosts
  • Known malicious IP addresses or geographic locations
  • Anonymizing networks
  • Unusual encryption capabilities
  • Known C&C (command and control) systems
  • Other known adversaries
  • Unauthorized services or other processes
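
A well-grounded check of this kind might be sketched as follows. This is an illustrative assumption, not Lastline's implementation: the indicator sets, the `Event` fields, and the three outcomes are hypothetical, only three of the listed associations are modeled, and the IP addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass

# Hypothetical threat intelligence feeds (documentation-range addresses).
KNOWN_MALICIOUS_IPS = {"203.0.113.7"}
KNOWN_CC_SERVERS = {"198.51.100.23"}
ANONYMIZING_EXITS = {"192.0.2.55"}


@dataclass
class Event:
    description: str
    remote_ip: str
    is_anomalous: bool  # verdict from a separate anomaly detection engine


def grounding_reasons(event):
    """List the known-threat associations that apply to this event."""
    reasons = []
    if event.remote_ip in KNOWN_MALICIOUS_IPS:
        reasons.append("known malicious IP")
    if event.remote_ip in KNOWN_CC_SERVERS:
        reasons.append("known C&C system")
    if event.remote_ip in ANONYMIZING_EXITS:
        reasons.append("anonymizing network")
    return reasons


def triage(event):
    """Alert only when an anomaly is also tied to something known to be malicious."""
    if not event.is_anomalous:
        return "no action"
    return "alert" if grounding_reasons(event) else "log"


late_backup = Event("large transfer at 2 a.m.", "192.0.2.10", True)
late_exfil = Event("large transfer at 2 a.m.", "198.51.100.23", True)
routine = Event("daytime web browsing", "192.0.2.10", False)

print(triage(late_backup), triage(late_exfil), triage(routine))
```

The two late-night transfers look identical to the anomaly engine; only the association with a known C&C system turns the second one into an alert.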

In summary, Dr. Vigna made a compelling presentation. Machine learning and artificial intelligence are adding exciting new malware detection capabilities. However, unless these tools are well-grounded with intelligence about known threats and the way they operate, they are not effective.

Brian Laing

For more than 20 years, Brian Laing has shared his strategic business vision and technical leadership with a range of start-ups and established companies in various executive-level roles. The author of “APT for Dummies,” he was previously vice president of AhnLab, where he directed the US operations of the internationally known security and software leader. Brian previously founded Hive Media, where he served as CEO. He co-founded RedSeal Systems, where he conceived the overall design and features of the product and was granted two patents related to network security. He was also founder and CEO of self-funded Blade Software, which released the industry’s first commercial IPS/FW testing tool.