How Machine Learning for Cybersecurity Can Thwart Insider Threats

While there are innumerable cybersecurity threats, the end goal for many attacks is data exfiltration. Much has been said about using machine learning to detect malicious programs, but it’s less common to discuss how machine learning can aid in identifying other types of notable threats.

Critically, machine learning can be key in detecting one of the most insidious types of malicious actors: one with legitimate access to your network and systems. When properly trained, machine-learning algorithms can be used to identify insider threats and fraud before they become dangerous.

What Defines an Insider Threat?

When people hear the term “insider threat,” many of them imagine an employee gone rogue, a disgruntled member of your team committing corporate espionage and leaking sensitive data or documents to competitors or criminals.

This type of threat certainly does happen, but it’s vanishingly rare. When cybersecurity professionals discuss an insider threat, they are almost always referring to the far more common scenario in which a malicious external actor has gained access to your network not through exploits or brute force, but through compromised credentials: a legitimate, albeit fraudulently obtained, user name and password.

In other words, the most common type of insider threat is that of someone obtaining an employee’s login information and using it to access your system. Ordinary protection against malicious attacks and malware is unable to defend against this type of attack; to an unsophisticated system, a foreign actor using legitimate credentials is indistinguishable from an employee working remotely.

These credentials can be obtained in a number of ways, typically through human error on the part of the targeted employee:

  • Phishing. The most common type of cyberattack, phishing schemes involve a malicious actor posing as a trusted source – for instance, a bank or IT help desk – in order to obtain private information, like bank account numbers or passwords.
  • Malware. While insider threats typically do not involve any kind of detectable malware inside your network, malware can be used to obtain the credentials in question. For instance, an employee who clicks on a suspicious link or inserts an unknown USB drive into their computer might inadvertently install a keylogger program, which captures their work password.
  • Repeated passwords. If your employee uses a password for work that they also use on any other system, a data breach may result in this password winding up in the hands of cybercriminals, without any hint of suspicious behavior in your systems.

As with many common types of cyberattacks, these attacks exploit social engineering strategies to trick your employees into making critical mistakes. The best defense, of course, is proper training to ensure your employees can recognize and avoid typical cyberattacks.

However, let us assume that, through natural human fallibility, the worst has happened: Your employee’s credentials have fallen into the hands of an attacker, who is now, apparently legitimately, inside your system. How can machine learning protect you?

Stopping Internal Threats

Machine learning can be used to identify internal threats as well as malicious programs. The same behavioral analysis that flags suspicious software can be applied to the actions user accounts take within a system: the model learns what regular activity looks like and flags departures from it.

Suspicious activities may include:

  • Downloading an excessive number of files
  • Printing an excessive number of files
  • Connecting from strange locations, such as locations in other countries
  • Visiting unusual sites or accessing sites through an unknown VPN

Of course, most insider incidents are accidental, not malicious, but suspicious activity performed by a user account may also be a sign that an attacker has compromised that account.
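As a rough illustration of the behavioral idea above, the sketch below compares one day's activity for an account against that account's own history. It is a minimal statistical stand-in, not a trained model and not any vendor's method: the function names, the z-score threshold of 3.0, and the sample counts are all assumptions made for the example.

```python
from statistics import mean, stdev

def is_anomalous(history, today, threshold=3.0):
    """Return True if today's count is far above this account's own baseline.

    history: past daily counts (e.g. files downloaded) for one account.
    threshold: number of standard deviations treated as suspicious (assumed).
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today > mu  # perfectly flat history: any increase stands out
    return (today - mu) / sigma > threshold

def unusual_location(seen_countries, country):
    """Return True if the account has never connected from this country."""
    return country not in seen_countries

# Hypothetical daily file-download counts for one account
baseline = [12, 9, 15, 11, 10, 14, 13]

print(is_anomalous(baseline, 16))    # a slightly busy day
print(is_anomalous(baseline, 400))   # a bulk download
print(unusual_location({"US", "CA"}, "US"))
print(unusual_location({"US", "CA"}, "RO"))
```

Scoring each account against its own history, rather than against a single company-wide limit, is what lets a check like this separate a normally heavy user from a normally light account that suddenly downloads hundreds of files.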

Data exfiltration is commonly the objective of attackers, and in order to prevent it, certain malicious network activity needs to be identified:

  • C&C communication. Command and control (C&C) communication occurs when an attacker-controlled machine issues commands to another machine, often one that has been rooted or otherwise compromised by malicious programs.
  • Changing permissions. Attempting to change permissions often indicates that an individual is attempting to gain access to files that they otherwise shouldn’t be allowed to. This can commonly happen when a malicious attacker has accessed a user account, but that user account doesn’t have the authority the attacker needs.
  • Suspicious data movement. If data is rapidly accessed or transferred, it may indicate that someone is searching for a certain type of data or is moving, or preparing to move, large amounts of data out of the network.
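The "suspicious data movement" signal can be sketched as a sliding-window counter over transfer events. This is only an illustration under stated assumptions: the class name, the window length, and the fixed byte limit are hypothetical, and the fixed limit stands in for a threshold that a machine-learning system would learn from normal traffic.

```python
from collections import deque

class TransferMonitor:
    """Track bytes moved by one account within a sliding time window."""

    def __init__(self, window_seconds=300, max_bytes=500_000_000):
        self.window_seconds = window_seconds  # assumed window length
        self.max_bytes = max_bytes            # assumed limit, not a learned value
        self.events = deque()                 # (timestamp, num_bytes) pairs
        self.total = 0

    def record(self, timestamp, num_bytes):
        """Add a transfer; return True if the window total exceeds the limit."""
        self.events.append((timestamp, num_bytes))
        self.total += num_bytes
        # Drop events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_bytes = self.events.popleft()
            self.total -= old_bytes
        return self.total > self.max_bytes

# Small numbers so the behavior is easy to follow
monitor = TransferMonitor(window_seconds=60, max_bytes=1000)
for ts, size in [(0, 400), (30, 500), (50, 200), (200, 100)]:
    print(ts, size, monitor.record(ts, size))
```

The third event pushes the 60-second total past the limit and is flagged; by the fourth event the earlier transfers have aged out of the window, so the total drops back below it.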

By focusing on identifying these behaviors, a cybersecurity solution can identify many different types of threats. This is like securing a safe by detecting the safe being opened, rather than securing the safe with a perimeter wall: It’s a more direct method of security that requires far fewer resources overall. A perimeter wall isn’t going to alert you if someone goes over it or under it, but the safe will always see activity when someone tries to access it.

Key Considerations

Machine learning is not without drawbacks. It uses samples to learn what the regular activity is for a system, which means it can be tricked: if a system is fed the wrong data over time, it may learn that this incorrect data is regular. This is known as data poisoning, a form of adversarial machine learning, and it is a rapidly evolving attacker response to the use of machine learning for security.
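A toy example can make this weakness concrete. The detector below keeps an adaptive baseline and flags any value more than twice that baseline; everything about it (the function name, the smoothing factor, the multiplier, and the sample values) is an assumption chosen for illustration, not a description of a real product.

```python
def run_detector(values, start_baseline=10.0, alpha=0.3, factor=2.0):
    """Return the indices of flagged values.

    The baseline is an exponential moving average that absorbs every
    value the detector does not flag, so it "learns" from what it sees.
    """
    baseline = start_baseline
    flagged = []
    for i, value in enumerate(values):
        if value > factor * baseline:
            flagged.append(i)
        else:
            baseline = (1 - alpha) * baseline + alpha * value
    return flagged

# A sudden jump in activity is caught.
sudden = [10, 11, 9, 100]

# An attacker who raises activity by 20% a day reaches a similar level
# (about 107 on the last day) while the baseline quietly follows along.
gradual = [10 * 1.2 ** day for day in range(14)]

print(run_detector(sudden))   # [3]
print(run_detector(gradual))  # []
```

Because each small increase is accepted as normal and folded into the baseline, the final value is never compared against the original behavior, only against the already-shifted one.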

Another method cybercriminals may use is to analyze a detection system, identify what it treats as a legitimate program, and then supply a malicious program that behaves in exactly that way. This is known as mimicry. By reverse engineering what a known antivirus solution considers legitimate, cybercriminals can thwart detection.

More important, however, is the fact that human supervision is still required to analyze any notifications generated by machine-learning tools and to judge whether or not they indicate a threat. Machine learning isn’t intended to replace all human management and maintenance. Rather, it’s intended to be a time saver: it reduces the amount of time that IT professionals must spend managing their security by providing an initial analysis and by automatically generating alerts with as much detail and context as possible to speed remediation.

Analysts are still necessary to react swiftly to any truly malicious or suspicious activity. There are also challenges related to training. If machine-learning cybersecurity solutions are not trained correctly, they may create a prohibitive number of false positives or may miss threats entirely.
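The trade-off between false positives and missed threats is usually measured with precision (the share of alerts that were real threats) and recall (the share of real threats that produced an alert). The sketch below computes both at two alert thresholds; the scores and labels are made-up sample data, and the function name is hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when alerting on scores >= threshold.

    scores: anomaly scores produced by some model (hypothetical here).
    labels: 1 if the event was a real threat, 0 if it was benign.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up scores for eight events, three of which were real threats
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.95]
labels = [0,   0,   0,    1,   0,   1,   0,   1]

# Low threshold: every threat is caught, but half the alerts are false positives.
print(precision_recall(scores, labels, 0.3))

# High threshold: no false positives, but two of three threats are missed.
print(precision_recall(scores, labels, 0.9))
```

Neither threshold is right in general; the numbers simply show why analysts still need to tune and review what the tool reports.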

Are you interested in learning more about machine learning or data exfiltration? Lastline provides next-generation machine-learning technology that can be trained to identify the potential hallmarks of threats and minimize potential false positives. Learn more about machine learning for cybersecurity here.

Andy Norton

Andy has been involved in cyber security best practice for over 20 years, specializing in establishing emerging security technologies at Symantec, Cisco and FireEye. In that time, he has presented threat and intelligence briefings for both the Bush and Obama administrations, the Cabinet Office, the Foreign and Commonwealth Office, SWIFT, Swiss National Bank, the Prudential Regulation Authority, the Bank of England, the Hong Kong Monetary Authority and NASA. Returning to Europe from Asia in 2011, he has spent the past 5 years helping many of the FTSE 250 companies measure, manage and respond to cyber incidents.