From Trapping to Hunting: Intelligently Analyzing Anomalies to Detect Network Compromises
Breach Detection Systems (BDS) trap attacks that display sufficient evidence of a possible breach, but they risk false positives when the sensitivity threshold is set too low. Hunting with anomaly detection systems can uncover the attacks that the BDS does not trap.
Breach Detection Systems identify patterns of events in order to detect network compromises. Event streams include:
- Network activity – for example, DNS activity and HTTP requests
- Host activity – for example, user logins and the invocation of programs
- The analysis of various artifacts that are observed in the network – for example, the reports produced by sandboxing technology that characterizes the behavior of programs and documents
Detect Network Compromises
One of the goals of a BDS is to provide the most effective automated detection with minimal false positives, because excessive false positives cause “fatigue” in the incident responder. This means that the sensitivity threshold of a BDS must be set so that an alert is generated only when a substantial amount of supporting evidence has been gathered.
For activity that falls below the established threshold, the information gathered by a BDS can still be used to hunt for attacks that cannot be automatically identified with a high degree of confidence. These attacks lack a clear pattern of malicious behavior, but they are identifiable because their actions are anomalous when compared to normal network traffic.
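The split between alerting and hunting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Finding` record, the 0.0 to 1.0 `evidence_score` scale, and the `ALERT_THRESHOLD` value are hypothetical and do not describe any particular BDS product.

```python
from dataclasses import dataclass

ALERT_THRESHOLD = 0.8  # hypothetical value chosen for illustration


@dataclass
class Finding:
    host: str
    description: str
    evidence_score: float  # hypothetical scale: 0.0 (no evidence) to 1.0 (conclusive)


def triage(findings, threshold=ALERT_THRESHOLD):
    """Split findings into automatic alerts and below-threshold hunting leads."""
    alerts = [f for f in findings if f.evidence_score >= threshold]
    leads = [f for f in findings if f.evidence_score < threshold]
    return alerts, leads


findings = [
    Finding("10.0.0.5", "sandbox report shows ransomware-like behavior", 0.95),
    Finding("10.0.0.7", "DNS lookups for a rarely seen domain", 0.40),
    Finding("10.0.0.9", "login outside usual hours", 0.30),
]
alerts, leads = triage(findings)
print(len(alerts), "alert(s),", len(leads), "lead(s) kept for hunting")
```

The point of the sketch is that the below-threshold findings are kept rather than discarded, so that an analyst can examine them later.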
Anomaly detection works under two assumptions: that malicious activity will result in anomalies in some event stream, and that anomalies in an event stream are caused by malicious activity. Unfortunately, in the real world, both assumptions are sometimes incorrect, and anomaly detection has been plagued by both false negatives (because malicious activity does not always generate anomalies) and false positives (because benign activity is sometimes anomalous).
Even though pure anomaly detection might not work in the general case, it can still provide hints about where an analyst might look more deeply to make connections between seemingly unrelated events. This is a new approach that, instead of providing machine-based, automated detection, supports human-centered analysis of interesting events. In a nutshell, the analyst moves from being a trapper to being a hunter.
A hunting system provides a series of observations that are either anomalous per se (according to a pre-established model), or anomalous when put in the context of the historical behavior of a network or a user. The following examples illustrate the observations produced by a hunting system:
- Traffic to a specific host, network, or country: A specific host might have an established profile in which its exchanges with external hosts consist of downloads, while its only outgoing data goes to internal hosts (e.g., storage systems). The hunting system generates an observation when this behavior changes and, suddenly, an unusually large amount of data is uploaded to an external host.
- Session timing and duration: Many users have very predictable patterns of logins and logouts. The hunting system learns the patterns of system usage and generates an observation when it detects, for example, a late-night login by a user who has had a reliable 9:00 am to 5:00 pm work pattern.
- Use of particular tools, applications, and protocols: Users can be characterized by the toolset they rely on to perform their everyday tasks. For example, employees in the accounting department rarely use compilers. As another example, an observation might be generated if a user who has never used a remote desktop protocol starts opening remote sessions to internal hosts.
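A minimal sketch of how such observations might be produced from a per-user historical profile follows. The `UserProfile` class and its set-based model are assumptions made for illustration; a real hunting system would need a far richer model (for example, one that tolerates gradual drift in working hours).

```python
class UserProfile:
    """Historical behavior for one user (a deliberately simple, hypothetical model)."""

    def __init__(self):
        self.login_hours = set()  # hours of day (0-23) in which logins were seen
        self.tools = set()        # tools or protocols the user has used before

    def learn(self, hour, tool):
        """Record one historical event."""
        self.login_hours.add(hour)
        self.tools.add(tool)

    def observe(self, user, hour, tool):
        """Return human-readable observations for one new event."""
        observations = []
        if hour not in self.login_hours:
            observations.append(
                f"{user}: login at {hour:02d}:00 is outside the learned hours"
            )
        if tool not in self.tools:
            observations.append(f"{user}: first use of {tool}")
        return observations


profile = UserProfile()
for hour in range(9, 17):           # a reliable daytime work pattern
    profile.learn(hour, "spreadsheet")

for line in profile.observe("alice", 23, "remote-desktop"):
    print(line)
```

Note that an observation here is only a statement that behavior changed; it carries no verdict about whether the change is malicious.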
The important aspect of these examples is that none of the resulting observations is necessarily the result of a compromise. A user might have been tasked with performing a remote backup to a cloud-based service, resulting in a large amount of data being transferred to an outside host. The same user could be under deadline pressure and working off-hours, or might start using new tools to improve productivity. All of these would still produce network traffic anomalies, and a trapping system that alerted on them would generate false positives.
However, a hunting system gives a human analyst the ability to analyze, sort, connect, correlate, and expand these observations. A human analyst might recognize that a sudden spike in upload traffic, correlated with an unusual session time for a user in a department that is notorious for not working late hours, is enough to warrant an in-depth investigation. Or, the appearance of chains of remote desktop connections (from host A to host B to host C), a pattern that has never been observed before, together with an anomalous number of failed accesses to a shared file system, might be worth the attention of the hunter, as it could be evidence that an intruder has gained access to the system and is poking around, trying to get a larger foothold in the network.
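One simple way to support this kind of correlation is to group observations by user or host and surface the entities that show several different kinds of anomaly close together in time. The sketch below assumes observations arrive as `(entity, kind, timestamp)` tuples; the tuple layout, the `correlate` function, the kind names, and the one-hour window are all hypothetical choices for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate(observations, window=timedelta(hours=1), min_kinds=2):
    """Return entities showing at least min_kinds distinct observation kinds within window."""
    by_entity = defaultdict(list)
    for entity, kind, when in observations:
        by_entity[entity].append((when, kind))

    leads = {}
    for entity, items in by_entity.items():
        items.sort()  # order by timestamp
        for i, (start, _) in enumerate(items):
            # Distinct kinds seen from this observation until the window closes.
            kinds = {k for t, k in items[i:] if t - start <= window}
            if len(kinds) >= min_kinds:
                leads[entity] = sorted(kinds)
                break
    return leads


observations = [
    ("alice", "upload_spike", datetime(2024, 1, 8, 23, 10)),
    ("alice", "off_hours_login", datetime(2024, 1, 8, 22, 50)),
    ("bob", "new_tool", datetime(2024, 1, 8, 10, 0)),
]
print(correlate(observations))
```

The output is a list of leads for the analyst to investigate, not a set of alerts; the judgment about whether the combination matters stays with the human.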
Hunting Tool Features
Fundamentally, the hunting tool does five things:
- Collects various event streams (login records, DNS resolutions, Netflow data, etc.)
- Models each type of event stream and each involved element (user, host, network)
- Reports unusual activity observed in the event streams, based on the established model
- Presents the observations in different ways, to highlight connections and support sophisticated analysis
- Expands the analysis by retrieving additional information about the network and its hosts (e.g., by using system like queries) to augment the current observations
NOTE: In this list, we have not added the ability to respond (e.g., by isolating a suspicious node or by disabling a user account), as this functionality is usually already provided by existing toolsets. In any case, it would be trivial to integrate the hunting tool with IT and network administration tools.
It is clear that while the “Collects” and “Presents” functionality is less difficult to design, the “Models” and “Reports” components are the ones that require the development of novel approaches in order to produce relevant observations that contain sufficient explanatory power (just observing that something is weird is often not enough: the system must explain why something is weird).
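To make the idea of explanatory power concrete, the sketch below attaches a reason to an observation rather than only flagging it. It uses a plain z-score against a host's upload history, which is a swapped-in textbook technique and not the modeling approach of any particular hunting tool; the function name, the megabyte units, and the threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def explain_upload(host, history_mb, today_mb, z_threshold=3.0):
    """Return an explanation string if today's upload volume is unusual, else None."""
    if len(history_mb) < 2:
        return None  # not enough history to build a baseline
    mu = mean(history_mb)
    sigma = stdev(history_mb)
    if sigma == 0:
        return None  # flat history: the z-score is undefined in this simple sketch
    z = (today_mb - mu) / sigma
    if z < z_threshold:
        return None
    return (
        f"{host} uploaded {today_mb:.0f} MB to external hosts; "
        f"its baseline is {mu:.0f} MB/day (z-score {z:.1f})"
    )


history = [10, 12, 11, 9, 13]  # hypothetical daily external uploads in MB
print(explain_upload("workstation-17", history, 500))
```

The returned text states both what was seen and what it was compared against, which is the minimum an analyst needs to judge whether the observation deserves attention.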
Also, the target user of this tool is a sophisticated user who is able to apply their own domain knowledge about the network being protected in order to go beyond passively absorbing the output of a BDS and into the realm of investigation. Many enterprises lack the resources to dedicate to this task.
Accordingly, my recommendation is to consider using a Managed Security Service Provider (MSSP) that provides “hunting services” to its clients, opening a new approach to managed security that is not only reactive but also proactive.
A hunting tool would also appeal to CISOs and CIOs, as it can highlight situations, such as insider attacks, that are beyond the detection capability of current breach detection systems. In addition, the ability to highlight anomalous events supports network health in general and might identify issues that are not security-related but that present opportunities to improve operational efficiency.
If you’re interested in learning more, I’d be happy to provide a detailed bibliography of papers that touch on using machine learning and anomaly detection to identify attacks.