Top Technical Questions on Splunk UBA

Since the acquisition of Caspida (now Splunk UBA) in July 2015, we have been talking to many customers about user and entity behavioral analytics. Our customers have been asking how this type of threat detection product works. In this blog post, I'm going to discuss some of the most common questions, along with answers and explanations from a security researcher and practitioner's viewpoint.

 

What makes Splunk UBA unique compared to other detection technologies?

Splunk UBA uses an unsupervised, machine-learning-based approach to determine whether events generated from multiple data sources are anomalies and/or threats. This is a turnkey approach: customers do not need to train the models, and administrators do not need to develop signatures in advance in order to detect a threat.

 

What are the common use cases for Splunk UBA?

Splunk UBA can be deployed in any network to detect insider threats, malware, data exfiltration, and other types of activity across the kill chain that would be indicative of malicious behavior. In general, Security Operations Centers (SOCs) continue to handle more data sources as the enterprise grows and matures, but the team is not necessarily able to add analysts at the same rate. Most SOC environments have not truly shifted to machine-learning-based approaches today, but they need to consider this type of approach in order to scale their operations and increase their efficacy.

In Figure 1 below, I've illustrated a typical maturity strategy seen in security operations environments today. Initially, a less mature organization may rely more heavily on signature-based solutions to catch "low-hanging fruit". As its security operations capability improves, the team starts deploying more layered defenses that utilize heuristic-based detection methods in order to cope with the increase in monitored data sources.

Eventually, as security operations teams successfully defend against traditional threats, attackers start employing custom, tailored techniques that are much more difficult to detect. This is hard for security teams to reconcile, because they also have to monitor and alert on more data sources as the enterprise modernizes and potentially outsources its services to external SaaS providers. To cope, teams look to automate more of their analytic capabilities by leveraging security solutions that can detect threats across disparate log sources at scale, without requiring analysts to manually parse and interpret all of these different log formats.

 


Figure 1. Detection approaches for maturing a security operations capability over time.

 

How does this approach work with the “low and slow” types of attacks?

First of all, let's define what is meant by a "low and slow attack". This class of attacks typically involves what appears to be legitimate traffic, transmitted at very slow rates. The low volume of traffic, coupled with slow rates of transmission, improves an attacker's chance of evading traditional detection methods. Splunk UBA is capable of storing and correlating against event data for very long periods of time, which means this type of attack is less likely to succeed.
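To make the idea of correlating over a long period concrete, here is a minimal Python sketch. It is not how Splunk UBA implements detection (the product uses machine-learning models rather than fixed thresholds); the function name, the event layout of (day, user, bytes sent), and both limits are assumptions chosen purely for illustration. It flags users whose traffic stays under a per-day limit every single day, yet adds up to a large total across a long window.

```python
from collections import defaultdict


def find_low_and_slow(events, per_day_limit, window_days, window_limit):
    """Return users who never exceed per_day_limit on any single day but
    whose total over some window of window_days exceeds window_limit.

    events is an iterable of (day_index, user, bytes_out) tuples.
    All names and thresholds here are hypothetical.
    """
    # Sum outbound bytes per user per day.
    daily = defaultdict(lambda: defaultdict(int))
    for day, user, bytes_out in events:
        daily[user][day] += bytes_out

    flagged = []
    for user, days in daily.items():
        # A noisy user would already trip a simple per-day rule; skip them.
        if max(days.values()) > per_day_limit:
            continue
        # Look for any long window whose cumulative total is too high.
        for start in days:
            window_total = sum(
                total for day, total in days.items()
                if start <= day < start + window_days
            )
            if window_total > window_limit:
                flagged.append(user)
                break
    return sorted(flagged)


# Hypothetical data: mallory sends 40 bytes a day for 30 days, bob sends
# 5 bytes a day, and alice has one large burst on day 3.
events = [(day, "mallory", 40) for day in range(30)]
events += [(day, "bob", 5) for day in range(30)]
events.append((3, "alice", 500))

suspects = find_low_and_slow(
    events, per_day_limit=100, window_days=30, window_limit=1000
)
print(suspects)
```

In this made-up data set, only mallory is reported: no single day looks unusual, but 30 days of history reveal the cumulative transfer. A detector that only keeps a few days of events would never see it.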

 

How do user identities figure into Splunk UBA?

Unfortunately, Active Directory (AD) logs alone do not work very well for user identity resolution. There are currently ways to link the different user IDs that belong to the same user through a common attribute (e.g., source IP address). A file referred to as an HR file, or other identity and access management (IAM) logs, is considered foundational to the solution. These logs and files hold values like the user name, email address, SID, and NT username. Centrify Express is an example of an identity management solution that can be used to help facilitate multi-domain controller environments.
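As a rough illustration of why an HR file is so useful, the Python sketch below builds a lookup from every known identifier to a single person. The field names, the sample record, and the two helper functions are hypothetical and do not reflect the actual HR file format or resolution logic used by Splunk UBA.

```python
def build_identity_index(hr_records):
    """Map each known identifier (lowercased) to one canonical user name.

    hr_records is a list of dicts with hypothetical keys:
    user_name, email, sid, nt_username.
    """
    index = {}
    for record in hr_records:
        canonical = record["user_name"]
        for field in ("user_name", "email", "sid", "nt_username"):
            value = record.get(field)
            if value:
                index[value.lower()] = canonical
    return index


def resolve(index, identifier):
    """Return the canonical user for an identifier seen in a log, or None."""
    return index.get(identifier.lower())


# Hypothetical HR record; every value here is made up.
hr_records = [
    {
        "user_name": "Bob Smith",
        "email": "bob.smith@example.com",
        "sid": "S-1-5-21-1111-2222-3333-1001",
        "nt_username": "CORP\\bsmith",
    },
]

index = build_identity_index(hr_records)
print(resolve(index, "corp\\BSMITH"))
print(resolve(index, "bob.smith@example.com"))
print(resolve(index, "unknown-account"))
```

With a table like this, a VPN log that records an email address and a Windows event that records an NT username can both be attributed to the same person, which AD logs alone do not make easy.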

 

What are “peer groups”, and why are they important?

One use case Splunk UBA addresses is insider threat detection. There are many published papers and metrics that discuss various methodologies to detect the insider threat. One model for insider threat detection makes use of "peer groups" to identify anomalies and possible threats in the network. Peer groups can be used to categorize users by behavior, both expected and actual. For example, if Bob and Alice are both in the Finance department and have similar roles, we would expect them to have similar server access profiles. Both Bob and Alice would have a legitimate reason to access the firm's financial databases. Conversely, neither of them would have a business need to access source code repository servers within the Engineering department. This is a peer group based on expected behavior.

There are other ways to group users by actual behavior. For example, Bob and Mallory are not in the same department; Mallory is in the Engineering department. However, Bob and Mallory have similar work habits. They keep the same work hours, and after they come into work and grab coffee, they both log into their laptops, open their browsers, and navigate to CNN.com. Therefore, Bob and Mallory might be in a different peer group from Bob and Alice, and Bob is a member of both groups. By defining multiple overlapping peer groups per user, anomalies that may not be detected in one peer group may appear in other peer groups.
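The overlapping peer groups described above can be sketched in a few lines of Python. This is an illustration only, not Splunk UBA's model: the data layout, the group names, and the rule "flag anything that no other member of the group has done" are all assumptions. Each hypothetical group compares its members on one behavior: the department group compares server access, and the work-habits group compares login hours.

```python
def peer_group_anomalies(user, behavior, peer_groups):
    """Return {group_name: [values seen for user but for no peer]}.

    behavior maps user -> feature name -> set of observed values.
    peer_groups maps group name -> (set of members, feature compared).
    Both structures are hypothetical.
    """
    findings = {}
    for group_name, (members, feature) in peer_groups.items():
        if user not in members:
            continue
        # Collect everything the user's peers in this group have done.
        peer_values = set()
        for peer in members:
            if peer != user:
                peer_values |= behavior[peer][feature]
        unusual = behavior[user][feature] - peer_values
        if unusual:
            findings[group_name] = sorted(unusual)
    return findings


# Hypothetical observations: Bob from Finance touches a source repository.
behavior = {
    "bob": {"servers": {"finance-db", "source-repo"}, "login_hours": {9}},
    "alice": {"servers": {"finance-db"}, "login_hours": {7}},
    "mallory": {"servers": {"source-repo"}, "login_hours": {9}},
}
peer_groups = {
    "finance-department": ({"bob", "alice"}, "servers"),
    "work-habits": ({"bob", "mallory"}, "login_hours"),
}

bob_findings = peer_group_anomalies("bob", behavior, peer_groups)
print(bob_findings)
```

Here Bob's login hours match Mallory's, so the work-habits group sees nothing odd. His access to the source repository only stands out against the finance-department group, which is why evaluating a user against several peer groups catches more than any single group would.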

Although this blog post addresses the most common questions related to Splunk UBA, there are, of course, other questions that it does not cover. If you have additional questions, please reach out to ubainfo@splunk.com.