Teach Splunk to predict missing field values in your data! With the brand new Splunk Predict App, you can predict and fill in the values of missing fields, using training sets in which those fields are present. The app builds Naive Bayes models to predict field values. On some test sets, the model predicted values correctly 99.95% of the time or better.
If you already have the actual value of the field in question, compare the predicted value against it to determine whether the value is unexpected. Does the event’s data look like it belongs in this source of data, or is it suspicious?
Suppose you have a dataset with missing or questionable values. You can now predict the missing values based on other values. For example, in human-entered data or social media data (e.g., Twitter), imagine predicting political or demographic information based on ZIP code, first name, salary, etc. Alternatively, you may have one dataset in which a field is filled out and another in which that field is missing or sporadic.
Lastly, you can use the Predict app for sentiment analysis. For example, you can have a small training set of emails, each marked up with “angry=10” or “angry=1”, and have it learn to recognize angry emails. Angry emails can get directly routed to a manager.
This app includes four search commands:
For details on the parameters of each command, typeahead shows all the defaults. Make sure to click More in the typeahead instructions.
For example, to learn gender from names, you might train a model with:
gender=* | fields name, gender | train name2gender from gender
If you don’t limit the fields to “name” and “gender”, it will use all fields to predict gender. If you have an inkling of which fields can predict other fields, limit the fields; otherwise, don’t bother and it will figure it out.
You can then have it predict “gender” for events that don’t have a gender field:
* | guess name2gender into gender
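To make the idea behind the train and guess commands above more concrete, here is a minimal Python sketch of a categorical Naive Bayes classifier over event fields. It is not the app's actual implementation: the function names, the model layout, the add-one smoothing, and the sample events are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

def train(events, target):
    """Count how often each (field, value) pair occurs with each target value."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # label -> Counter of (field, value)
    vocab = defaultdict(set)               # field -> distinct values seen
    for event in events:
        if target not in event:
            continue  # only events that carry the target field can teach the model
        label = event[target]
        class_counts[label] += 1
        for field, value in event.items():
            if field != target:
                feature_counts[label][(field, value)] += 1
                vocab[field].add(value)
    return {"classes": class_counts, "features": feature_counts, "vocab": vocab}

def guess(model, event, target):
    """Return the target value with the highest Naive Bayes log-score."""
    total = sum(model["classes"].values())
    best_label, best_score = None, -math.inf
    for label, count in model["classes"].items():
        score = math.log(count / total)  # prior probability of this label
        for field, value in event.items():
            if field == target or field not in model["vocab"]:
                continue  # skip the target itself and fields never seen in training
            seen = model["features"][label][(field, value)]
            # Add-one smoothing (an assumption) keeps unseen values from zeroing the score.
            score += math.log((seen + 1) / (count + len(model["vocab"][field]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training events, analogous to "gender=* | fields name, gender".
training = [
    {"name": "alice", "gender": "f"},
    {"name": "bob", "gender": "m"},
    {"name": "alice", "gender": "f"},
    {"name": "carol", "gender": "f"},
    {"name": "dave", "gender": "m"},
]
name2gender = train(training, "gender")
print(guess(name2gender, {"name": "bob"}, "gender"))  # prints "m"
```

As in the search above, restricting each event to the fields you believe are predictive keeps unrelated fields from contributing to the score.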
As another example, predict the sourcetype from the _raw text of events. First, train a model:
index=_internal | train getsrctype from sourcetype
Then use that model to guess sourcetypes and compare the guesses to the real sourcetype values to measure accuracy:
index=_internal | rename sourcetype as real_sourcetype | fields real_sourcetype
| guess getsrctype into sourcetype | top sourcetype,real_sourcetype
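The top command in the search above lists the most common (guessed, real) pairs. If you export those pairs, a single accuracy figure can be computed along the following lines. This is a rough Python sketch; the helper names and the sample pairs are made up for illustration and are not part of the app.

```python
from collections import Counter

def accuracy(pairs):
    """Fraction of (guessed, real) pairs in which the guess matches the real value."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(1 for guessed, real in pairs if guessed == real)
    return correct / len(pairs)

def top_pairs(pairs):
    """Most common (guessed, real) combinations, similar in spirit to `top`."""
    return Counter(pairs).most_common()

# Hypothetical (guessed sourcetype, real sourcetype) pairs.
pairs = [
    ("splunkd", "splunkd"),
    ("splunkd", "splunkd"),
    ("splunkd", "splunk_web_access"),
    ("scheduler", "scheduler"),
]
print(accuracy(pairs))  # prints 0.75
```

Rows where the two values differ are the misclassifications, and are the ones worth inspecting first.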
----------------------------------------------------
Thanks!
David Carasso