Experimental App Helps Find Other Splunkbase Apps
I’ve recently developed a Splunk app called “splunkbase”. It looks at your Splunk installation and suggests apps on splunkbase.com relevant to your data. It analyzes your indexed data, as well as data in your file system not yet indexed. It also suggests apps based on what other Splunk users with similar installations have installed, much as Amazon suggests items to purchase based on what users similar to you have purchased.
The app is simple to run — it’s just one dashboard, with several reports that suggest apps.
Security: At no time is any of your data uploaded or forwarded on. The signatures of all free splunkbase apps are included with this app so…
SPLogger: iPhone Logging API
This week I put up on GitHub an early version of a Splunk logging API for iPhone developers, called SPLogger. We’d love feedback, code contributions, and suggestions. The SPLogger API allows iPhone developers to log events in their application and have them go to Splunk Storm (www.splunkstorm.com), which is free for up to a GB of data. If you currently have no insight into how your app is being used, or by whom, this can come in handy, and of course you’ll have the full power of SPL, Splunk’s search language.
To get the SPLogger API, download it via either method:
…
Predicting Missing Data
Teach Splunk to predict missing field values in your data! With the brand new Splunk Predict App, you can predict, and fill in, the value of missing fields in your data, using training sets that have values. This app builds Naive Bayes models to predict field values. In some test sets, the model predicted values correctly 99.95% of the time or more.
- From customers that fill out their gender, you can predict the gender of customers that have not, perhaps based on writing style, word choice, or other features.
- From events that list a host name, you can predict the host name for events that are missing it.
- From customers that explain why they
…
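The app's own implementation is not shown here. As a rough illustration of the idea, the following Python sketch trains a tiny categorical Naive Bayes model on events that have a host value and predicts the host for an event that lacks one. The field names, the sample events, the add-one smoothing, and the `train` and `predict` helpers are all assumptions made for this example, not the Predict App's code.

```python
import math
from collections import Counter, defaultdict


def train(rows, target):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
    vocab = defaultdict(set)               # feature -> distinct values seen
    for row in rows:
        label = row[target]
        class_counts[label] += 1
        for feature, value in row.items():
            if feature == target:
                continue
            feature_counts[(label, feature)][value] += 1
            vocab[feature].add(value)
    return class_counts, feature_counts, vocab


def predict(model, row):
    """Return the label with the highest smoothed log-probability."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for feature, value in row.items():
            if feature not in vocab:
                continue  # feature never seen in training, ignore it
            counts = feature_counts[(label, feature)]
            # Add-one (Laplace) smoothing so unseen values do not zero the score.
            numerator = counts[value] + 1
            denominator = sum(counts.values()) + len(vocab[feature])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training events that list a host name.
training = [
    {"host": "web01", "sourcetype": "access_log", "status": "200"},
    {"host": "web01", "sourcetype": "access_log", "status": "404"},
    {"host": "db01", "sourcetype": "mysql", "status": "ok"},
    {"host": "db01", "sourcetype": "mysql", "status": "error"},
]

model = train(training, target="host")
guess = predict(model, {"sourcetype": "mysql", "status": "ok"})
print(guess)  # db01
```

The same pattern applies to any field: train on the events where the field is present, then score the events where it is missing.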
BOOK EXCERPT: When to use “transaction” and when to use “stats”
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
There are several ways to group events with the Search Processing Language (SPL). The most common approach uses either the transaction or stats command. But when should you use transaction and when should you use stats?
The rule of thumb: If you can use stats, use stats. It’s faster than transaction, especially in a distributed environment. With that speed, however, come some limitations. You can only group events with stats if they have at least one common field value and if you require no other constraints. Typically, the raw event text is
…
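To make the stats-style grouping concrete, here is a small Python sketch (not SPL) of what grouping by a single common field amounts to: events are bucketed by one field value and reduced to aggregates, and the raw text is not carried along. The event fields, the sample data, and the `stats_by` helper are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical events; "_time" is in seconds and "_raw" is the raw text.
events = [
    {"userid": "msmith", "_time": 100, "_raw": "login ok"},
    {"userid": "msmith", "_time": 160, "_raw": "viewed cart"},
    {"userid": "jdoe", "_time": 120, "_raw": "login ok"},
]


def stats_by(events, field):
    """Group events by one common field and keep only aggregates."""
    groups = defaultdict(lambda: {"count": 0, "first": None, "last": None})
    for event in events:
        if field not in event:
            continue  # events without the common field cannot be grouped
        group = groups[event[field]]
        t = event["_time"]
        group["count"] += 1
        group["first"] = t if group["first"] is None else min(group["first"], t)
        group["last"] = t if group["last"] is None else max(group["last"], t)
    # Only the aggregates survive; the raw event text is dropped.
    return {
        key: {"count": g["count"], "duration": g["last"] - g["first"]}
        for key, g in groups.items()
    }


summary = stats_by(events, "userid")
print(summary)
```

Any extra constraint on which events belong together, beyond sharing the field value, does not fit this single-pass aggregation.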
You’re happier with fewer friends
Using the new Splunk Sentiment Analysis app, I was able to correlate how positive tweets were with how many people follow the Twitter account. It’s a slight stretch, but essentially: are you happier with more friends?
index=twitter | sentiment twitter body | chart avg(sentiment) by actor.followersCount
It seems that people with smaller circles of friends are more positive. More friends equals more negativity, up until about 75 friends. Seems like a fairly good life lesson, but take it with a grain of salt: spam Twitter accounts may skew things.
Book Excerpt: Finding Specific Transactions
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
Problem
You need to find transactions with specific field values.
Solution
A general search for all transactions might look like this:
sourcetype=email_logs | transaction userid
Suppose, however, that we want to identify just those transactions where there is an event that has the field/value pairs to=root and from=msmith. You could use this search:
sourcetype=email_logs | transaction userid | search to=root from=msmith
The problem here is that you are retrieving all events from this sourcetype (potentially billions), building up all the transactions, and then throwing 99% of the data right in to the bit…
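The excerpt is cut off before the book's own solution. One plausible way around the problem it describes, sketched here in Python rather than SPL, is to first find the user IDs that have at least one matching event and only then build transactions for those IDs. The sample events and the `matching_transactions` helper are hypothetical. The sketch shows the order of operations only, since the real saving in Splunk would come from the first step being a cheap indexed search.

```python
from collections import defaultdict


def matching_transactions(events, key, required):
    """Build transactions only for keys that could possibly match."""
    # Step 1: collect keys of events carrying any required field/value pair.
    candidates = {
        e[key]
        for e in events
        if key in e and any(e.get(f) == v for f, v in required.items())
    }
    # Step 2: group events into transactions, but only for candidate keys.
    transactions = defaultdict(list)
    for e in events:
        if e.get(key) in candidates:
            transactions[e[key]].append(e)
    # Step 3: keep transactions in which every required pair appears somewhere.
    return {
        k: evs
        for k, evs in transactions.items()
        if all(any(e.get(f) == v for e in evs) for f, v in required.items())
    }


# Hypothetical email events.
events = [
    {"userid": 1, "to": "root"},
    {"userid": 1, "from": "msmith"},
    {"userid": 2, "to": "root"},
    {"userid": 2, "from": "jdoe"},
    {"userid": 3, "to": "bob", "from": "alice"},
]

found = matching_transactions(events, "userid", {"to": "root", "from": "msmith"})
print(sorted(found))  # [1]
```

User 3 is never grouped at all, and user 2 is grouped but rejected in the final check because it has no event with from=msmith.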
Removing Duplicate Consecutive Events
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
Problem
You want to group all events with repeated occurrences of a value in order to remove noise from reports and alerts.
Solution
Suppose you have events as follows:
2012-07-22 11:45:23 code=239
2012-07-22 11:45:25 code=773
2012-07-22 11:45:26 code=-1
2012-07-22 11:45:27 code=-1
2012-07-22 11:45:28 code=-1
2012-07-22 11:45:29 code=292
2012-07-22 11:45:30 code=292
2012-07-22 11:45:32 code=-1
2012-07-22 11:45:33 code=444
2012-07-22 11:45:35 code=-1
2012-07-22 11:45:36 code=-1
Your goal is to get 7 events, one for each of the code values in a row: 239, 773, -1, 292, -1, 444, -1. You might be tempted
…
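The excerpt stops before the book's solution. To pin down the goal stated above, this Python sketch (an illustration outside Splunk, not the book's SPL) collapses each run of identical adjacent code values into one and contrasts that with a plain global de-duplication. The helper name `collapse_consecutive` is made up for this example.

```python
from itertools import groupby

# Code values from the sample events above, in time order.
codes = [239, 773, -1, -1, -1, 292, 292, -1, 444, -1, -1]


def collapse_consecutive(values):
    """Keep one value per run of identical adjacent values."""
    return [key for key, _run in groupby(values)]


collapsed = collapse_consecutive(codes)
print(collapsed)  # [239, 773, -1, 292, -1, 444, -1]

# A global de-duplication is not the same thing: it keeps only the first
# occurrence of each value and so loses the later runs of -1.
unique = list(dict.fromkeys(codes))
print(unique)  # [239, 773, -1, 292, 444]
```

The first result has the 7 values the recipe asks for, while the second has only 5.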
Transaction Searching: Unifying Field Names
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
Problem
You need to build transactions from multiple data sources that use different field names for the same identifier.
Solution
Typically, you can join transactions with common fields like:
... | transaction username
But when the username identifier is called different names (login, name, user, owner, and so on) in different data sources, you need to normalize the field names.
If sourcetype A only contains field_A and sourcetype B only contains field_B, create a new field called field_Z which is either field_A or field_B, depending on which is present in
…
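The normalization step described above can be sketched in Python (an illustration of the logic, not SPL). Each event gets a new field_Z taken from whichever of field_A or field_B is present, and events are then grouped on field_Z. The sample events and the `unify_field` helper are hypothetical.

```python
def unify_field(event, candidates, unified="field_Z"):
    """Return a copy of event with `unified` set to the first candidate present."""
    result = dict(event)
    for name in candidates:
        value = event.get(name)
        if value is not None:
            result[unified] = value
            break
    return result


# Hypothetical events from two sourcetypes with different field names.
events = [
    {"sourcetype": "A", "field_A": "msmith", "action": "login"},
    {"sourcetype": "B", "field_B": "msmith", "action": "purchase"},
    {"sourcetype": "B", "field_B": "jdoe", "action": "logout"},
]

normalized = [unify_field(e, ["field_A", "field_B"]) for e in events]

# Group actions by the unified identifier, skipping events that have neither field.
groups = {}
for e in normalized:
    if "field_Z" in e:
        groups.setdefault(e["field_Z"], []).append(e["action"])

print(groups)  # {'msmith': ['login', 'purchase'], 'jdoe': ['logout']}
```

Once every event carries the same identifier field, the grouping step no longer needs to know which source an event came from.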
Splunk Book Excerpt: Finding Metrics That Fell by 10% in an Hour
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
Problem
You want to know about metrics that have dropped by 10% in the last hour. This could mean fewer customers, fewer web page views, fewer data packets, and the like.
Solution
To see a drop over the past hour, we’ll need to look at results for at least the past two hours. We’ll look at two hours of events, calculate a separate metric for each hour, and then determine how much the metric has changed between those two hours. The
…
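The comparison described above can be sketched in Python (an illustration of the arithmetic, not the book's SPL). The sketch counts events in each of the last two hours and flags a drop of 10% or more. The metric used here is a plain event count, and the `dropped_by` helper, the timestamps, and the fixed `now` value are assumptions for the example.

```python
from datetime import datetime, timedelta


def dropped_by(timestamps, now, threshold=0.10):
    """Compare event counts for the previous hour and the most recent hour."""
    hour = timedelta(hours=1)
    previous = sum(1 for t in timestamps if now - 2 * hour <= t < now - hour)
    current = sum(1 for t in timestamps if now - hour <= t < now)
    if previous == 0:
        return previous, current, False  # no baseline to compare against
    change = (current - previous) / previous
    return previous, current, change <= -threshold


# Hypothetical data: 20 events in the earlier hour, 17 in the latest hour.
now = datetime(2012, 7, 22, 12, 0, 0)
earlier_hour = [now - timedelta(hours=2) + timedelta(minutes=i) for i in range(20)]
latest_hour = [now - timedelta(hours=1) + timedelta(minutes=i) for i in range(17)]

result = dropped_by(earlier_hour + latest_hour, now)
print(result)  # (20, 17, True)
```

Going from 20 events to 17 is a 15% drop, so it is flagged. A fall from 20 to 19 is only 5% and would not be.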
Splunk Book Excerpt: Grouping Events
EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.
Grouping Events
There are several ways to group events. The most common approach uses either the transaction or stats command. But when should you use transaction and when should you use stats?
The rule of thumb: If you can use stats, use stats. It’s faster than transaction, especially in a distributed environment. With that speed, however, come some limitations. You can only group events with stats if they have at least one common field value and if you require no other constraints. Typically, the raw event text is discarded.
Like
…










