Experimental App Helps Find Other Splunkbase Apps

I’ve recently developed a Splunk app called “splunkbase”. It looks at your Splunk installation and suggests apps on splunkbase.com relevant to your data. It analyzes your indexed data, as well as data in your file system not yet indexed. It also suggests apps based on what other Splunk users have installed at similar installations, much as Amazon suggests items to purchase based on what users similar to you have purchased.

The app is simple to run — it’s just one dashboard, with several reports that suggest apps.

Security: At no time is any of your data uploaded or forwarded on. The signatures of all free splunkbase apps are included with this app so…

» Continue reading

SPLogger: iPhone Logging API

This week I put up on GitHub an early version of a Splunk logging API for iPhone developers, called SPLogger. We’d love feedback, code contributions, and suggestions. The SPLogger API allows iPhone developers to log events in their application and have them go to Splunk Storm (www.splunkstorm.com), which is free for up to a GB of data. If you currently have no insight into how your app is being used, or by whom, this can come in handy, and of course you’ll have the full power of SPL, Splunk’s search language.

To get the SPLogger API, download it via either method:

» Continue reading

Predicting Missing Data


Teach Splunk to predict missing field values in your data! With the brand new Splunk Predict App, you can predict, and fill in, the value of missing fields in your data, using training sets that have values. This app builds Naive Bayes models to predict field values. In some test sets, the model predicted values correctly 99.95% of the time or better.

  • From customers that fill out their gender, you can predict the gender of customers that have not, perhaps based on writing style, word choice, or other features.
  • From events that list a host name, you can predict the host name for events that are missing it.
  • From customers that explain why they

» Continue reading
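
As a rough illustration of the idea in the post above, the following Python sketch trains a small categorical Naive Bayes model on events that have a host field and uses it to guess the host of an event that lacks one. It is not the Predict App's implementation; the field names (sourcetype, process), the sample rows, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train(rows, target):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (class, feature) -> Counter of values
    vocab = defaultdict(set)             # feature -> set of values seen in training
    for row in rows:
        label = row[target]
        class_counts[label] += 1
        for feature, value in row.items():
            if feature == target:
                continue
            value_counts[(label, feature)][value] += 1
            vocab[feature].add(value)
    return class_counts, value_counts, vocab

def predict(model, row):
    """Return the class with the highest log-probability under Naive Bayes."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for feature, value in row.items():
            if feature not in vocab:
                continue  # feature never seen in training: ignore it
            counts = value_counts[(label, feature)]
            # Add-one (Laplace) smoothing so unseen values do not zero out a class.
            score += math.log((counts[value] + 1) /
                              (sum(counts.values()) + len(vocab[feature])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training events that all carry a host value.
training = [
    {"sourcetype": "access_log", "process": "httpd", "host": "web1"},
    {"sourcetype": "access_log", "process": "httpd", "host": "web1"},
    {"sourcetype": "syslog", "process": "sshd", "host": "db1"},
    {"sourcetype": "syslog", "process": "mysqld", "host": "db1"},
]
model = train(training, target="host")

# An event whose host is missing; fill it in with the model's guess.
incomplete = {"sourcetype": "access_log", "process": "httpd"}
guessed = predict(model, incomplete)
print(dict(incomplete, host=guessed))
```

With so few rows the guess is only as good as the training set; a real model would need far more labeled events before its accuracy means anything.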

BOOK EXCERPT: When to use “transaction” and when to use “stats”

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

There are several ways to group events with the Search Processing Language (SPL). The most common approach uses either the transaction or stats command. But when should you use transaction and when should you use stats?

The rule of thumb: If you can use stats, use stats. It’s faster than transaction, especially in a distributed environment. With that speed, however, come some limitations. You can only group events with stats if they have at least one common field value and if you require no other constraints. Typically, the raw event text is

» Continue reading

You’re happier with fewer friends

Using the new Splunk Sentiment Analysis app, I was able to correlate how positive tweets are with how many people follow a Twitter account. It’s a slight stretch, but essentially, are you happier with more friends?

index=twitter | sentiment twitter body | chart avg(sentiment) by actor.followersCount

It seems that people with smaller circles of friends are more positive. More friends equals more negativity, up until about 75 friends. Seems like a fairly good life lesson, but take it with a grain of salt: spam Twitter accounts may skew things.

» Continue reading

Book Excerpt: Finding Specific Transactions

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

Problem

You need to find transactions with specific field values.

Solution

A general search for all transactions might look like this:

          sourcetype=email_logs | transaction userid

Suppose, however, that we want to identify just those transactions where there is an event that has the field/value pairs to=root and from=msmith. You could use this search:

   sourcetype=email_logs
   | transaction userid
   | search to=root from=msmith

The problem here is that you are retrieving all events from this sourcetype (potentially billions), building up all the transactions, and then throwing 99% of the data right into the bit…

» Continue reading

Removing Duplicate Consecutive Events

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

Problem

You want to group all events with repeated occurrences of a value in order to remove noise from reports and alerts.

Solution

Suppose you have events as follows:

          2012-07-22 11:45:23 code=239
          2012-07-22 11:45:25 code=773
          2012-07-22 11:45:26 code=-1
          2012-07-22 11:45:27 code=-1
          2012-07-22 11:45:28 code=-1
          2012-07-22 11:45:29 code=292
          2012-07-22 11:45:30 code=292
          2012-07-22 11:45:32 code=-1
          2012-07-22 11:45:33 code=444
          2012-07-22 11:45:35 code=-1
          2012-07-22 11:45:36 code=-1

Your goal is to get 7 events, one for each run of consecutive identical code values: 239, 773, -1, 292, -1, 444, -1. You might be tempted

» Continue reading
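
To make the goal in the excerpt above concrete, here is a short Python sketch that collapses each run of consecutive events sharing a code value into a single event. It only illustrates the expected result outside Splunk and is not the book's SPL solution; the function name collapse_consecutive and the (timestamp, code) tuple layout are assumptions made for the example.

```python
from itertools import groupby

# The sample events from the excerpt, as (timestamp, code) pairs.
events = [
    ("2012-07-22 11:45:23", 239),
    ("2012-07-22 11:45:25", 773),
    ("2012-07-22 11:45:26", -1),
    ("2012-07-22 11:45:27", -1),
    ("2012-07-22 11:45:28", -1),
    ("2012-07-22 11:45:29", 292),
    ("2012-07-22 11:45:30", 292),
    ("2012-07-22 11:45:32", -1),
    ("2012-07-22 11:45:33", 444),
    ("2012-07-22 11:45:35", -1),
    ("2012-07-22 11:45:36", -1),
]

def collapse_consecutive(events):
    """Keep the first event of each run of consecutive events with the same code."""
    return [next(run) for _, run in groupby(events, key=lambda e: e[1])]

collapsed = collapse_consecutive(events)
codes = [code for _, code in collapsed]
print(len(collapsed), codes)
```

Note that -1 appears three times in the result: only adjacent repeats are merged, so a value that returns after a different value starts a new run.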

Transaction Searching: Unifying Field Names

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

Problem

You need to build transactions from multiple data sources that use different field names for the same identifier.

Solution

Typically, you can join transactions with common fields like:

          ... | transaction username

But when the username identifier goes by different names (login, name, user, owner, and so on) in different data sources, you need to normalize the field names.

If sourcetype A only contains field_A and sourcetype B only contains field_B, create a new field called field_Z which is either field_A or field_B, depending on which is present in

» Continue reading
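
The normalization described in the excerpt above, a new field_Z that takes its value from whichever of field_A or field_B is present, can be sketched in Python as follows. This illustrates the logic only and is not the book's SPL; the helper name unify_field and the sample values are hypothetical.

```python
def unify_field(event, candidates, target="field_Z"):
    """Return a copy of event with target set to the first candidate field present."""
    unified = dict(event)
    for name in candidates:
        if event.get(name) is not None:
            unified[target] = event[name]
            break
    return unified

# Hypothetical events from two sourcetypes that name the same identifier differently.
events = [
    {"sourcetype": "A", "field_A": "msmith"},
    {"sourcetype": "B", "field_B": "msmith"},
    {"sourcetype": "B", "field_B": "root"},
]

unified = [unify_field(e, ["field_A", "field_B"]) for e in events]
print([e["field_Z"] for e in unified])
```

Once every event carries field_Z, events from both sources can be grouped on that single field.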

Splunk Book Excerpt: Finding Metrics That Fell by 10% in an Hour

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

Problem

You want to know about metrics that have dropped by 10% in the last hour. This could mean fewer customers, fewer web page views, fewer data packets, and the like.

Solution

To see a drop over the past hour, we’ll need to look at results for at least the past two hours. We’ll look at two hours of events, calculate a separate metric for each hour, and then determine how much the metric has changed between those two hours. The

» Continue reading
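
The approach outlined in the excerpt above (take two hours of events, compute the metric per hour, compare) can be sketched in Python. This is an illustration under assumptions, not the book's search: the metric is a simple event count per key, the windows are measured back from a supplied now value, and the function name metrics_that_fell and the host names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

def metrics_that_fell(events, now, threshold=0.10):
    """Return {key: (previous, current)} for keys whose hourly count fell by at least threshold.

    events is an iterable of (timestamp, key) pairs.
    """
    mid = now - timedelta(hours=1)
    start = now - timedelta(hours=2)
    previous, current = Counter(), Counter()
    for ts, key in events:
        if start <= ts < mid:
            previous[key] += 1      # the hour before last
        elif mid <= ts < now:
            current[key] += 1       # the most recent hour
    fell = {}
    for key, before in previous.items():
        after = current[key]        # 0 if the key vanished entirely
        if (before - after) / before >= threshold:
            fell[key] = (before, after)
    return fell

now = datetime(2012, 7, 22, 12, 0, 0)
events = (
    [(now - timedelta(minutes=61 + i), "web1") for i in range(10)]  # previous hour
    + [(now - timedelta(minutes=1 + i), "web1") for i in range(7)]  # current hour
    + [(now - timedelta(minutes=61 + i), "web2") for i in range(10)]
    + [(now - timedelta(minutes=1 + i), "web2") for i in range(10)]
)

fell = metrics_that_fell(events, now)
print(fell)
```

Keys that only appear in the most recent hour are ignored here, since there is no earlier value to compare against.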

Splunk Book Excerpt: Grouping Events

EXCERPT FROM “EXPLORING SPLUNK: SEARCH PROCESSING LANGUAGE (SPL) PRIMER AND COOKBOOK”. Kindle/iPad/PDF available for free, and hardcopy available for purchase at Amazon.

Grouping Events

There are several ways to group events. The most common approach uses either the transaction or stats command. But when should you use transaction and when should you use stats?

The rule of thumb: If you can use stats, use stats. It’s faster than transaction, especially in a distributed environment. With that speed, however, come some limitations. You can only group events with stats if they have at least one common field value and if you require no other constraints. Typically, the raw event text is discarded.
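
To illustrate what grouping on a common field value looks like when the raw text is discarded, here is a small Python sketch that summarizes events per value of one field. It is an analogy for this style of grouping, not Splunk's implementation; the field names (time, userid, raw) and the helper group_by_field are assumptions made for the example.

```python
def group_by_field(events, field):
    """Summarize events per value of field; raw text is not kept, only aggregates."""
    groups = {}
    for event in events:
        if field not in event:
            continue  # events without the common field cannot be grouped this way
        g = groups.setdefault(event[field],
                              {"count": 0, "first": event["time"], "last": event["time"]})
        g["count"] += 1
        g["first"] = min(g["first"], event["time"])
        g["last"] = max(g["last"], event["time"])
    return groups

# Hypothetical events; "raw" stands in for the original event text.
events = [
    {"time": 1, "userid": "msmith", "raw": "login"},
    {"time": 5, "userid": "msmith", "raw": "logout"},
    {"time": 3, "userid": "root", "raw": "login"},
    {"time": 4, "raw": "heartbeat"},
]

summary = group_by_field(events, "userid")
print(summary)
```

The single pass with one small record per group is what makes this kind of grouping cheap; anything that needs the events themselves, or constraints beyond the shared field, falls outside it.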

Like

» Continue reading