Cannot search based on an extracted field

UPDATE: in 4.3 and later, search-time fields extracted from indexed fields work without any further configuration.

In the past couple of days I had to help people from support and professional services troubleshoot the exact same problem twice, so chances are it will be useful for you too ;)

The problem
I have set up a regex-based field extraction; let’s say the field name is MyField. When I run a search, say “sourcetype=MyEvents”, I see that the field is extracted correctly. However, when I run a search based on a value of MyField, say “sourcetype=MyEvents MyField=ValidValue”, nothing gets returned. WTF?

The solution
For the impatient, here’s how to solve this.

$SPLUNK_HOME/etc/system/local/fields.conf
[MyField]
INDEXED_VALUE =

» Continue reading

Storing encrypted credentials

Splunk 4.2 was released today, and here is your new resolution:

Build the greatest Splunk app, one that gathers data from all kinds of sources, some public and others requiring credentials, indexes it in Splunk and then does some cool things with it.

This blog post is concerned with only one small but important aspect of your great app: how to securely store user credentials yet still be able to safely access them in clear text when needed. I will split the post into four sections: getting credentials from the user, accessing them from your script, where the credentials are stored, and security implications.

Get and securely store user credentials

The best time to get user credentials for your app…

» Continue reading

Alert Throttling

NOTE: in 4.2 (released today 3/15/2011) alert suppression/throttling is supported natively by Splunk

Most Splunk users soon realize that Splunk ships with a scheduler which can be used to run searches periodically and execute some actions (send an email, generate an RSS feed, call a script, etc.) when the results of the search meet some condition. Soon after discovering this feature, many users start looking for a mechanism to throttle the alerts issued by Splunk. For example, a common use pattern for alerts is: check the health of a resource every 5 minutes and send an email alert when the resource is unhealthy, but send emails at most once an hour. As of the most recent release…
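To make the "check every 5 minutes, email at most once an hour" pattern concrete, here is a minimal Python sketch of the throttling logic. It is not how Splunk implements suppression; the `AlertThrottle` class, its `should_send` method and the simulated timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta


class AlertThrottle:
    """Suppress repeat alerts that fall inside a quiet period (hypothetical helper)."""

    def __init__(self, quiet_period):
        self.quiet_period = quiet_period
        self.last_sent = None  # time of the most recent alert actually sent

    def should_send(self, now, unhealthy):
        # Healthy checks never alert and do not reset the quiet period.
        if not unhealthy:
            return False
        # Unhealthy, but an alert already went out recently: suppress it.
        if self.last_sent is not None and now - self.last_sent < self.quiet_period:
            return False
        self.last_sent = now
        return True


throttle = AlertThrottle(quiet_period=timedelta(hours=1))
start = datetime(2011, 3, 15, 9, 0)
# Simulate a resource that stays unhealthy across 13 checks, 5 minutes apart.
checks = [start + timedelta(minutes=5 * i) for i in range(13)]
sent = [t for t in checks if throttle.should_send(t, unhealthy=True)]
for t in sent:
    print("alert sent at", t.strftime("%H:%M"))
```

Of the 13 unhealthy checks, only the first one and the one exactly an hour later produce an email; everything in between is suppressed.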

» Continue reading

Delimiter based KV extraction – advanced

If you’ve read my previous post on delimiter based KV extraction, you might be wondering whether you could do more with it (Anonymous Coward did). Well, yes you can, and I am going to cover the “advanced” cases here. Before covering the capabilities, as in other posts, I will first go over some observations and examples.

Observations

  1. Header-body. Some applications, for different reasons, choose to format their log files using a header and a body section. The header usually describes the way the fields are organized in each logged event, while the body consists of logged events, usually one per line, with field values delimited as described in the header. W3C, CSV, etc. come to mind; see examples
  2. Single-delimiter.
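As a rough illustration of the header-body observation, the Python sketch below reads field names from a W3C-style `#Fields:` header line and applies them to the body lines that follow. This is only a sketch, not Splunk's extraction code; `parse_header_body` and the sample log lines are invented for the example.

```python
def parse_header_body(lines, header_prefix="#Fields:", delim=None):
    """Yield one dict per body line, keyed by the names in the header line."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(header_prefix):
            # The header names the fields, in order, for the events that follow.
            names = line[len(header_prefix):].split(delim)
            continue
        if line.startswith("#") or not names:
            continue  # other directives, or body lines seen before any header
        # delim=None splits on runs of whitespace, as in the sample below.
        yield dict(zip(names, line.split(delim)))


# Made-up sample: one ignored directive, one header, two body events.
log = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method",
    "2011-03-15 09:00:01 10.0.0.1 GET",
    "2011-03-15 09:00:02 10.0.0.2 POST",
]
events = list(parse_header_body(log))
print(events[0]["c-ip"], events[1]["cs-method"])
```

Passing `header_prefix=""` is not supported by this sketch; for a CSV file whose first line is the header, the standard library's `csv.DictReader` already does the same job.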

» Continue reading

Delimiter based key-value pair extraction

As described in my previous post, key-value pair extraction (or more generally structure extraction) is a crucial first step to further data analysis. While automatic extraction is highly desirable, we believe empowering our users with tools to apply their domain knowledge is equally important. To this end, this post introduces one of the simplest forms of key-value pair extraction (KV-extraction) – delimiter based extraction.

Observation

Most logged events contain a list of key-value pairs (e.g. attribute list, method call values, etc.) in a context-dependent, well-defined format. An example of a well-defined format: “key-value pairs are separated from each other using ‘;’ while the key is separated from the value using ‘=’”. More generally, well defined attribute…
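The example format above (pairs separated by ‘;’, key separated from value by ‘=’) can be sketched in a few lines of Python. This only illustrates the idea and is not Splunk's implementation; `extract_kv` and the sample event are hypothetical.

```python
def extract_kv(text, pair_delim=";", kv_delim="="):
    """Split text into pairs, then split each pair into a key and a value."""
    fields = {}
    for pair in text.split(pair_delim):
        # Skip fragments with no key-value delimiter (e.g. after a trailing ';').
        if kv_delim not in pair:
            continue
        key, value = pair.split(kv_delim, 1)  # split once so values may contain '='
        fields[key.strip()] = value.strip()
    return fields


# Made-up event using ';' between pairs and '=' between key and value.
event = "user=alice; action=login; status=ok;"
print(extract_kv(event))
```

Both delimiters are parameters, so the same function handles, say, `extract_kv("a:1,b:2", pair_delim=",", kv_delim=":")`.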

» Continue reading

Key-value pair extraction definition, examples and solutions….

Most of the time, logs contain data that humans can easily recognize as either completely or semi-structured information. Being able to extract structure from log data is a necessary first step toward further, more interesting analysis. While it would be great to be able to automatically extract the structure from all log data, Splunk cannot rival the brain’s performance at this time; however, it is able to tap into your brain for help :) Read on…

Problem definition:

Extract structured information (in key/field=value form) from un/semi-structured log data. Note: for the purpose of this post, key and field are used interchangeably to denote a variable name.

Problem examples:

Splunk debug…

» Continue reading