david: tech

Anomalies: How to find what you’re looking for, without looking for it

Very often you want to find “problems” in your IT data, but you don’t know what to look for. How can you find these problems with Splunk?

In Splunk’s new search language, there are several search operators that can help you. I’ll describe only a subset of what is possible.

  • 1) You can search for unexpected events by looking at those that do not cluster into large groups. For example, you can cluster the errors in the last hour and report on the events that belong to the smallest clusters (e.g., ‘error | cluster showcount=true | sort - cluster_count | head 5’).
  • 2) You can find unexpected events by finding values that lie many standard deviations from the mean. For example, you can search for sendmail events with anomalous ‘delay’ values (e.g., ‘sourcetype=sendmail_syslog | anomalousvalue delay action=filter pthresh=0.02’).
  • 3) You can use machine learning to find events that have unexpected values given their historical context (e.g., ‘* | anomalies blacklist=boringevents’).
  • 4) This one is a bit of a hand-wave, but graphical reports often make anomalies visually obvious. For example, you could create a timechart of average cpu_seconds by host and see problems at a glance (e.g., ‘sourcetype=top | timechart avg(cpu_seconds) by host’).
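
To make the first two ideas concrete outside of Splunk, here is a minimal Python sketch. It is not how the cluster or anomalousvalue operators work internally; the function names, the digit-masking "signature", and the threshold of 2 standard deviations are all assumptions chosen for illustration.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def smallest_groups(events, n=5):
    """Group events by a crude signature and return the n rarest groups.

    Assumption: two events are "similar" if they are identical once
    every run of digits is masked. Real clustering is far smarter.
    """
    counts = Counter(re.sub(r"\d+", "N", event) for event in events)
    # Sort ascending by count so the rarest signatures come first.
    return sorted(counts.items(), key=lambda item: item[1])[:n]


def far_from_mean(values, threshold=2.0):
    """Return the values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    events = [
        "error disk 1 full",
        "error disk 2 full",
        "error disk 3 full",
        "error fan failed",
    ]
    print(smallest_groups(events, n=1))

    delays = [1, 1, 1, 1, 1, 1, 1, 1, 1, 50]
    print(far_from_mean(delays))
```

The point of both helpers is the same as the searches above: you never say what the problem looks like, only that it is rare or far from typical.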

Write your own search language

Splunk provides many powerful search commands, such as sort, fields, and transaction, but even better, it lets you extend the language any way you want by writing your own search commands.

I’ll show you how to write your own search command.

Simple Transactions

In this post, I’ll show you how to use Splunk’s Transaction search, with several powerful examples.

O’Rly?

Below are a few easter egg features found inside Splunk.

  • From the commandline: “splunk ftw” produces an ascii-art “O’Rly?”.
  • From the commandline: “outputrawr” produces ascii-art fireworks.
  • From the searchbox, piping results to the “marklar” processor (e.g., “*|marklar”) converts all search results into the Marklarian language.
  • From the searchbox, piping results to the “loglady” processor (e.g., “*|loglady”) converts all the search results into quotes from Twin Peaks’s Log Lady.

Enjoy them while they last, before they are removed by the Silliness Police, who%$($%%$
^H^H^H^NO CARRIER

Tutorial: Event Types in 3.2

Hi, I’m David Carasso. Perhaps you’ve seen my famous File Classifier Video. It’s the number one video at CurrentTV.

Below is a second screen capture video that I just made to describe Splunk’s new Event Typer. The Event Typer dynamically tags system events in custom yet universal ways. For example, I can say that any event that happens on Sunday, has “status=Fatal”, and has “sourcetype=weblogic” should be dynamically tagged as a “weekend_fatal_weblogic” event. Topics covered include: what is an event type; how to search, view, and count event types; creating an event type; creating an event-type template; and discovering event types.
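
If it helps to think of an event type as a named rule over an event's fields, the example above can be sketched in Python. This is only an illustration, not the Event Typer's implementation; the event dictionary layout and the function names are assumptions.

```python
from datetime import datetime


def is_weekend_fatal_weblogic(event):
    """True if the event happened on a Sunday with status=Fatal and sourcetype=weblogic."""
    return (
        event["timestamp"].weekday() == 6  # Monday is 0, Sunday is 6
        and event.get("status") == "Fatal"
        and event.get("sourcetype") == "weblogic"
    )


def tag_events(events, event_types):
    """Attach to each event the names of every event type whose rule matches it."""
    for event in events:
        event["eventtypes"] = [
            name for name, rule in event_types.items() if rule(event)
        ]
    return events


if __name__ == "__main__":
    event_types = {"weekend_fatal_weblogic": is_weekend_fatal_weblogic}
    events = [
        # 6 January 2008 was a Sunday; 7 January 2008 was a Monday.
        {"timestamp": datetime(2008, 1, 6, 3, 0), "status": "Fatal", "sourcetype": "weblogic"},
        {"timestamp": datetime(2008, 1, 7, 3, 0), "status": "Fatal", "sourcetype": "weblogic"},
    ]
    for event in tag_events(events, event_types):
        print(event["timestamp"].date(), event["eventtypes"])
```

Because the tag is computed from the rule each time, adding a new event type immediately applies to old events as well as new ones, which is the "dynamic" part.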

Yes, production value is what you’ve come to expect from a Carasso Production. That’s right: 15 minutes of unscripted nerd talk. Now with a bonus 45 seconds of video as I type in an off-camera window. But I promise you’ll learn a few useful things you didn’t know.
EventTyperVideo (15 minutes of emacs magic)

Tutorial: File Classifier

Hi, I’m David Carasso and below is a screen capture video I just made to describe Splunk’s File Classifier. The File Classifier takes a file and tells you what type it is. From that sourcetype we determine what to do with the file and how to process it. It’s pretty critical for properly handling a file, including time-stamping events and aggregating multiple lines into single events. There are several methods that the File Classifier uses to classify a file, and we’ll cover each one with real-world examples.

Yes, production value is at a new low here as I talk unscripted for 18 minutes, but I promise you’ll learn a few useful things you didn’t know. There’s a free Splunk t-shirt for the commenter who guesses the actual number of times I say “uhhhhh”.

File ClassifierVideo (18 minutes of action packed emacs video)

Semi-Automatic Discovery of Extraction Patterns for Log Analysis

Here’s a paper I recently wrote on some of the automatic field extraction we’re doing with Splunk.

Abstract
This paper presents an interactive bootstrapping process used in Splunk that automatically learns to extract fields from log events. End users simply select one or more example values of a field and a learning process discovers additional instances, along with the patterns to extract them. The user is able to correct the instances and save the extraction patterns. Immediately afterward, while searching log events the newly-taught fields will be extracted from the event’s raw text.
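
As a rough feel for the idea, and explicitly not the paper's algorithm, here is a toy Python sketch: the user points at one example value, a pattern is built from the text just before it, and that pattern is then tried on other events. The function names and the choice of "a few characters of left context plus a run of non-space characters" are assumptions made for illustration.

```python
import re


def learn_pattern(event, example_value, context_chars=10):
    """Build a regex from the text immediately before the user's example value."""
    start = event.index(example_value)
    prefix = event[max(0, start - context_chars):start]
    # Keep the literal prefix as context and capture the non-space run after it.
    return re.compile(re.escape(prefix) + r"(\S+)")


def extract(pattern, events):
    """Apply a learned pattern to each event; None where it does not match."""
    values = []
    for event in events:
        match = pattern.search(event)
        values.append(match.group(1) if match else None)
    return values


if __name__ == "__main__":
    events = [
        "Jan 1 login user=alice ip=10.0.0.1",
        "Jan 2 login user=bob ip=10.0.0.2",
        "Jan 3 restart",
    ]
    # The user selects "alice" in the first event as an example of the field.
    pattern = learn_pattern(events[0], "alice", context_chars=5)
    print(pattern.pattern)
    print(extract(pattern, events))
```

In an interactive loop, the user would then reject wrong instances or add more examples, and the pattern would be relearned before being saved.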

Click here to read full paper

Feedback appreciated.