Facebook, privacy and IT data

Facebook is getting a lot of flak in the press (most recently in The Register) over reports on a gossip blog of some pretty serious privacy holes:

1. Anyone who works there can look at anyone’s private profile.

2. Anyone who works there can look at logs of which other profiles any user has viewed.

If Facebook wants to turn their act around, or any other social networking site wants to avoid being in their position, they’d better pay attention to some best practices around securing and reviewing IT data.

Here’s what best practice would say about Facebook’s two problems.

The first problem – anyone can look at any customer’s data – is classically the kind of thing that has brought on regulations in other industries, such…

» Continue reading

Tutorial: Event Types in 3.2

Hi, I’m David Carasso. Perhaps you’ve seen my famous File Classifier Video. It’s the number one video at CurrentTV.

Below is a second screen capture video that I just made to describe Splunk’s new Event Typer. The Event Typer dynamically tags system events in custom yet universal ways. For example, I can say that any event that happens on a Sunday, has “status=Fatal”, and has “sourcetype=weblogic” should be dynamically tagged as a “weekend_fatal_weblogic” event. Topics covered include: what an event type is; how to search, view, and count event types; creating an event type; creating an event-type template; and discovering event types.
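To make the idea concrete, here is a minimal Python sketch of rule-based event tagging. It is not how the Event Typer is implemented and it is not Splunk syntax; the `EVENT_TYPES` table, the `tag_event` helper, and the sample event are all made up for illustration.

```python
from datetime import datetime

# Hypothetical event-type definitions: name -> predicate over a parsed event.
EVENT_TYPES = {
    "weekend_fatal_weblogic": lambda e: (
        e["time"].weekday() == 6           # Monday is 0, so Sunday is 6
        and e.get("status") == "Fatal"
        and e.get("sourcetype") == "weblogic"
    ),
}

def tag_event(event):
    """Return the names of all event types whose predicate matches the event."""
    return [name for name, matches in EVENT_TYPES.items() if matches(event)]

# Made-up sample event, already parsed into fields.
event = {
    "time": datetime(2007, 11, 4, 3, 15),  # 4 November 2007 was a Sunday
    "status": "Fatal",
    "sourcetype": "weblogic",
}
print(tag_event(event))
```

Because the tags are computed from the rules whenever an event is looked at, adding a new rule later applies to old events too, which is roughly what "dynamically tagged" means here.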

Yes, production value is what you’ve come to expect from a Carasso Production. That’s right, 15 minutes of unscripted…

» Continue reading
Dev:

Stupid Perforce Trick #1

We use Perforce at Splunk, and it’s worked out pretty well for us. I’m a CVS admin at heart, and I know there’s some SVN sentiment, but p4 gives us a nice mix of atomic commits, attractive GUI and command-line tools, and someone to call for help if it ever completely eats itself.

Over time I’ve compiled a small library of scripts for various p4 functions that have been written time and again at different sites…mergetool is one of them. This little tool accepts a merge target (“yours” in p4-speak) and projectile (“theirs” in p4), labels both, performs an integrate, and performs a “safe” resolve -as. It logs any failures for you to resolve by hand, or submits the change set if…

» Continue reading

Tutorial: File Classifier

Hi, I’m David Carasso, and below is a screen capture video I just made to describe Splunk’s File Classifier. The File Classifier takes a file and tells you what type it is. From that sourcetype we determine what to do with the file and how to process it. It’s pretty critical for properly handling a file, including time-stamping events and aggregating multiple lines into single events. There are several methods that the File Classifier uses to classify a file, and we’ll cover each one with real-world examples.

Yes, production value is at a new low here as I cover 18 minutes unscripted, but I promise you’ll learn a few useful things you didn’t know. There’s a free Splunk t-shirt for the…

» Continue reading

Don’t forget to index your config files!

Don’t forget to index your config files!
Why?
Because Splunk is a great way to track changes and see differences in your configs.
For most troubleshooting and compliance situations, having a historical record of all your configurations goes hand in hand with the log data. They are two sides of the same coin.

The cool thing is that it takes just a few seconds to get up and running. If you have Splunk installed, it’s all but free to index your configs: they are small compared to log files. Even if you indexed all the configs in a 2,000-machine deployment, it would not come close to the volume of even a small proxy log.
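As a rough illustration of why a history of configs is handy, here is a small Python sketch that diffs two snapshots of a config file using the standard `difflib` module. This is not what Splunk does internally; the `config_diff` helper, the file name, and the config contents are invented for the example.

```python
import difflib

def config_diff(old_text, new_text, name="example.conf"):
    """Return unified-diff lines between two snapshots of a config file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=name + " (before)",
        tofile=name + " (after)",
    ))

# Two made-up snapshots of the same file, taken at different times.
before = "MaxClients 150\nKeepAlive On\n"
after = "MaxClients 300\nKeepAlive On\n"
print("".join(config_diff(before, after)))
```

With every version of a config on hand, answering "what changed just before things broke?" comes down to comparing the two snapshots on either side of the incident.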

30 second refresher:
Just tail /etc you…

» Continue reading
Dev:

JavaScript Hybrids (Extending the browser) – Part 1

I deeply enjoy browser programming, however sometimes I wish it could do more. Things like sockets, streams, audio and improved file system handling would be a real treat. Man would it be fresh if I had access to this functionality in JavaScript.

Now this is going to sound pretty circa ’98, but several mainstream browser plugins support a JavaScript communication layer. According to the Millward Brown survey, plugin installations of Flash (99%) and Java (85%) are pretty ubiquitous.

Flash/JavaScript Communication
The Flash ExternalInterface class enables communication between JavaScript and the Flash Player. ExternalInterface works with ActionScript 1.0 and later but was first introduced in Flash Player 8, so Flash Player 8 is the minimum plugin version required.
From JavaScript

  • Call an ActionScript function
  • Pass arguments
  • Return a value to the JavaScript caller

From ActionScript

  • Call a JavaScript…
» Continue reading

Being the girl in dev at Splunk

Like a lot of tech companies, Splunk’s development organization isn’t a model of perfect gender balance. For a year and a half now, I’ve been the only woman in the dev organization.

Surprisingly, this is not an uncomfortable place to be. In 11 years in industry I’ve worked in a variety of organizations: the now-bankrupt dot-com best known for putting an ad with a naked guy up during the Super Bowl, 2 major marquee names with vastly differing corporate cultures, a security start-up stocked with emancipated-minor hackers. Aside from that doomed dot-com — which had a surprisingly strong gender balance throughout technical roles and a culture blessedly free of gender-based intimidation at all levels — Splunk may be the most…

» Continue reading
Dev:

Semi-Automatic Discovery of Extraction Patterns for Log Analysis

Here’s a paper I recently wrote on some of the automatic field extraction we’re doing with Splunk.

Abstract
This paper presents an interactive bootstrapping process used in Splunk that automatically learns to extract fields from log events. End users simply select one or more example values of a field, and a learning process discovers additional instances, along with the patterns to extract them. The user is able to correct the instances and save the extraction patterns. Immediately afterward, while searching log events, the newly taught fields will be extracted from each event’s raw text.
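For a feel of the example-driven idea, here is a deliberately naive Python sketch: it takes one example value, uses the few characters before it as context, and turns that into a regular expression that finds further instances. This is not the algorithm from the paper; the `learn_pattern` and `extract` helpers, the fixed five-character context, and the log lines are all assumptions made for illustration.

```python
import re

def learn_pattern(event, example_value, context_chars=5):
    """Build a regex from the text just before an example value in one event."""
    start = event.index(example_value)
    prefix = event[max(0, start - context_chars):start]
    # Capture the run of non-space characters that follows the same prefix.
    return re.compile(re.escape(prefix) + r"(\S+)")

def extract(pattern, events):
    """Apply a learned pattern to events and collect the captured values."""
    found = []
    for event in events:
        match = pattern.search(event)
        if match:
            found.append(match.group(1))
    return found

# Made-up log lines; the user "selects" the value alice in the first one.
events = [
    "Oct 12 10:01:22 sshd: login user=alice from 10.0.0.5",
    "Oct 12 10:03:40 sshd: login user=bob from 10.0.0.9",
    "Oct 12 10:07:13 sshd: logout user=carol",
]
pattern = learn_pattern(events[0], "alice")
print(extract(pattern, events))
```

A real bootstrapping loop would go further: it would show the discovered instances to the user, accept corrections, and refine the pattern before saving it.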

Click here to read full paper

Feedback appreciated.

» Continue reading

Trekking in the Galapagos

The Splunk cozy has been to a few countries around the world. This month, I took it to the Galapagos, and decided to leave it there at Post Office Bay amongst all the other plaques and memorabilia. I think it’ll be very comfortable for a while. See the rest of my Galapagos photo gallery.

The Galapagos

» Continue reading
Dev:

Diagraming Splunk’s data-flow (part 2 – performance overlays)

In my previous post, “Diagraming Splunk’s data-flow”, I wrote a small Python script that parses Splunk’s runtime environment ($SPLUNK_HOME/var/run/splunk/composite.xml) and generates a file that, when fed to Graphviz, produces a nice architectural diagram of how pipelines and processors are wired together.

In this installment, I took it to the next level by using Splunk’s search capability to overlay performance metrics on the diagram. Two things made this possible: Splunk logs metrics information for each processor within each pipeline (thanks, Brad), and Splunk can execute a search processor written in Python. Here is how you use it:

First, download Graphviz. I particularly like the OS X application that they’ve written because you can see the graph on the screen and…

» Continue reading