Downhill Splunking (Part 1)

Splunk GPS

Last month I took some time off and hit the slopes in Jackson Hole, WY.

Yes, it was awesome. And yes, I want to be back there. However, things need to be Splunked… starting with the data I collected whilst shredding the mountain.

I used an app called Ski Tracks to collect GPS data, and used Splunk to examine it.

Getting the data

You can export all your data from the Ski Tracks app in GPX format with a ~1 second resolution. It looks like this:

<time>2015-02-21T09:44:52.031-07:00</time></trkpt>
<trkpt lat="43.60124537" lon="-110.85425156">
<ele>2758.0</ele>
<time>2015-02-21T09:44:52.959-07:00</time></trkpt>
<trkpt lat="43.6012317" lon="-110.85424801">
<ele>2767.0</ele>

Splunk handles this structure well, with the time, lat and lon fields being identified automatically.

The <ele> field can also be extracted easily using the Splunk Field Extractor.
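Outside Splunk, the same fields can be pulled from the GPX export with Python's standard library. This is just a sketch over a small, hand-built well-formed fragment; the real export wraps these entries in a namespaced <gpx> document:

```python
import xml.etree.ElementTree as ET

# A well-formed fragment of the Ski Tracks GPX export (namespace omitted for brevity).
gpx = """
<trk><trkseg>
  <trkpt lat="43.60124537" lon="-110.85425156">
    <ele>2758.0</ele>
    <time>2015-02-21T09:44:52.959-07:00</time>
  </trkpt>
  <trkpt lat="43.6012317" lon="-110.85424801">
    <ele>2767.0</ele>
  </trkpt>
</trkseg></trk>
"""

root = ET.fromstring(gpx)

# One (lat, lon, ele) tuple per track point, mirroring the fields Splunk extracts.
points = [
    (float(pt.get("lat")), float(pt.get("lon")), float(pt.findtext("ele")))
    for pt in root.iter("trkpt")
]
```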

The Haversine App

In order to make sense of the lat / lon fields we need to calculate the distance between two points given their latitude and longitude.

For example, if we wanted speed we would need to know distance (speed = distance / time).

Steven Maresca has built an SPL command, Haversine, that does just that. You can download it here.
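Under the hood, the haversine formula is short enough to sketch in a few lines of Python. This is a generic textbook implementation for illustration, not the app's actual code:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# The two consecutive trkpt samples from the GPX excerpt are roughly 1.5 metres apart.
d = haversine_km(43.60124537, -110.85425156, 43.6012317, -110.85424801)
```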

The Basics

Using Haversine we can now calculate distance. Given the format of the GPX log we can calculate this as follows:

sourcetype=skitrx source="/Users/dgreenwood/Downloads/ski/*"
| eval latlon=lat.",".lon
| transaction maxspan=1h maxevents=2
| eval first_loc=mvindex(latlon, 0)
| eval second_loc=mvindex(latlon, 1)
| where first_loc!=second_loc
| haversine originField=first_loc second_loc

In essence, we’re cheating a bit: transaction builds an aggregate of two consecutive events. We then extract the start and end locations from the multivalue field transaction generates, exclude transactions whose start and end locations are identical, and calculate the distance between them with Haversine.
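The same pairing trick can be sketched outside Splunk. Here is a plain-Python equivalent that zips each point with its successor and drops stationary pairs, using hypothetical sample points:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Hypothetical ordered (lat, lon) samples; the repeated point mimics standing still.
track = [
    (43.60124537, -110.85425156),
    (43.60124537, -110.85425156),  # stationary pair -> filtered, like the `where` clause
    (43.6012317, -110.85424801),
]

# zip pairs each point with its successor, like transaction's two-event groups.
segments = [(a, b) for a, b in zip(track, track[1:]) if a != b]
total_km = sum(haversine_km(*a, *b) for a, b in segments)
```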

Interesting Searches

Top speed

The first thing I wanted to know was, obviously, top speed.

<distance search>
| eval speed=distance/(duration/60/60)
| stats max(speed) AS maxSpeed
| eval "Max Speed kph"=round(maxSpeed,2)
| fields "Max Speed kph"

In the search above we do a simple speed calculation, and then use stats to find the max value.

Haversine calculates distance in kilometers by default, hence kilometers per hour is the output unit.
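Assuming distance in kilometers and transaction's duration in seconds, the arithmetic looks like this in Python (the per-leg values below are made up for illustration):

```python
def speed_kph(distance_km, duration_s):
    # duration is in seconds; dividing by 60/60 converts it to hours
    return distance_km / (duration_s / 60 / 60)

# Hypothetical per-transaction (distance_km, duration_s) pairs
legs = [(0.010, 1.0), (0.004, 1.0), (0.007, 1.0)]

# Mirrors `stats max(speed)` plus the final round(..., 2)
max_speed = round(max(speed_kph(d, t) for d, t in legs), 2)
```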

Keeping with speed, I then wanted to look at speed over time during the week – was I faster on a particular day?

<distance search>
| eval speed=distance/(duration/60/60)
| eval time_of_day=strftime(_time, "%H:%M")
| chart avg(speed) AS avgSpeed over time_of_day by date_wday

In this search we again calculate speed and use a chart command to overlay each day on a single 24 hour time series graph (with minute granularity) so that we can easily compare them.
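The bucketing that chart performs here can be mimicked in Python: group speeds by weekday and time of day, then average each bucket. The timestamps and speeds below are hypothetical:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, speed_kph) samples from the trip
samples = [
    (datetime(2015, 2, 21, 9, 44), 22.0),
    (datetime(2015, 2, 21, 9, 44), 26.0),
    (datetime(2015, 2, 22, 9, 44), 31.0),
]

# Bucket by (weekday, time of day), as `chart ... by date_wday` overlays each day
buckets = defaultdict(list)
for ts, speed in samples:
    buckets[(ts.strftime("%A"), ts.strftime("%H:%M"))].append(speed)

avg_speed = {key: sum(v) / len(v) for key, v in buckets.items()}
```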

More data, more searches

In the second post I’m going to tackle more searches, including:

  • the impact elevation has on speed
  • mapping routes using GPS data
  • adding weather data to look at the effect on distance travelled

Finally, I want to say a big thanks to Steven Maresca for his help to create these searches – thanks Steven!

Where are you converting meters into kilometers in your search? If the distance field is in meters, it looks like you are giving numbers in meters per hour not kilometers per hour.

March 27, 2015

Good spot! There was an error in the post – it should have read kilometers, not meters. Now fixed.

To confirm, the default unit of distance output by Haversine is the kilometer.

March 27, 2015

And Brian, if you’d like to output distance in miles rather than kilometers, Haversine will do so for you (just pass units=mi).

Steve Maresca
March 29, 2015