24 Million CDC US Birth Records and Splunk #vitalstatsviz

Births vs. Mother’s Age

The CDC – like most government bodies the world over – are starting to make more and more data publicly available to advance research.

In January the CDC published a blog post (since deleted) challenging the public to work with their Vital Stats datasets, including both birth and mortality data.

Over the coming weeks I’m going to post some of my findings (and workings) from analysing these datasets alongside other sources including weather and employment, all in Splunk.

In an optimistic mood I started with birth data.…



Splunk Instagram

They say a picture is worth 1000 words. Actually it’s far more than that.

Take an Instagram image: there is tons of useful metadata behind it, not just that tasty picture of what you had for dinner last night.

But how do you start to look at this data? I think you already know the answer to that! This post is just a quick guide showing you how to ingest and visualise Instagram data in Splunk.…


Splunk the Vote: BBC Election Debate

This post is the third in a series analysing social data about the UK General Election 2015.

BBC leaders debate

The third official debate has come and gone – this time without Cameron and Clegg. Perhaps this is why we saw the fewest tweets (179,000) collected during the debate compared to the previous two debates (216,000 & 312,000).

But how did the two absent leaders compare to the five opposition leaders who took part in the debate?

In the third part of my #SplunkTheVote series I took to Splunk to find out.…


Splunk the Vote: The ITV Leaders Debate

This post is the second in a series analysing social data about the UK General Election 2015.

Splunk the Vote ITV

Last night (02/04/2015) saw the second televised debate in the run-up to the UK general election. Unlike the first, this debate saw the leaders of all 7 main political parties take part.

As we did with the first debate, we collected a sample of tweets to answer the most important question: who won?

The Data

We collected 312,000 tweets from around 123,000 unique users – so roughly 2.5 tweets per user in the 2-hour period. Tweet volume is almost double that of the first debate; however, tweets per user is lower, indicating more people were discussing this debate on Twitter.
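The tweets-per-user figure is simply the ratio of the two counts quoted above. As a quick sanity check, here is a minimal Python sketch; the function name `tweets_per_user` is my own for illustration and is not taken from the original analysis, which was done in Splunk.

```python
def tweets_per_user(tweets: int, unique_users: int) -> float:
    """Average number of tweets per unique user (hypothetical helper)."""
    if unique_users <= 0:
        raise ValueError("unique_users must be positive")
    return tweets / unique_users


# Counts quoted in the post for the ITV debate.
itv_ratio = tweets_per_user(312_000, 123_000)
print(f"{itv_ratio:.2f} tweets per user")  # prints "2.54 tweets per user"
```

The same ratio for another debate would need that debate's unique-user count, which is not given in this excerpt.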


Splunk the Vote: Battle For Number 10 – Cameron vs. Miliband

This post is the first in a series analysing social data about the UK General Election 2015.

Splunk the Vote - Cameron v Miliband

On Thursday 7th May 2015 the UK will hold a General Election to vote for the next Prime Minister.

In the run-up to the vote there is going to be a series of pledges, appearances, and debates. Over the coming weeks I am going to be collecting data from various media sources into Splunk to provide some insight into how each of the main parties is faring.

On Thursday 26th March the campaign kicked off proper with the first leaders’ “debate” (more like interviews) with the Prime Minister, David Cameron, pitted against the Leader of the Opposition, Ed Miliband.…


Downhill Splunking (Part 1)

Splunk GPS

Last month I took some time off and hit the slopes in Jackson Hole, WY.

Yes, it was awesome. And yes, I want to be back there. However, things need to be Splunked… starting with the data I collected whilst shredding the mountain.

I used an app called Ski Tracks to collect GPS data, and used Splunk to examine it.…


analytics.usa.gov Recreated Using Splunk


Have you guys seen analytics.usa.gov?

It’s a great breakdown of web traffic to various US government sites. I’m a passionate believer in the open-data movement, and this is simply wonderful! A very big pat on the back to the US Government from the other side of the Atlantic. I’m looking at you now, Europe…

What’s more, the team that built the app have exposed API endpoints for the data that’s currently being displayed. Which – to my excitement – allows us to start playing with it in Splunk.

I wanted to show you just how easy it is to recreate the site in Splunk, and why you would want to do it in the first place.…


git commit -a -m “Splunking Github Blog”

Github Splunk Analysis

I <3 Github. Splunk <3’s Github (check out our repos here). I am told it is just a coincidence our HQ is opposite theirs.

One of the neat things about Github I am just starting to explore is their API. You can use it to do loads of things, from interrogating user activity to searching for keywords within code. I recently saw this analysis of the most popular programming languages hosted on Github and I was inspired to recreate it within Splunk.

Indexing Github data into Splunk makes it super-simple to start exploring it. In this post I wanted to show you some of my first experiments connecting Splunk to the Github API.…


#tbt: 5 of My Favourite Splunk Projects

Splunk Aircraft Monitoring

Not being one to look back at the past, I usually hate the throwback Thursday hashtag.

That said, when you take a moment to look back and see some of the things our awesome customers are doing with Splunk there are occasions where I’ll consider it acceptable – this being one.

And with this justification, here are 5 of my favourite Splunk projects.…


SMail: Splunking Your Inbox

Splunk GMail

Google sent me a nice message to start the year – “Your inbox is reaching its limit”.

Looking at my GMail inbox I have well over 70k emails, taking up just under 15GB of space. I’m interested in how this number is made up – who emails me the most, who I email, what time I’m most productive, etc.

I decided to download my GMail archive using Google Takeout to analyse the data. Here’s how I did it.…
