Splunking Social Media: Tracking Tweets
So you use Twitter and have heard Splunk can do “Big Data”. By tapping into Twitter’s API you can use Splunk to investigate the stream of tweets being generated across the globe.
The great thing about using Splunk to do this is that you have complete control of the data, which makes it incredibly flexible as to what you can build. A few basic ideas I’ve had include tracking hashtags, following specific influencers, or tracking tweets by location in real-time.
What’s more, it takes a matter of minutes before you can start analysing the wealth of data being generated. This post will show you how.…
Splunking the World Cup 2014: Real Time Match Analysis
As an Englishman I’ve been waiting months – with very high expectations – for the World Cup to come around. Reading fellow Splunker, Matt Davies’ blog post titled, “Splunking World Cup 2014. The winner will be…“, only heightened my excitement.
The tournament is now going into the second week and I’ve been starting to look at the teams, players, and tournament more closely. Which stadium holds the most people? Who’s the top scorer? Which referee hands out the most cards?
With these questions fresh in my mind I opened up Splunk and began to have a look at the huge amounts of information being streamed from the tournament. For this post I’m going to explore real-time match updates; including teams, …
Exporting Large Results Sets to CSV
You want to get data out of Splunk. So you do the search you want and create the table you want in the search app. The results are hundreds of thousands of rows, which is good. So you click on the Export button and download the results to CSV. When you open the file, you see only 50,000 rows. Is this a common problem? Not really. It’s a large enough result set that most people want to keep it in Splunk for analysis. However, there are times when such a large export is required. You really don’t want to log on to the Splunk server to get it either. So how do you proceed?
I recently bumped into this problem myself …
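One possible way around the export cap, and not necessarily the approach the full post takes, is to stream the results from Splunk's REST export endpoint (`/services/search/jobs/export`) from a remote machine. The sketch below is a minimal illustration under stated assumptions: the host name, the credentials header, and the helper names `build_export_request` and `export_to_csv` are all hypothetical, and the management port is assumed to be the default 8089.

```python
import urllib.parse
import urllib.request


def build_export_request(base_url, search, auth_header):
    """Build (but do not send) a POST to Splunk's streaming export endpoint.

    base_url and auth_header are placeholders supplied by the caller.
    """
    # Searches sent over REST are expected to start with the "search"
    # command (or a leading pipe for generating commands).
    if not search.lstrip().startswith(("search ", "|")):
        search = "search " + search
    body = urllib.parse.urlencode(
        {"search": search, "output_mode": "csv"}
    ).encode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        headers={"Authorization": auth_header},
        method="POST",
    )


def export_to_csv(request, path, chunk_size=64 * 1024):
    """Stream the response to disk in chunks so memory use stays flat.

    Note: a Splunk instance with a self-signed certificate will fail
    TLS verification here unless a suitable SSL context is supplied.
    """
    with urllib.request.urlopen(request) as resp, open(path, "wb") as out:
        while chunk := resp.read(chunk_size):
            out.write(chunk)
```

Usage would look something like `export_to_csv(build_export_request("https://splunk.example.com:8089", "index=main | table host", "Basic ..."), "results.csv")`, with the real host and credentials substituted in.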
I tend to travel quite a bit in my role at Splunk. The other day I was wondering to myself how far I had traveled in the last week, the last month, the last year. It just so happens that I am a Foursquare user, not because I like to hoard mayorships across the globe; rather, I tend to use Foursquare checkins to help me remember where I have been. Now you get where I am going with this, because “where have I been” actually means “a lot of cool location metadata” that I can have fun with.
I was looking around online for a simple tool that could hook into Foursquare to tell me how …
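To make the "how far have I traveled" question concrete, here is a minimal sketch of the arithmetic involved: summing great-circle (haversine) distances between consecutive checkins. It assumes each checkin has already been reduced to a `(latitude, longitude)` pair in degrees and sorted by time; the function names and the spherical-Earth radius are illustrative choices, not anything taken from Foursquare or Splunk.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def total_distance_km(checkins):
    """Sum the distances between consecutive (lat, lon) checkins."""
    return sum(
        haversine_km(*here, *there)
        for here, there in zip(checkins, checkins[1:])
    )
```

Restricting the input list to checkins from the last week, month, or year before calling `total_distance_km` gives the per-period totals.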
Getting data from your REST APIs into Splunk
More and more products, services and platforms these days are exposing their data and functionality via RESTful APIs.
REST really has emerged over previous architectural approaches as the de facto standard for building and exposing web APIs to enable third parties to hook into your data and functionality. It is simple, lightweight, platform independent, language interoperable and re-uses HTTP constructs. All good gravy. And of course, Splunk has its own REST API also.
The Data Potential
I see a world of data out there available via REST that can be brought into Splunk, correlated and enriched against your existing data, or used for entirely new use cases that you might conceive of once you see what is available and …
Asking Vendors to Make Log Events Accessible
In my last blog entry, I wrote about asking vendors to make their log event formats follow industry best practices. Now, if the log events reside in files or can be broadcast out on network ports, this makes it quite easy to access them with technologies such as Splunk Universal Forwarders. However, if the log events are buried deep within the application, device, or system that created them, then there is one more issue to address to get to the events, and that is having an accessible transport mechanism with examples of usage.
By transport, I obviously am not referring to some futuristic vehicle transportation.
What I am talking about is a way for one computer process to …
Talk to Splunk from WordPress
I wrote a WordPress plugin (tested for 2.5.1) that displays my most recent Google search terms in my sidebar. It was an experiment with using the Splunk REST API and the PHP SDK.
You can configure the widget from the Widgets page and it supports multiple instances with different configurations. Right now the actual search string is hardcoded because I’m doing some extra mangling to get the search terms the way I want anyway, but I’ll be adding that to the configuration options also. Eventually there will be a way to cache results so you don’t do the search each time the page is loaded.
Since there is still work to do to make it more generic, I haven’t uploaded …
More frequent alerts with CLI dispatch
The saved search scheduler that the UI uses runs into trouble when you start running a bunch of searches at the same time. It kicks off one, waits for it to return or time out, and then moves on to the next. If the searches take more than a few seconds to run, or there are dozens of them all with high frequency, it gets overloaded. One way to address this is to take advantage of the new dispatch (asynchronous search). Dispatch is what is behind the REST API search functions, and you can also get to it from the CLI with the “dispatch” command instead of the old “search.”
Old CLI search:
./splunk search "sourcetype=access_combined googlebot | stats count" -maxresults …