How to: Splunk Analytics for Hadoop on Amazon EMR.
Using Amazon EMR and Splunk Analytics for Hadoop to explore, analyze and visualize machine data
Machine data can take many forms and comes from a variety of sources: system logs, application logs, service and system metrics, sensor data and more. In this step-by-step guide, you will learn how to build a big data solution for fast, interactive analysis of data stored in Amazon S3 or Hadoop. This hands-on guide is useful for solution architects, data analysts and developers.
This guide will show you how to:
- Set up an EMR cluster
- Set up a Splunk Analytics for Hadoop node
- Connect to data in your S3 buckets
- Explore, visualize and report on your data
You will need:
- An Amazon EMR Cluster
- A Splunk Analytics for Hadoop Instance
I’m sensor-ing that the fourth industrial revolution is going to be data driven
I was lucky enough to attend the IoT World conference this week in Berlin. Everyone who is anyone in Industrial IoT and the associated software industry was present. The list of speakers included Bosch, GE and Vodafone among many others.
During the two days at the event I had a conversation with a robot (see below), I visited a pre-war ballroom and I received a cocktail from two juggling bartenders! However, the most memorable moment came during the keynote speech from Professor Wahlster, one of the founders of the Industry 4.0 movement, which is alternatively known as the fourth industrial revolution.
In simple terms, Industry 4.0 is focussed on the “smart factory”, i.e. the computerisation of manufacturing. …
Detect IoT anomalies and geospatial patterns for logistics insights
In part 1 of this blog series we spoke about how to turn sensor data into logistics insights. In this part we outline one approach for anomaly detection and enrich our sensor data with location information to discover geospatial patterns.
Anomalies? Find them with a few lines of SPL.
Anomaly detection can be tricky, and implementations vary from simple thresholding and baselining to highly sophisticated approaches based on machine learning. In this example we used the Splunk Machine Learning Toolkit to detect numeric outliers: a sliding window moves over the time series, and each value is checked against a multiple of the standard deviation to spot anomalies.
And this is what the SPL looks like:
| timechart span=1s avg(ax) as avx avg(ay) as
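Outside Splunk, the sliding-window check described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Machine Learning Toolkit's implementation: the function name, the window size, the multiplier and the sample readings are all assumptions made up for this example.

```python
from statistics import mean, stdev


def sliding_window_outliers(values, window=5, multiplier=2.0):
    """Flag each value that lies more than `multiplier` standard
    deviations from the mean of the preceding `window` values.

    Hypothetical helper for illustration only.
    """
    flags = []
    for i, value in enumerate(values):
        # Only look at the values that came before the current one.
        history = values[max(0, i - window):i]
        if len(history) < 2:
            # Not enough history to estimate a standard deviation.
            flags.append(False)
            continue
        centre = mean(history)
        spread = stdev(history)
        flags.append(abs(value - centre) > multiplier * spread)
    return flags


# Made-up acceleration readings with one obvious spike.
readings = [1.0, 1.1, 0.9, 1.0, 1.1, 5.0, 1.0, 0.9]
flags = sliding_window_outliers(readings, window=5, multiplier=3.0)
anomalies = [r for r, flagged in zip(readings, flags) if flagged]
print(anomalies)
```

Note one side effect of this simple design: once a spike enters the window, it inflates the standard deviation, so the values right after it are less likely to be flagged. A more robust statistic, such as the median, is one way to soften that.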
Turn IoT sensor data into Operational Intelligence for logistics
The Internet of Things (IoT) wave may impact businesses and industry verticals differently but with the same potential: IoT opens new doors to interesting use cases that have immediate business impact and value. Splunk has delivered Operational Intelligence and Analytics in IT and Security for years, so why not apply Operational Intelligence and Analytics to IoT?
Following the general definition of IoT, we consider an object that is connected to the internet; in our case, the data comes from a sensor which measures acceleration. One use case I want to walk through here is not new to logistics, but it is a great example to show the value in IoT. As the diagram above depicts, the globalized delivery of goods takes place …
The Big Data Campaign Trail
Today Splunk announced the results of a Public Sector IT research project, The Big Data Campaign Trail – Every Data Point Matters for Better IT Operations.
Having been in IT Operations for a couple of decades now, I found the insights very interesting because I believe agencies and educational institutions are going through a fundamental digital transformation which is bringing them to a crossroads. As an architect who works with our public sector customers daily, I want to share my thoughts on the results over a series of three blogs. This first one will highlight the major findings, while the following two will offer some insights based on my experiences that I hope you find helpful and actionable.
Writing Actionable Alerts
Is your Splunk environment spamming you? Do you have so many alerts that you no longer see through the noise? Do you fear that your Splunk is losing its purpose and value because users have no choice but to ignore it?
I’ve been there. I inherited a system like that. And what follows is an evolution of how I matured those alerts from spams to saviors.
Let it be known that Splunk does contain a number of awesome search commands to help with anomaly detection. If you enjoy what you read here, be sure to check them out since they may simplify similar efforts. http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Commandsbycategory#Find_anomalies
Stage 1: Messages of Concern
Some of the first alerts created are going to be searches …
My Splunk Origin Story
A World Without Splunk
In my pre-Splunk days, I spent significant time leading the vision for standards and automation in our company’s large distributed IBM WebSphere Network Deployment environment. Even though we used standard build tools and a mature change process, significant entropy and deviations were introduced into the environment as a product of requirements for tuning, business, infrastructure, security, and compliance.
As a result, we were unable to recognize the scope of impact when it came to security vulnerabilities or violations of third-party compliance. Even worse for us, we spent way too many staff-hours trying to replicate issues between production and quality assurance environments because we had no easy way to recognize the contributing configuration differences.
It’s a Bird, It’s a …
How’s my driving?
It was the summer of 2014. I was well into my big data addiction thanks to Splunk. I was looking for a fix anywhere: Splunk my home? Splunk my computer usage? Splunk my health? There were so many data points out there for me to Splunk but none of them would pay off like Splunking my driving…
At the time, my commute was rough. Roads with drastically changing speeds, backups at hills and merges, and ultimately way more stop and go than I could stomach. But how bad was my commute? Was I having as bad an impact on the environment as I feared? Was my fuel efficiency much worse than my quiet cruise-controlled trips between New York and Boston? …
How Brands Manage Data During the Super Bowl
You see servers and devices, apps and logs, traffic and clouds. We see data — everywhere. And with one of the world’s biggest sporting spectacles taking place just down the road from us in less than two weeks time, we thought we’d take a look at a few Splunk customers and how they’ve managed their data loads during the big game.
Coping with web traffic – Cars.com & Nissan
A few years ago, Cars.com used Splunk Enterprise to ensure its web environment could withstand the user load — and ensure a pleasant customer experience — during their Super Bowl advertisements. In years past Cars.com relied on aggregate data to determine their overall performance under Super Bowl levels of stress. …
Splunk wins “Big Data Innovation” at Computing’s Vendor Excellence Awards
It is always nice to end a working week on a high, and last Friday gave the Splunk EMEA team a great start to the weekend. We were nominated for, and won, Computing’s Vendor Excellence Award for “Big Data Innovation”. The judges commented specifically on Splunk’s ability to democratize big data so that everyone can use it.
It was a nice way to spend a Friday afternoon and there was a lot of nervous anticipation over lunch as to who was going to win the various awards.
The ceremony started with something I’d never seen before. The pre-award entertainment was 25-year-old rapper comedian Chris Turner (@ChrisPJTurner). Dressed in a very dapper suit, he explained how he was going …