Analyze Data with Hunk on Amazon EMR
In this post you will learn how to use Hunk to process data with an Amazon EMR cluster. We will go through the steps of:
- Creating a Hunk EC2 instance
- Creating an Amazon EMR cluster
- Configuring Hunk with EMR to analyze data in an S3 bucket
Create a Hunk instance on AWS EC2.
The most convenient way to create an EC2 instance with Hunk is to use the Hunk AMI directly from AWS Marketplace (https://aws.amazon.com/marketplace/pp/B00GIZK2QI). The AMI is public and free to use, although the usual EC2 hourly fees apply. It includes an installed copy of Hunk, the Hunk installer package (which will be needed later for distribution to the DataNodes), the Hadoop libraries, and Java – all in …
Two time-series, One Chart – Part Two
Following up on my last post about plotting two time-series in one chart, I would like to talk about a related, larger topic: plotting multiple time-series on a single chart using a single search. Take, for example, the case of measuring and comparing values of a certain metric over multiple time ranges that are not adjacent to each other (as opposed to the last post, where both series were adjacent: current hour vs. last hour, today vs. yesterday, etc.).
Assume that in this example the metric of interest is average(responseTime) of a particular service that you’re offering. Further, assume that we would like to measure it over the last hour and compare it to the maximum of the …
Two time-series, One Chart (and One Search)
Plotting two time-series in a single chart is a question often asked by many of our customers and Answers users. Admittedly, given the many ways to manipulate data, there are several methods to achieve this. Most of them use two searches – a main search and a subsearch with append – to pull target data over the adjacent time ranges that we’re interested in. Then, the _time field is manipulated to overlay both time graphs. While there is nothing wrong with this method, it is typically more efficient to use a single search instead.
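To make the idea of overlaying two adjacent time ranges concrete, here is a minimal Python sketch of the underlying timestamp manipulation. It is an illustration only, not a Splunk search and not the macros from this post: the function name `overlay_series`, the `(timestamp, value)` event layout, and the one-hour default span are all assumptions made for the example. It reads the events once, sorts each into a "current" or "previous" window, and shifts the previous window forward by one span so that both series share the same time axis.

```python
from datetime import datetime, timedelta


def overlay_series(events, now, span=timedelta(hours=1)):
    """Split (timestamp, value) events into two adjacent windows in one pass.

    Hypothetical helper for illustration: events in [now - span, now) are
    kept as-is; events in [now - 2*span, now - span) are shifted forward
    by `span` so they line up with the current window on a chart.
    """
    current, previous = [], []
    for ts, value in events:
        if now - span <= ts < now:
            current.append((ts, value))
        elif now - 2 * span <= ts < now - span:
            # Shift the older window forward so it overlays the current one.
            previous.append((ts + span, value))
        # Events outside both windows are ignored.
    return {"current": current, "previous": previous}


if __name__ == "__main__":
    now = datetime(2014, 1, 1, 12, 0)
    events = [
        (datetime(2014, 1, 1, 10, 30), 120),  # falls in the previous hour
        (datetime(2014, 1, 1, 11, 30), 95),   # falls in the current hour
    ]
    for name, points in overlay_series(events, now).items():
        print(name, points)
```

After the shift, both points in the usage example carry the same timestamp (11:30), which is what lets the two series be drawn on top of each other.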
I have created and am sharing three macros to facilitate this. They paint two time-series graphs by using one search while manipulating the _time …