Hunk, HDFS, and Indexes

Update 9/27/16: Hunk functionality has been incorporated into the Splunk Analytics for Hadoop Add-On and Splunk Enterprise versions 6.5 and later.

I’ve been asked a number of times why Hunk does not create a physical index the way Splunk does.

First, let me point out that your Hunk instance can search both physical and virtual indexes, allowing you to correlate data from disparate sources and stores within your farm without incurring the cost of duplication.

Now back to the question, which should really be: why can’t a physical index be created in HDFS?

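HDFS is not a POSIX-compliant filesystem. In layman’s terms, a POSIX filesystem is one whose files can be read from and written to in place, in real time; HDFS relaxes those requirements and treats files as write-once, with appends but no in-place updates. One …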


Hunk – Delivering Value to Your Business

Regardless of your title, if your job involves preparing data, stored mainly in HDFS but also in other stores, so that your end users can query and visualize it, Hunk is probably right for you. Two common themes among data officers are:
1. We are building a data lake.
2. It takes too long to prepare the data.

So, you’ve built your data lake, now what?

If you’re using one of the many point-solution tools, you must first go through an ETL process before you can gain insights from your data. This requires expertise in a programming language, imposing structure on the data, loading it into Hive or a relational database, and then using your favorite visualization tool. As the data …


Using Flume to Sink Data to Splunk

If you have ever used Splunk, you can probably come up with a number of reasons why you should use a Splunk forwarder whenever possible to send data to Splunk. To quickly illustrate some of the benefits: a Splunk forwarder keeps an internal record of where it left off when sending data, so if the Splunk indexer has to be taken offline for some reason, the forwarder can resume its task once the indexer is brought back up. Additionally, a forwarder can automatically load balance traffic across multiple Splunk indexers. There is already a Splunk blog post devoted to getting data into Splunk that highlights a forwarder’s benefits, and I encourage you to review it.
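To make the two forwarder behaviors above concrete, here is a minimal Python sketch of the general idea, not of Splunk’s actual implementation. The `Indexer` and `Forwarder` classes are hypothetical stand-ins invented for illustration; a real forwarder persists its position on disk and speaks Splunk’s own protocol, whereas this toy keeps the checkpoint in memory and simply rotates across targets in round-robin order.

```python
from itertools import cycle


class Indexer:
    """Hypothetical stand-in for an indexer (not a Splunk API)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.received = []

    def send(self, event):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.received.append(event)


class Forwarder:
    """Toy forwarder: remembers where it left off and rotates across indexers."""

    def __init__(self, indexers):
        self.indexers = indexers
        self._targets = cycle(indexers)  # round-robin load balancing
        self.checkpoint = 0              # position of the next event to send

    def forward(self, events):
        """Send events from the checkpoint onward.

        Stops at the first event that no indexer accepts and returns the
        number of events sent during this call.
        """
        sent = 0
        while self.checkpoint < len(events):
            if not self._send_to_any(events[self.checkpoint]):
                break  # every indexer is down; keep the checkpoint for later
            self.checkpoint += 1
            sent += 1
        return sent

    def _send_to_any(self, event):
        # Try each indexer at most once, starting with the next one in rotation.
        for _ in range(len(self.indexers)):
            target = next(self._targets)
            try:
                target.send(event)
                return True
            except ConnectionError:
                continue
        return False


a, b = Indexer("idx-a"), Indexer("idx-b")
fwd = Forwarder([a, b])
log = ["e1", "e2", "e3", "e4"]

# Both indexers are offline: nothing is sent and the checkpoint does not move.
a.online = b.online = False
first_pass = fwd.forward(log)

# Indexers come back: forwarding resumes from the checkpoint.
a.online = b.online = True
second_pass = fwd.forward(log)
```

In this sketch no event is lost or duplicated across the outage, because the checkpoint only advances after an indexer has accepted the event, and the events end up split between the two indexers.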

But what if using a Splunk Forwarder is …
