New in Hunk 6.2.1: Splunk Archiving & Searchable Archives!

  • Archive your existing Splunk indexer’s data with Hunk 6.2.1
  • Search archived data in place from the Hunk search head
  • Documentation here!

Archive Splunk Data

Hunk 6.2.1 enables you to continuously archive your Splunk data to Hadoop by pointing a Hunk search head at your Splunk indexers and configuring a new archive index.
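As a rough sketch, an archive index is configured in indexes.conf on the Hunk search head. The stanza name, provider name, and paths below are placeholders, and the vix.output.* setting names follow the pattern described in the Hunk documentation — treat this as illustrative and check the docs for the authoritative settings:

```
# indexes.conf on the Hunk search head (illustrative values)
[main_archive]
vix.provider = hadoop-provider
# Which Splunk index to archive buckets from
vix.output.buckets.from.indexes = main
# Where in Hadoop the archived buckets should land
vix.output.buckets.path = /user/splunk/archive/main
```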

Searching archived data

You can search archived data in place on Hadoop just as easily as you would search any other Splunk index. There’s no need to move data more than once. This works because Hunk already knows how to efficiently search data in Hadoop. We just had to archive the data in a file structure such that Hunk could efficiently prune the data by time.
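Once the data is archived, searching it looks like searching any other index. A hypothetical example, assuming an archive index named main_archive as above — note that the time bounds are what let Hunk prune archived buckets efficiently:

```
index=main_archive earliest=-90d@d latest=-30d@d
| stats count by sourcetype
```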

Here’s how this works:
[Diagram: Hunk 6.2.1 Archiving — how the feature works.]

Configuration

Most of the configuration lives on the Hunk search head, but you need the Hadoop client libraries and Java installed on all nodes. For more information about configuration, see the documentation.

Copy, not delete

The archiver copies data to Hadoop. It does not delete data from the indexer; we let your Splunk index retention configuration take care of deletion. Data can be copied from the Splunk indexer as soon as it reaches a warm state — we copy both warm and cold bucket data. You can configure how old you want your data to be before it is copied to Hadoop.
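The age threshold is a per-archive-index setting. A minimal sketch, assuming the vix.output.buckets.older.than setting from the Hunk documentation (the value is in seconds; verify the name and units against the docs):

```
# Only copy buckets whose newest event is older than 7 days (604800 seconds)
vix.output.buckets.older.than = 604800
```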

What about precious network bandwidth?

We understand that network bandwidth is a limited resource in your cluster, so we’ve made it easy to configure throttling for the data transfers between your indexers and Hadoop. You can limit the transfer rate, in bits per second, per indexer. See the documentation.
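A hedged sketch of what the throttling configuration looks like — I believe the relevant setting is vix.output.buckets.max.network.bandwidth (bits per second, with 0 meaning unlimited), but the name and semantics here are assumptions to confirm against the documentation:

```
# Cap archive transfers at ~100 Mbit/s per indexer (0 disables throttling)
vix.output.buckets.max.network.bandwidth = 100000000
```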

Moar dashboards!

We’ve made it easy for you to monitor your newly set up archive system through a dashboard built on the logs from the new archiver feature. To view the dashboard on your Hunk search head, go to: “Settings -> Virtual Index -> Archived Indexes -> View Dashboards”.

We hope you will enjoy this new feature! Happy archiving!
– Petter
