Achieving scale with the Kafka Modular Input

A hot topic in my inbox over recent months has been how to achieve scalability with the Kafka Modular Input, primarily in terms of message throughput. I get a lot of emails from users and our own internal Splunk team about this, so rather than continuing to dish out the same replies, I thought I’d just pen a short blog to share some tips and tricks.

So let’s start off with this simple scenario:

  • a single instance of Splunk 6.3
  • the freely available Kafka Modular Input, downloaded and installed from Splunkbase

These are the scaling steps that I would try in order.

Enable HTTP Event Collector output

With the recent release of Splunk 6.3, I also updated the Kafka Modular Input so that it can output its received messages to Splunk via the new HTTP Event Collector. You can read more about this here.

  • Select the HEC option
  • Use HTTP (not HTTPS)
  • Enable Batch Mode. This buffers events in memory until the batch buffer is flushed, according to how you tune the flush settings. You can tune the size of the batch buffer to match the throughput of your Kafka environment, i.e. the higher the throughput, the more a larger batch buffer will pay off (see the sketch below).
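
To make this concrete, here is a rough sketch of what an inputs.conf stanza with HEC output and batch mode enabled might look like. The stanza scheme and parameter names below are illustrative assumptions, so check the app's README/spec file for the exact names in your version:

    # hypothetical inputs.conf stanza for one Kafka consumer connection
    [kafka://my_kafka_input]
    zookeeper_connect_host = zk1.example.com
    zookeeper_connect_port = 2181
    topic_name = my_topic
    group_id = my_test_group
    # HEC output settings (names illustrative)
    hec_https = 0
    hec_batch_mode = 1
    # flush on whichever threshold is hit first: bytes buffered, event count, or interval (ms)
    hec_batch_size_threshold = 524288
    hec_batch_count_threshold = 1000
    hec_batch_flush_interval = 2000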

Set up multiple consumer connections in the same consumer group

This is one area that many users are most likely not aware of. When you set up multiple Kafka stanzas in Splunk, these actually run as multiple consumer threads inside the same single JVM instance. You can aggregate them into the same consumer group by giving them the same Kafka Group ID.

Below are 3 Kafka consumer connection threads in the “my_test_group” consumer group running in the same JVM.

[Screenshot: three Kafka consumer connections in the “my_test_group” consumer group running in the same JVM]
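
In inputs.conf terms, that setup is roughly equivalent to three stanzas sharing the same Group ID, something like the sketch below (stanza and parameter names are illustrative):

    [kafka://consumer1]
    topic_name = my_topic
    group_id = my_test_group

    [kafka://consumer2]
    topic_name = my_topic
    group_id = my_test_group

    [kafka://consumer3]
    topic_name = my_topic
    group_id = my_test_group

Kafka balances a topic's partitions across the members of a consumer group, so there is little point running more consumer connections than the topic has partitions.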


Boost JVM Heap

If you are running many threads (stanzas) inside a single JVM, then you may need to boost the JVM heap settings. This is easy to do.

From the documentation:

[Screenshot: JVM heap settings section of the documentation]
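
Exactly where you set this depends on the version of the app, so treat the following as an illustrative sketch rather than the documented procedure. The underlying idea is simply to raise the heap flags passed to the java process that hosts all of your consumer threads, for example:

    # illustrative heap flags for the modular input's JVM
    -Xms256m -Xmx1024m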


Additional Kafka Consumer settings

For more advanced users, you can also set any of the full palette of Kafka consumer configuration parameters. You just declare these as comma-delimited key=value pairs.

[Screenshot: additional Kafka consumer settings declared as key=value pairs]
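
As an example, a comma-delimited string of standard Kafka 0.8.x high-level consumer properties might look like the line below. The property keys are stock Kafka consumer settings; the values are just placeholders to tune for your own environment:

    fetch.message.max.bytes=2097152,socket.receive.buffer.bytes=1048576,auto.commit.interval.ms=5000,zookeeper.session.timeout.ms=10000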


Going beyond a single instance of Splunk

If the above steps don’t give you enough scale, then you can start to think about horizontal scalability. This is basically just the same steps that I mentioned above, replicated horizontally across (n) Splunk instances.

[Screenshot: horizontal scaling across (n) Splunk instances]
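
As a rough sketch (parameter names again illustrative), each of the (n) Splunk instances runs its own set of stanzas, all pointing at the same Zookeeper quorum and sharing the same Group ID, so that Kafka balances the topic's partitions across every instance:

    # on splunk-hf-01 (hypothetical host)
    [kafka://consumer_a]
    zookeeper_connect_host = zk1.example.com
    topic_name = my_topic
    group_id = my_test_group

    # on splunk-hf-02 (hypothetical host)
    [kafka://consumer_b]
    zookeeper_connect_host = zk1.example.com
    topic_name = my_topic
    group_id = my_test_group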

Another solid post by the Maestro of Modular Inputs!

Damien, would you be able to elaborate on the scalability of the distributed model? Specifically, if deploying the modular input on a pool of UFs, would admins need to take anything into consideration to ensure they don’t duplicate inputs but also ensure no single points of failure? In other words, any “gotchas” around how to ensure redundancy of the system but not redundancy of the data?

October 15, 2015

Zookeeper takes care of the HA/failover aspect of things for you. Each Kafka connection from the Modular Input actually connects to Zookeeper, and Zookeeper then manages the (n) Kafka servers in the distributed Kafka cluster for you. Here’s a nice simple preso I just found online regarding Kafka and Zookeeper:

October 15, 2015

Great work here Damien

Jeff Champagne
October 16, 2015

Classic Damien. He’s got an excellent answer for everything 😉

Thanks man!

October 16, 2015