Custom Message Handling and HEC Timestamps with the Kafka Modular Input
Custom Message Handling
If you follow any of my Modular Inputs on Splunkbase, you may have noticed that I employ a similar design pattern across all of them: the ability to declaratively plug in your own parameterizable custom message handler, which acts on the raw received data before it is output to Splunk for indexing. This affords many benefits:
- Many of my Modular Inputs are cross-cutting in the types and formats of data they will encounter once they are let loose in the wild, and I can’t anticipate every data scenario. An extensible design lets users and the community customize the data handling as they require by creating their own custom handlers, which relieves me of having to hard-code logic for every data scenario into the Modular Input.
- Custom message handlers allow you to pre-process data, perhaps filtering out data you don’t require or performing some pre-computations.
- They allow custom formatting of the event that gets sent to Splunk for indexing, such as transforming some ghastly XML into a simpler JSON event.
- They can handle non-standard data types, e.g. binary data that you might receive over a messaging channel, such as compressed or encrypted data, a proprietary binary protocol, or a charset encoding that Splunk can’t parse, like EBCDIC.
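To make the pattern concrete, here is a minimal, self-contained sketch of what a parameterizable handler could look like. The MessageHandler interface, the GzipFilteringHandler class and the drop_if_contains parameter are all hypothetical names invented for illustration; they are not the actual base class or config keys of any of my Modular Inputs, so check the docs and source of the respective Modular Input for the real contract. The sketch assumes the payload is gzip-compressed UTF-8 text, decompresses it, and optionally filters it out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class HandlerSketch {

    // Hypothetical contract, NOT the Modular Input's real base class.
    interface MessageHandler {
        // Receives the key/value parameters declared in the config stanza.
        void configure(Map<String, String> params);

        // Returns the event text to index, or empty to drop the message.
        Optional<String> handle(byte[] rawMessage);
    }

    // Decompresses gzip payloads and drops any event containing a configured substring.
    static class GzipFilteringHandler implements MessageHandler {
        private String dropIfContains; // hypothetical parameter; null means "keep everything"

        @Override
        public void configure(Map<String, String> params) {
            dropIfContains = params.get("drop_if_contains");
        }

        @Override
        public Optional<String> handle(byte[] rawMessage) {
            String text;
            try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(rawMessage))) {
                text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (dropIfContains != null && text.contains(dropIfContains)) {
                return Optional.empty(); // filtered out before it reaches Splunk
            }
            return Optional.of(text);
        }
    }

    // Test helper: produces a gzip-compressed payload like one a producer might send.
    static byte[] gzip(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        GzipFilteringHandler handler = new GzipFilteringHandler();
        handler.configure(Map.of("drop_if_contains", "DEBUG"));
        System.out.println(handler.handle(gzip("level=INFO msg=started")));
        System.out.println(handler.handle(gzip("level=DEBUG msg=noise")));
    }
}
```

The point of the design is that the decoding and filtering logic lives in the handler, and the behaviour is tuned through declared parameters rather than by changing the Modular Input itself.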
A couple of simple examples with the Kafka Modular Input
By default, the Kafka Modular Input uses its own internal DefaultMessageHandler, which simply wraps Kafka messages in a KV string along with some other event metadata.
I ship the Modular Input with 2 other message handlers that you can declaratively plug in to your config stanza (screenshots shown below). These are more oriented to JSON payloads being received from Kafka, a pretty common scenario in the Kafka world.
The first handler simply indexes the original raw event received from Kafka, such as a JSON string, with no additional meta fields. You can use this handler with the STDOUT or HEC output channels.
The second handler is designed for when you are using the HEC output option and the data received from Kafka is a JSON string. The HEC payload format allows you to specify a “time” field that will be applied to your indexed event as its event timestamp (_time). This handler lets you declare which field in the JSON received from Kafka contains the time data; that value is extracted and added to the HEC payload’s “time” field sent to Splunk for indexing.
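As a rough illustration of the time extraction idea, the sketch below builds a HEC-style payload with a “time” field (epoch seconds with a fractional part) and the original JSON as the “event”. It is not the shipped handler's code: the class and method names are hypothetical, it assumes the time field holds an ISO-8601 string, and it uses a regex rather than a JSON parser purely to stay dependency-free, so it matches the first occurrence of the field name anywhere in the text.

```java
import java.time.Instant;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HecTimeSketch {

    // Pulls an ISO-8601 string value out of the JSON text by field name.
    // Assumption: the value is a quoted ISO-8601 instant such as 2016-03-01T12:00:00.500Z.
    static Instant extractTime(String json, String timeField) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(timeField) + "\"\\s*:\\s*\"([^\"]+)\"");
        Matcher m = p.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("time field not found: " + timeField);
        }
        return Instant.parse(m.group(1));
    }

    // Wraps the original JSON in a HEC-style envelope with "time" in epoch seconds.millis.
    static String toHecPayload(String json, String timeField) {
        Instant t = extractTime(json, timeField);
        String epoch = String.format(Locale.ROOT, "%d.%03d",
                t.getEpochSecond(), t.getNano() / 1_000_000);
        return "{\"time\":" + epoch + ",\"event\":" + json + "}";
    }

    public static void main(String[] args) {
        // "ts" is a hypothetical field name chosen for this example.
        String fromKafka = "{\"ts\":\"1970-01-01T00:01:40.250Z\",\"msg\":\"hello\"}";
        System.out.println(toHecPayload(fromKafka, "ts"));
        // prints {"time":100.250,"event":{"ts":"1970-01-01T00:01:40.250Z","msg":"hello"}}
    }
}
```

A real handler would want a proper JSON parser and a configurable time format, but the shape of the output is the part that matters here: the timestamp travels in the envelope, and the event body stays untouched.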
I hope these simple tips come in handy and get you thinking about the extensibility of my Modular Inputs. If you need to create a custom message handler, start by reading the docs for the respective Modular Input and looking at the source code examples on GitHub. And as always, reach out to me at any time!