Sending binary data to Splunk and preprocessing it
A while ago I released an app on Splunkbase called Protocol Data Inputs (PDI) that allows you to send text or binary data to Splunk via many different protocols and dynamically apply preprocessors to act on this data prior to indexing in Splunk. You can read more about it here.
I thought I’d share an interesting use case that I was fiddling around with today. What if I wanted to send compressed data (which is a binary payload) to Splunk and index it? Well, this is trivial to accomplish with PDI.
Choose your protocol and binary data payload
PDI supports many different protocols, but for the purposes of this example I just rolled a die and chose HTTP POST. I could have chosen raw TCP, UDP, SockJS or WebSockets, and the steps in this blog for handling the binary data would be the same.
The same goes for the binary payload. I chose Gzip-compressed data (I could have chosen another compression algorithm) because more people can relate to it in an example blog than to an industry-specific binary protocol like ISO8583 (financial services) or MATIP (aviation), or to binary data encodings such as Avro or ProtoBuf.
Note: Splunk’s HTTP Event Collector can also accept a Gzip payload.
Set up a PDI stanza to listen for HTTP POST requests.
PDI has many options, but for this simple example you only need to choose the protocol and a port number.
Declare the custom handler to apply to the received compressed data (a binary payload).
You can see this above in the Custom Data Handler section. I’ve bundled this custom handler in with the PDI v1.2 release for convenience. Here is the source if you are interested. Handlers can be written in numerous JVM languages and then applied by simply declaring them in your PDI stanza as above and putting the code in the protocol_ta/bin/datahandlers directory. There are more template examples here.
The GZipHandler will intercept the compressed binary payload and decompress it into text for indexing in Splunk.
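To make that step concrete, here is a rough Python illustration of the decompression logic. This is not the bundled handler's source (real PDI handlers are written in a JVM language); the function name `decompress_payload` and the UTF-8 assumption are hypothetical and only there to show the idea.

```python
import gzip

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decompress_payload(payload: bytes, encoding: str = "utf-8") -> str:
    """Turn a gzip-compressed binary payload back into indexable text.

    Hypothetical helper for illustration only; it assumes the
    decompressed bytes are text in the given encoding.
    """
    if not payload.startswith(GZIP_MAGIC):
        raise ValueError("payload does not start with the gzip magic bytes")
    return gzip.decompress(payload).decode(encoding)
```

The magic-byte check is a cheap way to reject payloads that were never compressed before attempting to inflate them.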
Send some test data to Splunk.
I just wrote a simple Python script to HTTP POST a compressed payload to Splunk.
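A minimal sketch of what such a script might look like, using only the Python standard library. The URL, port and `Content-Type` header are assumptions for illustration; use whatever host and port your PDI stanza declares.

```python
import gzip
import urllib.request


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build an HTTP POST request whose body is the gzip-compressed text."""
    body = gzip.compress(text.encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the body is sent as opaque binary for the handler to decode.
        headers={"Content-Type": "application/octet-stream"},
    )


def post_compressed(url: str, text: str, timeout: float = 5.0) -> int:
    """POST the compressed payload and return the HTTP status code."""
    request = build_request(url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With a PDI HTTP input listening on, say, port 8888 (a made-up value), calling `post_compressed("http://localhost:8888/", "hello from a gzip payload")` would send one compressed event.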
Search for the data in Splunk.
I hope this simple example can get you thinking about unleashing all that valuable binary data you have and sending it to Splunk.
STOP THE PRESS – Bonus Appendix
Fire and forget: send compressed JSON (a binary payload) over UDP to Splunk
Open a UDP port using PDI and declare the custom GZip handler
Write a simple test program to send some compressed JSON over UDP to this input
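A test program along these lines could be sketched in Python as follows. The function names, host and port are hypothetical; the port must match the UDP input you opened in PDI.

```python
import gzip
import json
import socket


def encode_event(event: dict) -> bytes:
    """Serialize an event to JSON and gzip-compress it."""
    return gzip.compress(json.dumps(event).encode("utf-8"))


def send_udp(payload: bytes, host: str, port: int) -> int:
    """Send one datagram and return the number of bytes sent.

    UDP is fire and forget: there is no acknowledgement, so a
    successful send does not prove the data was received.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

For example, `send_udp(encode_event({"message": "hello"}), "localhost", 9999)` sends a single compressed JSON event (9999 is a made-up port). Each event must fit in one datagram, which over IPv4 means at most 65,507 bytes after compression.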
Search in Splunk!