More fishbucket fun
For debugging files getting re-indexed, sometimes what I want to see can only be found in the fishbucket index of the affected instance. I can pick up and move an entire index (3.x+) and drop it into another instance, but when working with the fishbucket there are a couple other things to watch out for. I don’t want anything to change it once I put it in the new instance. So I set up a throwaway instance to easily make changes I wouldn’t want to do to a real one.
REALLY BIG WARNING
Don’t do this to any Splunk instance you like. You will be unhappy later. Throw away your dummy instance when you are done so you don’t confuse anybody.
Set…
What is this fishbucket thing?
It’s time for a little Indexing 101. If you look in the directory where your Splunk datastore resides (default location /opt/splunk/var/lib/splunk) you will find a directory called fishbucket. This index is not really intended for normal humans to investigate, more just Splunk engineers trying to decipher file input issues. It contains seek pointers and CRCs for the files you are indexing, so splunkd can tell if it has read them already. To see what’s there, try searching for “index=_thefishbucket”. Events look something like this:
48a304b3 initcrc::5f66db978a1ff3a3 seekcrc::bc96de428cc0b5e6 seekptr::414063 modtime::1218643123 filename::/var/log/apache2/feorlen_org_access_log source::/var/log/apache2/feorlen_org_access_log
The fields are:
timestamp (epoch time, in hex)
CRC of the first 256 bytes of the file
CRC of the 256 bytes where we were last reading
seek pointer for where we are in…














