This week in “That happened: notes from #splunk”, a blog about the goings-on in the Splunk IRC channel: troubleshooting a full disk with debug logging on, living with ADD, and enabling management pr0n.
BUT FIRST a message from our sponsors: We just released the code behind our custom-built Splunk documentation platform, Ponydocs, as open source! Check out the blog post from Ashley, our WebDev manager; the docs I wrote for users of Ponydocs; and the GitHub repo. And now we return you to your regularly scheduled IRC shenanigans:
Sometimes it’s hard to be the almighty tallest^Wduckfez:
<duckfez> oh my aching head
<duckfez> colleague #1 – “app xyz errored out due to a disk full”
<duckfez> colleague #2 – “Well, that disk is full because app xyz has multiple threads doing runaway logging”
<duckfez> colleague #3 – “OK, I turned off everything but debug logging”
<troj> ahahahaha
<ftk> ftw
<Draineh> haha
ftk tells us to stop and smell the roses:
<^Brian^> anyone ever notice that servers boot faster if you aren’t consoled into them and watching the things go by
<duckfez> ^Brian^: and the chance to hit “F1” for setup is only 250ms long when you’re not looking
<^Brian^> duckfez: i had a hell of a time hitting f12 for the pxe boots
<^Brian^> i was like wtf, i can’t look away
<ftk> ^Brian^: why would you need to look away
<^Brian^> ftk: cause i’m too busy talking with my cow-orker
<ftk> what could be more important than being prepared to hammer ALT+R in ilo
<JPres> ftk: ADHD
<^Brian^> heh
<ftk> just turn of the twitters and facebooks for a minute
<ftk> it’ll be ok i promise
All in the service of management pr0n:
<edeca> I have data in a folder structure like: /blah/<host>/<type>/0001.txt – is it possible to make directory monitoring follow all /blah/*/<type>/*.txt easily?
<duckfez> edeca: sure, [monitor:///blah] whitelist=^/blah/[^/]+/type/.*\.txt
<edeca> Ah clever, I see what you did there!
<duckfez> (noting that splunk will recurse through all of /blah looking for files to match that white list)
<edeca> Cheers, that’s brilliant. And I can pick out hostname with a regex.
<edeca> s/a/another
<ziegfried> host_segment = 2 might be slightly more efficient
<duckfez> if you want the hostname in the path to be the “host=” of the event, then use host_segment=2
<duckfez> like ziegfried just said
<edeca> Nice. All this is too easy.
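A quick aside for anyone following along at home: here is a rough Python sketch of what the whitelist regex and host_segment=2 are doing to a path. It is only an illustration, not Splunk's own code. It assumes the whitelist is matched unanchored against the full path and that path segments are counted from 1. The hostname web01 and the two helper function names are made up for the example, and the literal "type" directory is kept exactly as duckfez typed it.

```python
import re

# Regex copied from duckfez's whitelist line in the chat.
WHITELIST = re.compile(r"^/blah/[^/]+/type/.*\.txt")


def matches_whitelist(path: str) -> bool:
    """Return True if the full path passes the whitelist regex.

    Assumption: the regex is applied as an unanchored search on the full path.
    """
    return WHITELIST.search(path) is not None


def host_from_segment(path: str, segment: int = 2) -> str:
    """Return the Nth '/'-separated path segment, counting from 1.

    Hypothetical helper that mimics what host_segment=2 is described as doing.
    """
    parts = [p for p in path.split("/") if p]
    return parts[segment - 1]


if __name__ == "__main__":
    sample = "/blah/web01/type/0001.txt"  # web01 is a made-up hostname
    print(matches_whitelist(sample))
    print(host_from_segment(sample))
```

With a path like /blah/web01/type/0001.txt, segment 1 is blah and segment 2 is the host directory, which is why ziegfried's suggestion saves writing a separate hostname regex.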
<edeca> I need to read up on the indexing, some stuff gets indexed twice if the data files are overwritten (even with identical data)
<duckfez> it’s the speed of the overwrite, likely
<edeca> Ah, because the CRC of the file end doesn’t match?
<duckfez> splunk sees, “well, I know this file used to be here, and it was X bytes.. now its way < X, and the end CRC doesn’t match … perhaps it got rolled?”
<edeca> Can I tell it that files will never get rolled?
<duckfez> that I’m not sure of … what are you doing, rsyncing into a tree?
<edeca> Which is my situation, as I control all the data splunk indexes (not using it for host monitoring, vis/searching of other event data)
<edeca> Another node does some data processing then moves across the output
<edeca> And sometimes (annoyingly) reprocesses files
<duckfez> make your moves atomic
<duckfez> eg, don’t use ‘mv’ across filesystems, and if you must, ‘mv’ to a temporary name first, then a second ‘mv’ to overwrite the first file atomically
<duckfez> from an atomic operations point of view, cp /foo/bar /baz/.bar.tmp && mv /baz/.bar.tmp /baz/bar ; is vastly superior to cp /foo/bar /baz/bar
<edeca> That makes sense, nice idea.
<edeca> Moving data across NFS links is how it works, so I like it.
<ziegfried> heh, thx
<edeca> Nice, I now have loads of extra data (>3 million in 20 seconds..) flowing into splunk with the host type auto recognised. Cheers guys!
<edeca> Time to draw some management pr0n^W^Wgraphs
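For the curious, duckfez's "copy to a temporary name, then rename over the target" advice can be sketched in Python roughly as follows. This is a minimal sketch, not anything edeca actually ran. The function name atomic_copy is invented for the example, and it assumes a POSIX-style filesystem where renaming within one directory replaces the target in a single step.

```python
import os
import shutil
import tempfile


def atomic_copy(src: str, dst: str) -> None:
    """Copy src to dst so a reader never sees a half-written dst.

    Hypothetical helper: the data is first copied to a hidden temporary file
    in the same directory as dst, then renamed over dst in one step.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Create the temp file next to dst so the final rename never crosses filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".", suffix=".tmp")
    os.close(fd)
    try:
        shutil.copy(src, tmp_path)  # slow part: copies data and permission bits
        os.replace(tmp_path, dst)   # fast part: rename over the old file
    except BaseException:
        # Clean up the temp file if the copy or rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "source.txt")
    target = os.path.join(workdir, "bar.txt")
    with open(source, "w") as handle:
        handle.write("reprocessed output\n")
    atomic_copy(source, target)
    with open(target) as handle:
        print(handle.read(), end="")
```

The slow copy happens under a name ending in .tmp, which the whitelist from the chat would not match, so the monitored .txt file only ever changes in the instant of the rename.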
----------------------------------------------------
Thanks!
rachel perkins