Smart AnSwerS #31

Hey there, community, and welcome to the 31st installment of Smart AnSwerS.

Last week, Women in Tech @ Splunk hosted an awesome event in Splunk HQ’s courtyard. It was a panel featuring 5 of our very own Splunkers speaking on their backgrounds, their journeys to Splunk, pivotal moments that helped steer their career paths into the tech industry, and advice for breaking through barriers that women often face in the workplace. A much-needed reality check and discussion surfaced on the lack of diversity in tech and the underrepresentation of communities across gender identity, sexual orientation, race, ethnicity, etc. This was just one of many possible opportunities to make positive change in the culture of the industry and our company. I’m hopeful and looking forward to more folks (myself included) contributing to these types of efforts.

Check out this week’s featured Splunk Answers posts:

What do other users think of our retention policy solution using a nightly scheduled report to search and delete events older than 180 days?

krusty planned to handle data retention in his environment by scheduling a report at the end of each day that searches for data older than 180 days and runs the delete command on the results. Some useful feedback was provided in the comments under the question, and lguinn brings all those points together in a nice comprehensive answer. She bears the news that krusty’s current retention plan is not best practice, since the delete command only makes data unsearchable and does not free up disk space. Instead, the route to go is configuring the indexes.conf setting frozenTimePeriodInSecs in combination with maxHotSpanSecs, so that each bucket spans no more than 1 day of data and rolls to frozen once its data is older than 180 days.
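Both settings are expressed in seconds, so the day counts have to be converted first. Here is a minimal Python sketch of that arithmetic which also prints what the resulting indexes.conf stanza might look like. The index name my_index and the helper functions are hypothetical and only there for illustration; this is not taken from lguinn’s answer, so check the values against your own environment before using them.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def retention_settings(retention_days, bucket_span_days=1):
    """Convert day counts into the second-based values indexes.conf expects."""
    return {
        "frozenTimePeriodInSecs": retention_days * SECONDS_PER_DAY,
        "maxHotSpanSecs": bucket_span_days * SECONDS_PER_DAY,
    }


def render_stanza(index_name, settings):
    """Render the settings as an indexes.conf-style stanza (illustrative only)."""
    lines = [f"[{index_name}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


# "my_index" is a hypothetical index name
settings = retention_settings(180)
print(render_stanza("my_index", settings))
# [my_index]
# frozenTimePeriodInSecs = 15552000
# maxHotSpanSecs = 86400
```

With buckets capped at one day, data ages out in roughly daily slices instead of waiting for a much larger bucket to become entirely older than 180 days.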

What happens in a distributed environment with auto load balancing after the min FreeSpace is reached on an indexer?

mrg2k8 was wondering what would happen if the minimum free disk space on an indexer was reached and auto load balancing was configured on the forwarders in a distributed search environment. Would the forwarders just switch forwarding to another indexer with no issue, or would data be lost during that process and time interval? bmacias84 answers by presenting a key setting to prevent data loss: enabling indexer acknowledgement on the forwarders. This ensures forwarders wait until an indexer has given the OK that all data was received. However, he also cautions that the TCPOUT queue setting may need to be adjusted to avoid blocked data pipeline issues.

How do I search the count of how many times a keyword appears, not the event count?

PeterChu needed help constructing a search to count how many times a particular keyword appeared in his data, not the count of events. A similar scenario was featured in a previous Smart AnSwerS post, but martin_mueller creates a search on the raw data that avoids the mvexpand command, so the solution applies to many events at once. With a little back and forth information gathering and helpful guidance, PeterChu’s problem was solved with a final clean and concise search worth keeping in your back pocket.
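To make the difference between the two counts concrete, here is a small Python sketch run against a few made-up events. It is not martin_mueller’s search (that lives in the Answers thread); the function name and sample data are hypothetical and only illustrate why an event count and a keyword count can differ.

```python
import re


def count_keyword(events, keyword):
    """Return (events containing the keyword, total keyword occurrences)."""
    pattern = re.compile(re.escape(keyword))
    # An event counts once no matter how often the keyword appears in it
    event_count = sum(1 for event in events if pattern.search(event))
    # Every non-overlapping match counts, so one event can contribute several
    occurrence_count = sum(len(pattern.findall(event)) for event in events)
    return event_count, occurrence_count


# Made-up sample events for illustration
sample_events = [
    "error: disk full, error: retry failed",
    "login ok",
    "error: timeout",
]

event_count, occurrence_count = count_keyword(sample_events, "error")
print(event_count, occurrence_count)  # 2 3
```

The first event contains the keyword twice, which is why the occurrence count (3) is higher than the event count (2).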

Thanks for reading!

Missed out on the first thirty Smart AnSwerS blog posts? Check ‘em out here!