Deciphering dispatch directory names
Another confusing part of working with dispatch directories is how they are named. You can see the SID value (which is used as the directory name) in the search job inspector, and it seems to contain some meaningful information, but what is all that other stuff?
The dispatch directory name contains several elements, depending on the type of search. All include the time the search was run. In the case of a local ad-hoc search, that by itself is the entire dispatch directory name.
If it is from a saved search, the user requesting the search, the user context it is run as, and the app it came from are included. Searches from remote peers start with “remote”…
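As a rough illustration, the embedded fields can be pulled apart on the double-underscore separators. The exact layout varies by Splunk version, and the example SIDs below are invented, so treat this as a sketch rather than a spec:

```shell
# Hypothetical example SIDs (real names vary by Splunk version).
adhoc_sid="1346978195.13"    # local ad-hoc search: just the epoch time
sched_sid="scheduler__admin__search__mysearch_at_1346978100_5"

# The scheduler-style name splits on "__" into its embedded fields.
echo "$sched_sid" | awk -F'__' \
  '{print "source: " $1; print "user:   " $2; print "app:    " $3; print "search: " $4}'
```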
How long does my search live? Default search ttl
When talking about dispatch directories, it’s important to understand how long a search lives. After a search expires, its artifacts (contained in the dispatch directory) are deleted. Different types of searches have different default ttl values, counted from when the search completes. Here are some examples:
For a regular ad-hoc or saved search run manually, the default ttl is 10 minutes. A remote search from a peer is also 10 minutes.
Scheduled search ttl varies by the selected alert action, if any. If it has multiple actions, the ttl is that of the action with the longest ttl. Without an action, the value is determined by dispatch.ttl in savedsearches.conf, which defaults to twice the schedule period.
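For example, to pin a scheduled search's artifacts to one hour regardless of actions, you can set dispatch.ttl explicitly in savedsearches.conf (the stanza name here is illustrative):

```ini
[my_scheduled_search]
# Keep artifacts for 3600 seconds after the search completes.
# A trailing "p" instead means a multiple of the schedule period,
# e.g. the default of 2p (twice the schedule period).
dispatch.ttl = 3600
```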
Here are actions that affect…
A quick tour of a dispatch directory
Each search has artifacts that need to be saved on disk. This happens in $SPLUNK_HOME/var/run/splunk/dispatch: there is one directory for each search, and it is deleted after the search expires.
Here’s the dispatch directory from a simple search run from the UI. The name is the search ID; for an ad-hoc search, it is the epoch time of the search. (More on the relationship between SIDs and search names in another post.)
# pwd
/Applications/splunk/var/run/splunk/dispatch
# ls 1346978195.13/
args.txt          audited         buckets       events
generate_preview  info.csv        metadata.csv  peers.csv
request.csv       results.csv.gz  runtime.csv   search.log
status.csv        timeline.csv
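Since an ad-hoc SID is just an epoch timestamp, you can translate it back, and poke around for stale dispatch directories, with ordinary shell tools. A sketch, assuming a default install path and GNU date (the -d flag is not portable to BSD):

```shell
# Convert the SID's epoch portion back to a human-readable date (GNU date).
date -u -d @1346978195 '+%Y-%m-%d %H:%M:%S'
# -> 2012-09-07 00:36:35 (UTC)

# List dispatch directories untouched for more than a day (path may differ).
find "$SPLUNK_HOME/var/run/splunk/dispatch" -maxdepth 1 -type d -mtime +1
```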
- args.txt – the arguments passed to the search process
- generate_preview – a flag to indicate this search has requested preview (mainly for UI searches)
Tracking indexing status in splunkd.log and metrics.log
To continue the discussion of internal logs, here are some examples of indexing-related activity in splunkd.log and metrics.log.
This scripted input returned new events
09-03-2012 21:12:50.421 -0700 INFO ExecProcessor - Ran script: /Applications/splunk/splunk7000/etc/apps/unix/bin/iostat.sh, took 8.590 seconds to run, 660 bytes read
This file we already know about has more to read
09-02-2012 07:41:07.093 -0700 INFO WatchedFile - Will begin reading at offset=1228866737 for file='/var/log/apache2/spinnyspinny_access_log'.
This file rolled, so read the new one from the beginning
09-03-2012 00:00:22.310 -0700 INFO WatchedFile - Checksum for seekptr didn't match, will re-read entire file='/var/log/system.log'.
09-03-2012 00:00:22.310 -0700 INFO WatchedFile - Will begin reading at offset=0 for file='/var/log/system.log'.
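To pull these file-tracking entries out of splunkd.log yourself, a plain grep does the job. The path below assumes a default install:

```shell
# Show recent file-reading decisions made by the tailing processor.
grep 'WatchedFile' "$SPLUNK_HOME/var/log/splunk/splunkd.log" | tail -5

# Or look for scripted input runs.
grep 'ExecProcessor' "$SPLUNK_HOME/var/log/splunk/splunkd.log" | tail -5
```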
This new file is one we already read before it rolled
Splunk internal logs: alerting
Here is what you will find if you go looking in Splunk’s internal logs when a scheduled search fires an alert. These actions don’t necessarily happen in exactly this order, but this is typically how I would go about finding evidence of them in the logs.
A regular saved search with an email alert
This is the alert in savedsearches.conf:

[alert1]
action.email = 1
action.email.inline = 1
action.email.to = feorlen@feorlens-MacBook.local
alert.suppress = 0
alert.track = 1
cron_schedule = */5 * * * *
enableSched = 1
search = * | head 3
The search ran as user “admin” and now it tells splunkd to execute the actions. The sendemail search command gets the…
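One quick way to find evidence of the alert firing is to grep the internal logs. Paths assume a default install, and the exact message text varies by version, so treat these as starting points:

```shell
# scheduler.log records each scheduled run of the saved search.
grep 'alert1' "$SPLUNK_HOME/var/log/splunk/scheduler.log" | tail -5

# python.log records the sendemail command's activity for email actions
# (on this era of Splunk, sendemail is a Python search command).
grep -i 'sendemail' "$SPLUNK_HOME/var/log/splunk/python.log" | tail -5
```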
OMG a Blog Post!
It’s been forever since I’ve posted anything, but since I’ll be speaking at .conf2012, there is additional material we couldn’t get into our presentation. The blog is a great way to get that online. Come see Mathew and me talk about Splunk internal logging next week in Vegas!
So you want to write an app
With the previous setup, here’s what I want for my app:
- A dashboard with a couple pretty pictures and some top N lists
- Saved searches for advanced users to explore further
- It should work for all my users, with whatever indexes they have access to
I’m going to start with the sample_app template available in Manager and add what I want. Then I’ll clean up the sample stuff I don’t need. So the first step is to create a new app in Manager->Apps. Give it a name and an optional label and select “sample_app” as the template. I don’t have any additional files to upload now, so I’ll leave that alone. Save and I’m back to the…
List indexes on the main dashboard
If you are comfortable editing XML, here’s a handy hack to get the list of your default indexes in the “All indexed data” dashboard. It will show whatever the logged-in user has access to.
If you are using the standard dashboards from the Search app, do this:
- Go to $SPLUNK_HOME/etc/apps/search/default/data/ui/views
- Copy dashboard.xml to $SPLUNK_HOME/etc/apps/search/local/data/ui/views
- Change the permissions on the file so you can edit it
- Right before the final </view> tag, insert this XML:
<module name="HiddenSearch" layoutPanel="panel_row2_col1_grp4" group="All indexed data" autoRun="True">
  <param name="search">| eventcount summarize=false index=* -count</param>
  <module name="SimpleResultsHeader">
    <param name="entityName">results</param>
    <param name="headerFormat">Indexes (%(count)s)</param>
    <module name="Paginator">
      <param name="count">20</param>
      <param name="entityName">results</param>
      <param name="maxPages">10</param>
      <module name="LinkList">
        <param name="initialSortDir">desc</param>
        <param name="labelFieldSearch">*</param>
        <param name="valueField">count</param>
        <param name="labelField">index</param>
Getting started with 4.0 apps
I’ve been working on some apps for 4.0 and finally I can talk details. Over the next couple posts I’ll walk through creating a simple app using the new UI tools and a little XML. This is all based on the Apache logs on my server, so first a little background on how I’ve configured my 4.0 instance.
I have a typical small server whose primary purpose is to host a dozen or so low traffic websites. One site gets half my hits, three more get most of the rest, and the stragglers round out the lot, attracting bots. Each virtual host has separate access_log and error_log files, but all use the same format: access_common.
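With logs laid out like this, a single monitor stanza with a wildcard picks them all up. A sketch of the inputs.conf entry, with paths illustrative of my layout:

```ini
# inputs.conf: watch every vhost's access log with one stanza
[monitor:///var/log/apache2/*access_log]
sourcetype = access_common
```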
To take advantage of…
inputcsv to restrict a search by a list of field values
A customer asked about a complicated search that could be vastly simplified by using inputcsv to input a list of values from a file, a feature added for 3.3.x. It’s documented as an internal search command here:
We are talking about promoting it to public, so while it says unsupported, it does work. Here’s how:
I’ve got events from my webserver for my new domain and I want to see what real hits it’s getting and not my own. They look like this:
22.214.171.124 - - [23/Oct/2008:01:42:21 -0700] "GET /category/admin/ HTTP/1.1" 200 5158 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
And I’ve gotten some traffic already:
$ ./splunk dispatch 'source=/var/log/apache2/mynewdomain_access_log | stats count'…