<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Simeon's Blog</title>
	<atom:link href="http://blogs.splunk.com/simeon/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://blogs.splunk.com/simeon</link>
	<description>Topics which are useful to any Splunk user</description>
	<pubDate>Mon, 03 Aug 2009 18:16:38 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>40 Days of 4.0:  Distributed searching</title>
		<link>http://blogs.splunk.com/simeon/?p=4</link>
		<comments>http://blogs.splunk.com/simeon/?p=4#comments</comments>
		<pubDate>Thu, 30 Jul 2009 20:09:06 +0000</pubDate>
		<dc:creator>simeon</dc:creator>
		
		<category><![CDATA[distributed search]]></category>

		<category><![CDATA[filter]]></category>

		<category><![CDATA[roles]]></category>

		<category><![CDATA[saved search]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/simeon/?p=4</guid>
		<description><![CDATA[If you are a long-time enterprise user of the 3.x product, you may have become used to the pull-down menu for distributed searching.   One of the common use cases for this menu was searching specific indexers in your distributed search.   A common question was:  &#8220;Can we restrict the server via search syntax?&#8221;   In [...]]]></description>
			<content:encoded><![CDATA[<p>If you are a long-time enterprise user of the 3.x product, you may have become used to the pull-down menu for distributed searching. One of the common use cases for this menu was searching specific indexers in your distributed search. A common question was: &#8220;Can we restrict the server via search syntax?&#8221; In the 3.3 and 3.4 products, you cannot restrict via syntax through the web interface. There is a trick you can use via the command line, but that doesn&#8217;t help when you want to do this in a saved search.</p>
<p>In the 4.0 release, we have removed the pull-down menu and implemented indexer restrictions with search syntax. The new parameter is called &#8220;splunk_server&#8221;.   Let&#8217;s assume I have a distributed searcher (hostname=searcher1) and three indexers (hostname=indexer1, hostname=indexer2, and hostname=indexer3).  If I am searching for &#8220;error&#8221; and my goal is to restrict my searches to indexer3, I would use the following query:</p>
<p style="padding-left: 30px;"><strong>splunk_server=indexer3 error</strong></p>
<p>To search anything but indexer3 I would use:</p>
<p style="padding-left: 30px;"><strong>error NOT splunk_server=indexer3</strong></p>
<p>This restriction can be useful for tracking specific datacenters, monitoring server health, and securing data (you can add it as a filter on a role).  For complete details on this parameter, see the official documentation:</p>
<p><a href="http://www.splunk.com/base/Documentation/latest/User/SpecifyMultipleServersToSearch">http://www.splunk.com/base/Documentation/latest/User/SpecifyMultipleServersToSearch</a></p>
<p>Note: distributed searching is limited to the Splunk Enterprise version.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/simeon/?feed=rss2&amp;p=4</wfw:commentRss>
		</item>
		<item>
		<title>Monitoring input files with a white list</title>
		<link>http://blogs.splunk.com/simeon/?p=3</link>
		<comments>http://blogs.splunk.com/simeon/?p=3#comments</comments>
		<pubDate>Thu, 09 Jul 2009 21:46:31 +0000</pubDate>
		<dc:creator>simeon</dc:creator>
		
		<category><![CDATA[blacklist]]></category>

		<category><![CDATA[inputs]]></category>

		<category><![CDATA[monitor]]></category>

		<category><![CDATA[whitelist]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/simeon/?p=3</guid>
		<description><![CDATA[There are many ways to feed data into Splunk.  One method is to monitor the files within a directory.  In the default &#8216;monitor&#8217; configuration, Splunk will try to index all files within a specified directory.  In some cases, you may have a directory which contains many files including some that you do [...]]]></description>
			<content:encoded><![CDATA[<p>There are many ways to feed data into Splunk.  One method is to monitor the files within a directory.  In the default &#8216;monitor&#8217; configuration, Splunk will try to index all files within a specified directory.  In some cases, you may have a directory which contains many files, including some that you do not want to index.  Splunk can be configured to index specific file types as well as specific subdirectories.  Here is a real-world working example of how to use a white list&#8230;</p>
<p>Let us assume we want to index certain compressed files (*.gz) whose file names start with &#8220;200906&#8221;.  One of the file names is &#8220;20090631.gz&#8221;.  These files exist in a specific directory:  &#8220;/storage/datacenter/host1/webserver&#8221;.  To make things more interesting, I have other *.log files in that directory.  There are also other subdirectories within datacenter (such as host2, router1, router2).  I want to index only the &#8220;host&#8221; (host1 and host2) files and exclude any router files.  Additionally, there are appserver and system directories under each host directory, which should not be indexed either.  Conceptually, you want to do the following:</p>
<p style="padding-left: 30px;"><strong> * Tell Splunk to monitor the /storage/datacenter directory</strong><br />
<strong>* Set a whitelist for this input</strong><br />
<strong>* Edit the REGEX to match all files that contain &#8220;host&#8221; in the underlying path</strong><br />
<strong>* Edit the REGEX to match all files that contain &#8220;webserver&#8221; in the underlying path</strong><br />
<strong>* Edit the REGEX to match all files that start with &#8220;200906&#8221;</strong><br />
<strong>* Edit the REGEX to match all files that end with &#8220;.gz&#8221;</strong></p>
<p>Your final stanza in the $SPLUNK_HOME/etc/system/local/inputs.conf file would resemble the following:</p>
<p><strong>[monitor:///storage/datacenter/]</strong><br />
<strong>sourcetype=gzfiles</strong><br />
<strong>_whitelist=host[^/]*/webserver/[^/]*200906[^/]*\.gz$</strong></p>
<p>The above stanza would index the following files:</p>
<p style="padding-left: 30px;">/storage/datacenter/host1/webserver/20090601.gz<br />
/storage/datacenter/host1/webserver/20090602.gz<br />
/storage/datacenter/host2/webserver/20090601.gz<br />
/storage/datacenter/host2/webserver/20090602.gz</p>
<p>The above stanza would NOT index the following files or directories:</p>
<p style="padding-left: 30px;">/storage/datacenter/logfile.txt<br />
/storage/datacenter/router1/logfile.log<br />
/storage/datacenter/host1/appserver/20090601.gz<br />
/storage/datacenter/host2/webserver/20090601.txt</p>
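<p>As a quick sanity check, the whitelist pattern can be exercised outside of Splunk.  The sketch below is only an approximation: it uses Python&#8217;s re module as a stand-in for Splunk&#8217;s own regex matching and assumes the pattern is applied, unanchored, to the full file path.  The is_whitelisted helper and the PATHS list are names made up for this illustration; the paths themselves come from the two lists above.</p>

```python
import re

# The whitelist pattern from the stanza above, searched for within the full path.
# Assumption: Python's re behaves like Splunk's matcher for this simple pattern.
WHITELIST = re.compile(r"host[^/]*/webserver/[^/]*200906[^/]*\.gz$")

# Sample paths taken from the "index" and "NOT index" lists above.
PATHS = [
    "/storage/datacenter/host1/webserver/20090601.gz",
    "/storage/datacenter/host2/webserver/20090602.gz",
    "/storage/datacenter/logfile.txt",
    "/storage/datacenter/router1/logfile.log",
    "/storage/datacenter/host1/appserver/20090601.gz",
    "/storage/datacenter/host2/webserver/20090601.txt",
]


def is_whitelisted(path):
    """Return True when the whitelist pattern matches somewhere in the path."""
    return WHITELIST.search(path) is not None


if __name__ == "__main__":
    # Print one verdict per sample path.
    for path in PATHS:
        verdict = "index" if is_whitelisted(path) else "skip"
        print(verdict, path)
```

<p>One thing this makes visible: because of the [^/]* in front of 200906, the pattern as written accepts any file name that contains 200906, not only names that start with it.  If you need the stricter behavior, dropping that leading [^/]* (so the pattern reads webserver/200906[^/]*\.gz$) should do it, but test it against your own paths first.</p>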
<p>The following doc was referenced and can be viewed for more details:  <a href="http://www.splunk.com/base/Documentation/latest/Admin/WhitelistAndBlacklistRules">http://www.splunk.com/base/Documentation/latest/Admin/WhitelistAndBlacklistRules</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/simeon/?feed=rss2&amp;p=3</wfw:commentRss>
		</item>
		<item>
		<title>Splunk Dashboards outside of Splunk (part 2)</title>
		<link>http://blogs.splunk.com/simeon/?p=2</link>
		<comments>http://blogs.splunk.com/simeon/?p=2#comments</comments>
		<pubDate>Mon, 22 Jun 2009 22:29:52 +0000</pubDate>
		<dc:creator>simeon</dc:creator>
		
		<guid isPermaLink="false">http://blogs.splunk.com/simeon/?p=2</guid>
		<description><![CDATA[I recently blogged about a cool open source tool which is a Splunk Dashboard.  In less than an hour, you could easily bring up a central dashboard to visually oversee Splunk administration duties.  Here is a basic review of how to get the dashboard working, in combination with the Check Splunk tool.
Prerequisites:

spdash
checksplunk
crontab competency
ssh [...]]]></description>
			<content:encoded><![CDATA[<p>I recently blogged about a cool open source tool which is a <a href="http://blogs.splunk.com/simeon/?p=1">Splunk Dashboard</a>.  In less than an hour, you could easily bring up a central dashboard to visually oversee Splunk administration duties.  Here is a basic review of how to get the dashboard working, in combination with the Check Splunk tool.</p>
<p>Prerequisites:</p>
<ul>
<li><a href="http://www.ugu.com/sui/ugu/show?I=software.spdash">spdash</a></li>
<li><a href="http://www.ugu.com/sui/ugu/show?I=software.checksplunk">checksplunk</a></li>
<li>crontab competency</li>
<li>ssh competency</li>
<li>web server competency</li>
<li>cgi-bin competency</li>
</ul>
<p>Even if you are not very familiar with the above items, there is plenty of information available on the web to get things going.  The README files that come along with the tools are very useful and should be reviewed before proceeding.  The following steps are an outline of what I performed to get the dashboard working:</p>
<p>Step 1:  Install the spdash software on the web server host</p>
<ul>
<li>Installed onto my Linux server splunkdemo1</li>
<li>Installation consisted of: enabling the web server and placing the spdash scripts into the cgi-bin location</li>
<li>Runs on top of the OS-installed Apache web server from /var/www/cgi-bin/spdash</li>
<li>Runs on port 80</li>
<li>Edited the spdash script so that the $STAT directory is located in /opt/demos/splunkdash/status</li>
<li>Created the above directory so that it contains ALL of the files used to compose spdash.  Logs, statistics, etc. are stored here</li>
</ul>
<p>Step 2:  Install the checksplunk software on the Splunk server</p>
<ul>
<li>Installed onto my Linux server splunkdemo1</li>
<li>Installation consisted of: placing the checksplunk script in its own directory, creating a directory to store results, and enabling a local crontab to run checksplunk at a regular interval (see step 3 for the example command)</li>
<li>OPTIONAL - Install checksplunk onto your other Splunk servers (my example uses hosts located at 10.1.1.1 and 10.1.1.2)</li>
</ul>
<p>Step 3:  Retrieve the checksplunk data</p>
<ul>
<li>Set up a crontab on the web server host to retrieve the checksplunk data</li>
</ul>
<p>My crontab on splunkdemo1 is as follows:</p>
<pre>splunkdemo1&gt;crontab -l
*/5 * * * * /opt/demos/splunkdash/j2ee/checksplunk spdash
*/7 * * * * /opt/demos/splunkdash/email/checksplunk spdash
*/8 * * * * scp root@10.1.1.1:/opt/splunkdash/status/interop* /opt/demos/splunkdash/status/
*/6 * * * * scp root@10.1.1.2:/opt/splunkdash/status/cmdemo* /opt/demos/splunkdash/status/</pre>
<p>You will notice that I am running two remote secure copies and two local checksplunk commands.  The local checksplunks are configured to feed data to the /opt/demos/splunkdash/status directory.</p>
<p>Once you have checksplunk data feeding to the status directory, the cgi script should immediately pick up the data.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/simeon/?feed=rss2&amp;p=2</wfw:commentRss>
		</item>
		<item>
		<title>Splunk Dashboards outside of Splunk</title>
		<link>http://blogs.splunk.com/simeon/?p=1</link>
		<comments>http://blogs.splunk.com/simeon/?p=1#comments</comments>
		<pubDate>Thu, 21 May 2009 22:32:16 +0000</pubDate>
		<dc:creator>simeon</dc:creator>
		
		<category><![CDATA[dashboard]]></category>

		<category><![CDATA[splunk admin]]></category>

		<category><![CDATA[web tools]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/simeon/?p=1</guid>
		<description><![CDATA[I was recently given access to an open source tool called spdash.  This tool allows you to externally visualize Splunk health from an administrative standpoint.  It consists of some cgi code and leverages a set of scripts (checksplunk) that grab health information from one or more Splunk instances.    Information such as basic process [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently given access to an open source tool called <a href="http://www.ugu.com/sui/ugu/show?I=software.spdash">spdash</a>.  This tool allows you to externally visualize Splunk health from an administrative standpoint.  It consists of some cgi code and leverages a set of scripts (<a href="http://www.ugu.com/sui/ugu/show?I=software.checksplunk">checksplunk</a>) that grab health information from one or more Splunk instances.  Information such as basic process status, listings of event counts, user-specific search counts, and error messages is all presented on one intuitive screen.  Check out the main dashboard page:</p>
<p><span id="more-1"></span></p>
<p><img style="vertical-align: middle;" src="http://www.ugu.com/software/spdash/dash1.jpg" alt="spdash" width="1024" height="310" /></p>
<p>After installing and running it internally on some of our systems, I have come away very impressed with what this can do for the System Administrator of a Splunk instance.   One of the great features is the server link which allows you to get specific server information.  Here is a screen capture of that screen:</p>
<p><img src="http://www.ugu.com/software/spdash/dash2.jpg" alt="spdash drill down" width="1024" height="605" /></p>
<p>When I first saw this being developed, I thought that it might be challenging to deploy.  After less than an hour, I had a handful of servers sending and updating data to this dashboard.  Now it&#8217;s no cakewalk, but it&#8217;s pretty straightforward.  If you are very familiar with Splunk, have scripting experience, and can manage cgi on a web server, then you should have no trouble.  Kudos to the author, Kirk Waingrow, for making this available to the general public!  If you are a System Administrator and manage Splunk, I would highly recommend you check this out.</p>
<p>I will post a follow up that will contain details on my deployment&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/simeon/?feed=rss2&amp;p=1</wfw:commentRss>
		</item>
	</channel>
</rss>
