<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Splunk Blogs</title>
	<atom:link href="http://blogs.splunk.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.splunk.com</link>
	<description></description>
	<lastBuildDate>Wed, 23 May 2012 06:16:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>That happened: episode 10</title>
		<link>http://blogs.splunk.com/2012/05/21/that-happened-episode-10/</link>
		<comments>http://blogs.splunk.com/2012/05/21/that-happened-episode-10/#comments</comments>
		<pubDate>Tue, 22 May 2012 00:46:31 +0000</pubDate>
		<dc:creator>rachel perkins</dc:creator>
				<category><![CDATA[Life at Splunk]]></category>
		<category><![CDATA[Tips & Tricks]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7996</guid>
		<description><![CDATA[<p>This week in &#8220;That happened: notes from #splunk&#8221;, a blog about the goings-on in the Splunk IRC channel: unfair karma practices, using your own supply, finding one-another at Splunk&#62; Live!, and sweet harmonies.</p>
<h2>Karma trickery</h2>
<p>Maybe all Drainy needs is a little friendly competition:</p>
<p>&#60;<strong>Drainy</strong>&#62; What! I just <a href="http://splunk-base.splunk.com/answers/">answered</a> someone&#8217;s question, they commented that it was the slashes, then they posted it as their own  answer and accepted that<br />
* <strong>Drainy </strong>waves fist<br />
&#60;<strong>kkolb</strong>&#62; Drainy: You just want to get ahead in the Karma race. (breathing down your neck)<br />
&#60;<strong>Drainy</strong>&#62; damn straight <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
&#60;<strong>kkolb</strong>&#62; <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
&#60;<strong>Drainy</strong>&#62; I&#8217;ve spent a good month off it, come back and all these&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>This week in &#8220;That happened: notes from #splunk&#8221;, a blog about the goings-on in the Splunk IRC channel: unfair karma practices, using your own supply, finding one-another at Splunk&gt; Live!, and sweet harmonies.</p>
<h2>Karma trickery</h2>
<p>Maybe all Drainy needs is a little friendly competition:</p>
<p>&lt;<strong>Drainy</strong>&gt; What! I just <a href="http://splunk-base.splunk.com/answers/">answered</a> someone&#8217;s question, they commented that it was the slashes, then they posted it as their own  answer and accepted that<br />
* <strong>Drainy </strong>waves fist<br />
&lt;<strong>kkolb</strong>&gt; Drainy: You just want to get ahead in the Karma race. (breathing down your neck)<br />
&lt;<strong>Drainy</strong>&gt; damn straight <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
&lt;<strong>kkolb</strong>&gt; <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
&lt;<strong>Drainy</strong>&gt; I&#8217;ve spent a good month off it, come back and all these young upstarts are right on my back<br />
&lt;<strong>Drainy</strong>&gt; fancy a race? See who has the highest karma come <a href="http://www.splunk.com/view/SP-CAAAGEF">.conf</a>?<br />
&lt;<strong>kkolb</strong>&gt; picking up speed  :-)<br />
&lt;<strong>kkolb</strong>&gt; ok. no vacation this summer.</p>
<h2>First one&#8217;s Free!</h2>
<p>mlanghor hooks another one:</p>
<p>&lt;<strong>mlanghor</strong>&gt; <a href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Rex">rex </a>to the rescue: sourcetype=networker_daemon host=folileg1 &#8220;There is already&#8221; peer | rex field=_raw &#8220;entry\sfor\s\&#8221;(?&lt;peer&gt;[^\"]+)\&#8221;" |<a href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Dedup">dedup</a> peer| <a href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Table">table</a> peer<br />
&lt;<strong>mlanghor</strong>&gt; had to help him create the rex string, but this EMC guy is loving Splunk.  he&#8217;s going to hate his next gig<br />
&lt;^<strong>Brian</strong>^&gt; hah..why?<br />
&lt;<strong>mlanghor</strong>&gt; &#8217;cause he won&#8217;t have Splunk for his Legato logs<br />
&lt;<strong>mlanghor</strong>&gt; back to tail, grep, etc.  this sucks &#8230;<br />
&lt;<strong>mlanghor</strong>&gt; ^Brian^: we&#8217;ve had an EMC guy here for a couple months now helping us with Legato backups, but he&#8217;s leaving next week. First day I gave him access to Splunk so we didn&#8217;t need to create an os account.  he loves it<br />
&lt;^<strong>Brian</strong>^&gt; mlanghor: nice..getting him addicted and then making him quit cold turkey</p>
<h2>Meeting in real life is complicated</h2>
<p>On the internet, <a href="http://www.flickr.com/photos/djpiebob/2304187726/">no one knows you&#8217;re not a bunny</a>:</p>
<p>&lt;<strong>Coccyx</strong>&gt; Nerf: I&#8217;m the guy the with the camera<br />
&lt;<strong>Coccyx</strong>&gt; that keeps getting up<br />
&lt;<strong>Coccyx</strong>&gt; and walking up to the front<br />
&lt;<strong>Coccyx</strong>&gt; I&#8217;m in a navy blue jacket, sitting in the back left right now<br />
&lt;<strong>Nerf</strong>&gt; I&#8217;m the guy in the middle with the bright green shirt<br />
&lt;<strong>Coccyx</strong>&gt; i can see a bright green shirt from here, that&#8217;s probably you<br />
&lt;<strong>Drainy</strong>&gt; Nerf: just to be safe, go buy a rose and put it in your mouth in case there are a couple of green shirts<br />
&lt;<strong>Nerf</strong>&gt; I&#8217;m 10 feet from the podium &#8211; Also the only one on IRC<br />
&lt;<strong>Nerf</strong>&gt; I go by &#8220;Chris&#8221; in public.  Gets fewer weird looks<br />
&lt;<strong>Coccyx</strong>&gt; I&#8217;m &#8220;Clint&#8221;, because Coccyx is hard to say</p>
<h2>Don&#8217;t scare the bosses</h2>
<p>pde speaks from experience:</p>
<p>&lt;<strong>pde</strong>&gt; protip: if you&#8217;re plotting a line called linear_trend, $management will become confused if the y axis is logarithmic<br />
&lt;<strong>pde</strong>&gt; . o O { <a href="http://i.imgur.com/lrHDX.jpg">the more you know</a> }</p>
<h2>Mister Splunkman, search me a dream</h2>
<p>There&#8217;s still plenty of time to get your (barbershop) act together for <a href="http://www.splunk.com/view/SP-CAAAGEF">userconf</a>:</p>
<p>&lt;<strong>DaGryph</strong>&gt; Morning!<br />
&lt;<strong>Drainy</strong>&gt; Morning!<br />
&lt;<strong>mattd</strong>&gt; Morning!<br />
&lt;<strong>Drainy</strong>&gt; <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  what a pretty step effect<br />
&lt;<strong>Drainy</strong>&gt; fancy starting up a barbershop triplet?<br />
&lt;<strong>DaGryph</strong>&gt; Yeah!<br />
&lt;<strong>DaGryph</strong>&gt; We&#8217;d need a name&#8230;<br />
&lt;<strong>Drainy</strong>&gt; The Singing Splunkers, Da rain matt, Splunking your audio..<br />
&lt;<strong>DaGryph</strong>&gt; Quartet.conf<br />
&lt;<strong>Drainy</strong>&gt; The Universal Forwarders?<br />
&lt;<strong>Drainy</strong>&gt; haha<br />
&lt;<strong>DaGryph</strong>&gt; Lol.<br />
&lt;<strong>DaGryph</strong>&gt; UFs for short.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/21/that-happened-episode-10/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Looking for apps that support a specific version of Splunk?  We&#8217;ve got you covered.</title>
		<link>http://blogs.splunk.com/2012/05/21/looking-for-apps-for-specific-version-of-splunk-weve-got-it-here/</link>
		<comments>http://blogs.splunk.com/2012/05/21/looking-for-apps-for-specific-version-of-splunk-weve-got-it-here/#comments</comments>
		<pubDate>Mon, 21 May 2012 23:25:58 +0000</pubDate>
		<dc:creator>Olexandr Prokhorenko</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Tips & Tricks]]></category>
		<category><![CDATA[Splunkbase]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7992</guid>
		<description><![CDATA[<p>As the number of apps Splunkbase hosts continues to grow (260+ as of now), we’ve noticed that it has become common for people to search for apps that are compatible with a specific version of Splunk.  Sure, you could check the &#8220;Splunk compatibility&#8221; field on an app’s Details page and see whether a given app was compatible with your version of Splunk, but there was no easy way to look at all compatible apps.  As a result, we’ve decided to make our search smarter—we’ve improved it to support the Splunk version.</p>
<p>To try it out, type the Splunk version that you are looking for in the Search input box for apps (for example, 4.3) and hit Enter or click <a&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>As the number of apps Splunkbase hosts continues to grow (260+ as of now), we’ve noticed that it has become common for people to search for apps that are compatible with a specific version of Splunk.  Sure, you could check the &#8220;Splunk compatibility&#8221; field on an app’s Details page and see whether a given app was compatible with your version of Splunk, but there was no easy way to look at all compatible apps.  As a result, we’ve decided to make our search smarter—we’ve improved it to support the Splunk version.</p>
<p>To try it out, type the Splunk version that you are looking for in the Search input box for apps (for example, 4.3) and hit Enter or click <a href="http://splunk-base.splunk.com/apps/search/?q=4.3">Search</a>.</p>
<p style="text-align: center"><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/Screen-shot-2012-05-21-at-12.01.21-PM.png"><img class="aligncenter size-full wp-image-7993" style="border: 1px solid #aaaaaa" src="http://blogs.splunk.com/wp-content/uploads/2012/05/Screen-shot-2012-05-21-at-12.01.21-PM.png" alt="" width="599" height="233" /></a></p>
<p>Splunkbase will understand what you are looking for and will provide you with a list of apps that are compatible with 4.3.  The results will appear on the right side of the screen, under the &#8220;4.3 compatible apps&#8221; title.</p>
<p style="text-align: center"><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/Screen-shot-2012-05-21-at-12.01.59-PM.png"><img class="aligncenter size-full wp-image-7994" style="border: 1px solid #aaaaaa" src="http://blogs.splunk.com/wp-content/uploads/2012/05/Screen-shot-2012-05-21-at-12.01.59-PM.png" alt="" width="581" height="215" /></a></p>
<p>Clicking on &#8220;<a href="http://splunk-base.splunk.com/apps/compatibility_version/4.3/">see all</a>&#8221; in the bottom-right corner will let you see the full list of apps that are compatible with Splunk version 4.3.</p>
<p>Try it out and <a href="mailto:op@splunk.com">let me know</a> how you like it.  As always, your feedback is greatly appreciated.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/21/looking-for-apps-for-specific-version-of-splunk-weve-got-it-here/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Splunk = Customer Satisfaction</title>
		<link>http://blogs.splunk.com/2012/05/18/splunk-customer-satisfaction/</link>
		<comments>http://blogs.splunk.com/2012/05/18/splunk-customer-satisfaction/#comments</comments>
		<pubDate>Sat, 19 May 2012 01:51:39 +0000</pubDate>
		<dc:creator>Shane Daniels</dc:creator>
				<category><![CDATA[Customers]]></category>
		<category><![CDATA[Where will your Data Take You?]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7964</guid>
		<description><![CDATA[<p>It is amazing to see the interesting ways that customers are using Splunk to enhance visibility into their business, and just how quickly they are able to respond to new requests.  I was recently with a customer who is indexing approximately 1 terabyte (TB) of data per day and is now fielding requests from all of their application teams.</p>
<p>A core part of their business is their billing system, which is comprised of many different solutions based on various acquisitions and spanning many geographies.  Their Service Oriented Architecture (SOA) is the glue that brings all of those systems together.  Any communication to the back end billing system must go through the SOA messaging layer.  For many years&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>It is amazing to see the interesting ways that customers are using Splunk to enhance visibility into their business, and just how quickly they are able to respond to new requests.  I was recently with a customer who is indexing approximately 1 terabyte (TB) of data per day and is now fielding requests from all of their application teams.</p>
<p>A core part of their business is their billing system, which is comprised of many different solutions based on various acquisitions and spanning many geographies.  Their Service Oriented Architecture (SOA) is the glue that brings all of those systems together.  Any communication to the back end billing system must go through the SOA messaging layer.  For many years the challenge was their lack of visibility into those messages, to understand how their core services were performing.  They had been evaluating other tools before Splunk caught their attention.</p>
<p>Splunk allows them to track transaction round-trip time within their SOA environment. They now have a baseline of the environment and are focusing on improvements.  They have real-time dashboard views and historical reports for the max, min and average transaction round-trip times based on geography, and now get notifications when service levels degrade, allowing them to act quickly.  Prior to Splunk, their notification mechanism was internal customers complaining of slow or dropped transactions. Their ability to quickly drill down to the root cause and identify transaction issues has increased customer satisfaction internally and externally, since the SOA environment also supports self-service actions on their customer-facing website.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/RealTimeView1.png"><img src="http://blogs.splunk.com/wp-content/uploads/2012/05/RealTimeView1.png" alt="" width="900" height="588" class="size-full wp-image-7970" /></a><br />
<em><strong>Real Time View &#8211; Transactions by Geographic Location</strong></em></p>
<p>Their Customer Service Reps (CSRs) use a call center application to access information such as billing information and current services used by a customer.  This application is critical because it allows them to fulfill customer orders, upsell on new services and complete billing transactions.  When a call comes in, their Interactive Voice Response (IVR) system passes information to the call center application and the customer details automatically pop up on the screen.  Historically they have had issues with the consistency of screen pops and were getting frequent errors, which was hurting internal adoption of the tool and more importantly driving up call times and affecting customer satisfaction.  CSRs were forced to ask for the information again or access another system. We’ve all been there. I just gave the automated system my information, why are you asking me for it again!  For the offshore CSRs, if this occurred they might even have to reroute the call to another call center, which is a higher cost for the company. As CSRs called in to complain, the only option the application team had was to search through large log files manually on individual servers to find errors.  </p>
<p>Splunk now provides them with detailed metrics by geography of these application issues before CSRs have to complain.  Once the application team realized how much better Splunk made their day-to-day work, they created a specific log file for Splunk to analyze, helping them significantly reduce issues with their call center application.  Even better, it took their Splunk admin less than a day to turn around the requested dashboard views once they provided the specific requirements.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/CSRScreenPop1.png"><img src="http://blogs.splunk.com/wp-content/uploads/2012/05/CSRScreenPop1.png" alt="" width="900" height="414" class="size-full wp-image-7971" /></a><br />
<em><strong>Successful CSR Screen Pops by Location</strong></em></p>
<p>My customer’s data is taking them to increased adoption and reliability of critical applications, complete visibility and understanding of their underlying SOA architecture and most importantly improving customer satisfaction.  Pretty Cool, Eh?  Up next they plan to look at some applications on Splunkbase to help in the areas of Web Intelligence and Microsoft Exchange.   </p>
<p>So the big question now is, where will your data take you? </p>
<p>Let’s find out at .conf2012. </p>
<p>Register today: http://www.splunk.com/goto/conf</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/18/splunk-customer-satisfaction/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Analytics Staffing for Big Data: A Perspective</title>
		<link>http://blogs.splunk.com/2012/05/16/analytics-staffing-for-big-data/</link>
		<comments>http://blogs.splunk.com/2012/05/16/analytics-staffing-for-big-data/#comments</comments>
		<pubDate>Wed, 16 May 2012 16:13:00 +0000</pubDate>
		<dc:creator>Rahul Deshmukh</dc:creator>
				<category><![CDATA[Customers]]></category>
		<category><![CDATA[Where will your Data Take You?]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data analyst]]></category>
		<category><![CDATA[data scientist]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[web intelligence]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7814</guid>
		<description><![CDATA[<p>A couple of weeks ago, we talked about the need to appropriately <a href="http://blogs.splunk.com/2012/05/03/i-invested-in-a-shinny-new-tooltechnology/" target="_blank">invest in people</a> when you invest in technology.  I wanted to continue the discussion and focus on the new area of &#8220;Big Data&#8221; &#8211; more specifically the analysts who work on big data &#8211; the &#8220;Data Scientist&#8221; and the data analyst.</p>
<p>I love the term &#8220;data scientist&#8221;.  It has finally made the data junkie&#8217;s job title more glamorous.  It has given both name and fame to the role.  Well, everyone is talking about “big data”.  Many organizations think hiring a data scientist is a requirement for solving all &#8220;big data&#8221; problems and that the only analysts required for a big data problem are data scientists. If you have invested in&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago, we talked about the need to appropriately <a href="http://blogs.splunk.com/2012/05/03/i-invested-in-a-shinny-new-tooltechnology/" target="_blank">invest in people</a> when you invest in technology.  I wanted to continue the discussion and focus on the new area of &#8220;Big Data&#8221; &#8211; more specifically the analysts who work on big data &#8211; the &#8220;Data Scientist&#8221; and the data analyst.</p>
<p>I love the term &#8220;data scientist&#8221;.  It has finally made the data junkie&#8217;s job title more glamorous.  It has given both name and fame to the role.  Well, everyone is talking about “big data”.  Many organizations think hiring a data scientist is a requirement for solving all &#8220;big data&#8221; problems and that the only analysts required for a big data problem are data scientists. If you have invested in big data (Hadoop, Splunk etc.), do you need a data scientist?  My goal for this post is to dive in a bit deeper and help you understand the question and make the right choices. Having been a web analytics practitioner for a number of years and having experienced the journey from being an analyst to managing analytics teams at companies like <a href="http://www.ebay.com" target="_blank">eBay</a>, I would like to share my experiences and hope you will benefit from this discussion.  This post aims to address all types of datasets &#8211; small, big, huge. I am going to focus on online business, as I have a better understanding of online than other areas where big data is used:</p>
<p>1)   Online platform companies:  Online platform companies thrive on great products.  These products mostly involve building compelling interfaces that are mostly enabled by data.  The &#8220;apps&#8221; or modules within the sites use mathematical models or algorithms to drive user engagement or stickiness for those modules/apps.</p>
<p>2)    Online channel business: eCommerce or content sites rely on a deep understanding of data to drive user engagement and product optimization &#8211; ultimately driving higher conversion on the site, user engagement and revenues from the online channel. Optimizing user acquisition and retention is also a very important goal for these organizations.</p>
<p>Successful organizations strive for the ability to embed data in their products and decision-making processes, and to drive optimization across their online properties.</p>
<p>Before we get started, let’s define a data scientist.  A simple explanation from <a href="http://www.linkedin.com/in/dpatil" target="_blank">DJ Patil</a> who co-invented the term:</p>
<p><em>&#8220;A data scientist is that unique blend of skills that can both unlock the insights of data and tell a fantastic story via the data.&#8221;</em></p>
<p>A somewhat more comprehensive definition comes from <a href="http://jakeporway.com/"><strong>Jake Porway</strong>, Data without Borders and the New York Times</a>:<em><br />
&#8220;A data scientist is a rare hybrid, a computer scientist with the programming abilities to build software to scrape, combine, and manage data from a variety of sources and a statistician who knows how to derive insights from the information within. S/he combines the skills to create new prototypes with the creativity and thoroughness to ask and answer the deepest questions about the data and what secrets it holds.&#8221;</em></p>
<p>Many of the conversations on social media sites and job descriptions lean towards an understanding that a data scientist is a good analyst, is not afraid to deal with data, brings new perspectives and combines analytics with statistics – building algorithms and data products.</p>
<p>So do you need a “data scientist” for every “big data” problem?  Not really.  The algorithm, data mining or advanced statistical modeling pieces represent 10-15% of all analytics needs within the organization.  There are many important kinds of analytics (product optimization, site testing, user experience optimization and measuring online channel performance) that most organizations need to focus on.  These types of analysis rarely require algorithm development or advanced statistical skills.  Mostly, data scientists work on futuristic products; data or web analysts work on current products, measuring the effectiveness of the site or user in real time and correlating it with various data sources to optimize the business.</p>
<p>From a skill set standpoint &#8211; data scientists need strong data skills, analysis skills, strong knowledge of statistics and the ability to program algorithms.  A data/business/web analyst, on the other hand, is not expected to have the programming skills to build algorithms, but needs strong SQL skills in addition to a good understanding of analytics packages.  Both of them need to be passionate about data and have a high level of curiosity &#8211; often questioning the data to derive new insights from the data&#8230;.nothing short of a Data Ninja!  Lastly, every analyst needs to be able to tell and sell their &#8220;story&#8221; from the insights.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/data_skills1.jpg"><img class="alignnone size-full wp-image-7951" src="http://blogs.splunk.com/wp-content/uploads/2012/05/data_skills1.jpg" alt="" width="448" height="344" /></a></p>
<p>A good approach to your “big data” analytics staffing plan is an 80/20 rule. Staff 80% of your resources as data/business/web analysts and 20% as data scientists.  You will also be able to create a career path for your best data analysts to become data scientists.  As an organization, the best bet is to provide tools and technology that will reduce the data movement, manipulation and data acquisition effort.  This will allow the data scientists to focus on the value-added analysis that can move the needle for the business.</p>
<p>I will leave you with a simple analysis of available jobs in the US in the big data or analytics space. Clearly jobs for “big data” and Hadoop dominate the space; Data Scientist roles are few (right now), but over time their number will increase. The chart below is based on available US jobs posted on LinkedIn.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/big_data_jobs.jpg"><img class="alignnone size-full wp-image-7816" src="http://blogs.splunk.com/wp-content/uploads/2012/05/big_data_jobs.jpg" alt="" width="612" height="295" /></a></p>
<p>I hope this post has provided some ideas on how to approach the human side of analytics for big data problems.  Did I mention that this and other interesting discussions will happen at <a href="http://www.splunk.com/view/SP-CAAAGEF" target="_blank">Splunk&#8217;s 2012 User Conference</a>?  Come and enjoy the data journey <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>PS:  A good in-depth read on building a data scientist team is <a title="O'Riley Radar" href="http://radar.oreilly.com/2011/09/building-data-science-teams.html" target="_blank">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/16/analytics-staffing-for-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dallas Splunk Users Group &#8211; June 12th @ 6:00p CST</title>
		<link>http://blogs.splunk.com/2012/05/13/dallas-splunk-users-group-june-12th-600p-cst/</link>
		<comments>http://blogs.splunk.com/2012/05/13/dallas-splunk-users-group-june-12th-600p-cst/#comments</comments>
		<pubDate>Mon, 14 May 2012 01:54:52 +0000</pubDate>
		<dc:creator>Maverick</dc:creator>
				<category><![CDATA[Customers]]></category>
		<category><![CDATA[SplunkNews]]></category>
		<category><![CDATA[Tips & Tricks]]></category>
		<category><![CDATA[dallas]]></category>
		<category><![CDATA[fort worth]]></category>
		<category><![CDATA[plano]]></category>
		<category><![CDATA[splunk]]></category>
		<category><![CDATA[Splunk User events]]></category>
		<category><![CDATA[Texas]]></category>
		<category><![CDATA[user groups]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7919</guid>
		<description><![CDATA[On the second Tuesday of each month, Splunkers in the Dallas / Fort Worth Metroplex area have been getting together on a regular basis to talk about all things Splunk. Seems the users are able to take advantage of spending just a couple hours with each other, trading notes about Splunk, helping each other solve problems with our Splunk deployments and configurations, and sharing a beer and pizza too.

BTW, we are 40 members and counting now!

Our next meeting will be held at the Splunk Office in Plano, Texas on Tuesday, June 12th @ 6:00p CST.]]></description>
			<content:encoded><![CDATA[<p>On the second Tuesday of each month, Splunkers in the Dallas / Fort Worth Metroplex area have been getting together on a regular basis to talk about all things Splunk. Seems the users are able to take advantage of spending just a couple hours with each other, trading notes about Splunk, helping each other solve problems with our Splunk deployments and configurations, and sharing a beer and pizza too.</p>
<p>BTW, we are 40 members and counting now!</p>
<p>Our next meeting will be held at the Splunk Office in Plano, Texas on Tuesday, June 12th @ 6:00p CST.</p>
<p>If you are interested in attending, please click the link below for details:</p>
<p align="center">
<a href="http://www.meetup.com/Splunk/Plano-TX/698002/">http://www.meetup.com/Splunk/Plano-TX</a>
</p>
<p>Our last meeting was May 8th and attendees shared some of their more interesting searches and reports as well as some of the not-so-well-known search commands they are using lately.</p>
<p>I look forward to hearing about your various war stories regarding Splunk. How you work through issues, figure things out, extend/expand your use and, more importantly, your thinking about Splunk. It&#8217;s quite an eye-opening experience for a veteran Splunker like myself to learn from you guys and I&#8217;m never short of amazed at the creativity that you demonstrate as you leverage Splunk for all kinds of IT problems, apply advanced analytics and correlations now in ways that are actually helpful for a change. </p>
<p>Also, Paul Sanford from our Seattle Splunk office will be in town and will join the meeting to listen in on the discussions. Perhaps we can ask him to show us some of the latest Splunk Dev projects he&#8217;s got going. </p>
<p>In any case, I&#8217;m happy that you want to get together now on a regular basis and I can&#8217;t wait until 6/12/12. See you there!</p>
<p>BTW, I created a Dallas Splunk Users Group Home and Notes page, which can be found here:</p>
<p><a href="http://wiki.splunk.com/SplunkDallasUsersGroup">Splunk Dallas Users Group Home</a><br />
<a href="http://wiki.splunk.com/Talk:SplunkDallasUsersGroup">Splunk Dallas Users Group Meeting Notes</a></p>
<p>I also created a Google Group, which can be found here:</p>
<p><a href="http://groups.google.com/group/splunkdallas">Dallas Splunkers Google Group</a></p>
<p>Sign up and come join us, if you want (dare)!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/13/dallas-splunk-users-group-june-12th-600p-cst/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>#SplunkGovt Twitter Chat: A Sneak Peek at What We&#8217;ll Explore at SplunkLIVE! Washington, D.C.</title>
		<link>http://blogs.splunk.com/2012/05/11/splunkgovt-twitter-chat-a-sneak-peak-at-what-well-explore-at-splunklive-washington-d-c/</link>
		<comments>http://blogs.splunk.com/2012/05/11/splunkgovt-twitter-chat-a-sneak-peak-at-what-well-explore-at-splunklive-washington-d-c/#comments</comments>
		<pubDate>Fri, 11 May 2012 17:53:28 +0000</pubDate>
		<dc:creator>Paul Wilke</dc:creator>
				<category><![CDATA[SplunkNews]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7908</guid>
		<description><![CDATA[<p>If the White House’s recent <a href="http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal">Big Data Research and Development Initiative</a> is any indication, big data is a big deal for government. However, collecting, analyzing and reacting to large amounts of machine-generated data can prove to be challenging for agencies.</p>
<p>Yesterday we teamed up with Bob Gourley from <a href="http://ctovision.com/tag/bob-gourley/">CTO Vision</a> to host a Twitter chat on how government can make sense of it all. From data analysis for operational intelligence to log management for cyber defense, we covered a number of ways agencies can make the most of their data. Here are a few key takeaways from the discussion:</p>
<ul>
<li><strong>Determine how to deal with the data explosion.</strong> One of the most significant barriers to harnessing big data</li></ul><p>&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>If the White House’s recent <a href="http://www.whitehouse.gov/blog/2012/03/29/big-data-big-deal">Big Data Research and Development Initiative</a> is any indication, big data is a big deal for government. However, collecting, analyzing and reacting to large amounts of machine-generated data can prove to be challenging for agencies.</p>
<p>Yesterday we teamed up with Bob Gourley from <a href="http://ctovision.com/tag/bob-gourley/">CTO Vision</a> to host a Twitter chat on how government can make sense of it all. From data analysis for operational intelligence to log management for cyber defense, we covered a number of ways agencies can make the most of their data. Here are a few key takeaways from the discussion:</p>
<ul>
<li><strong>Determine how to deal with the data explosion.</strong> One of the most significant barriers to harnessing big data in government is the challenge of keeping up with the growth of data and its increasing complexity. Federal IT managers need to automate big data management with the right analysis tools.</li>
</ul>
<ul>
<li><strong>Focus on the next cyber threat &#8211; don’t chase the last one.</strong> Analyzing big data provides agencies with the operational intelligence to proactively defend against cyber threats and meet stringent cyber-security compliance standards.</li>
</ul>
<ul>
<li><strong>Defend your ROI.</strong> In a sluggish economy, making the case to invest in big data technology can be a federal IT manager’s worst nightmare. The key is to prove the value of your investment with use cases.</li>
</ul>
<p>To hear more about big data for government, join us at <a href="http://live.splunk.com/forms/SL_WashingtonDC_May2012">SplunkLIVE! DC</a> on May 15. Don’t worry—if you’re not in DC, you can still participate during our live webcast <a href="https://event.on24.com/eventRegistration/EventLobbyServlet?target=registration.jsp&amp;eventid=460599&amp;sessionid=1&amp;key=245A2A68C419E9B34156D71FE35DC841&amp;sourcepage=register">here</a>.</p>
<p>Check out the discussion below. Looking forward to continuing the conversation!</p>
<p><a href="https://twitter.com/#!/search/%23SplunkGovt"><img class="alignnone size-full wp-image-7909" src="http://blogs.splunk.com/wp-content/uploads/2012/05/DC.png" alt="" width="508" height="386" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/11/splunkgovt-twitter-chat-a-sneak-peak-at-what-well-explore-at-splunklive-washington-d-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Doing More With What You Have</title>
		<link>http://blogs.splunk.com/2012/05/11/doing-more-with-what-you-have/</link>
		<comments>http://blogs.splunk.com/2012/05/11/doing-more-with-what-you-have/#comments</comments>
		<pubDate>Fri, 11 May 2012 17:23:06 +0000</pubDate>
		<dc:creator>Chris Bauer</dc:creator>
				<category><![CDATA[Tips & Tricks]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7830</guid>
		<description><![CDATA[<p>How many times have you been challenged by your management with the following adages?</p>
<p><em>“You have to do more with less.”</em></p>
<p><em>“Congratulations on staying under budget. We’re cutting your funding by 15% this year. You’re welcome.”</em></p>
<p><em>“Wow. This dashboard looks great! I want every VP in the company to have something like this. By tomorrow morning.”</em></p>
<p>Dilbert jokes aside, this happens every day to our customers. They invest the requisite time to learn Splunk, enthusiastically win over additional lines of business, and continually strive to innovate new and better methods of getting work done.</p>
<p>But most customers tend to hit a plateau of sorts with Splunk.</p>
<p>The fires are extinguished, automated alerts provide <em>some</em> proactive capabilities&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>How many times have you been challenged by your management with the following adages?</p>
<p><em>“You have to do more with less.”</em></p>
<p><em>“Congratulations on staying under budget. We’re cutting your funding by 15% this year. You’re welcome.”</em></p>
<p><em>“Wow. This dashboard looks great! I want every VP in the company to have something like this. By tomorrow morning.”</em></p>
<p>Dilbert jokes aside, this happens every day to our customers. They invest the requisite time to learn Splunk, enthusiastically win over additional lines of business, and continually strive to innovate new and better methods of getting work done.</p>
<p>But most customers tend to hit a plateau of sorts with Splunk.</p>
<p>The fires are extinguished, automated alerts provide <em>some</em> proactive capabilities and management is delighted with the superior visualizations and reports they receive. Your users are very satisfied with their ability to search effortlessly through terabytes of data for the needle in the haystack.</p>
<p>So what’s next? Are you ready for the next leap forward? What are the areas of greatest benefit to focus on?  What could you be doing differently/better to prepare for the next phase of evolution?</p>
<p>You need a <strong>Splunk Value Check</strong>.</p>
<p>The Value Check is a 1-2 day workshop designed to maximize the value you’re currently getting from your Splunk investment. It is a joint exercise between your organization and technical specialists from Splunk with several goals in mind. Benefits include:</p>
<ul>
<li>Ensuring your architecture is supportable, scalable, and upgradable</li>
<li>Identification of performance concerns and risks</li>
<li>Knowledge transfer of Best Practices</li>
<li>Optimization of your daily volume consumption</li>
<li>Benchmarking your relative Splunk maturity</li>
</ul>
<p>The process to get the ball rolling is straightforward.</p>
<p><strong>1.</strong> Contact your Splunk Sales Representative and express your interest in the program.</p>
<p><strong>2.</strong> You will receive a Value Check Assessment Form for recording details of your environment. Complete this template and identify relevant participants from your organization.</p>
<p><strong>3.</strong> Splunk will conduct the 1-2 day workshop. Through a series of interviews, Splunk will gather the information needed for its recommendations.</p>
<p><strong>4.</strong> When completed, Splunk will review the results with you and your team. Deliverables include:</p>
<ul>
<li>Splunk Maturity Model Scorecard</li>
<li>Environmental Summary</li>
<li>Data Source Summary</li>
<li>Use Cases Summary</li>
<li>Recommendations</li>
</ul>
<p>Armed with this information you will be much better prepared to take full advantage of your Splunk environment and provide even greater value to your organization as a whole.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/11/doing-more-with-what-you-have/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>That happened: episode 9</title>
		<link>http://blogs.splunk.com/2012/05/10/that-happened-episode-9/</link>
		<comments>http://blogs.splunk.com/2012/05/10/that-happened-episode-9/#comments</comments>
		<pubDate>Fri, 11 May 2012 00:36:49 +0000</pubDate>
		<dc:creator>rachel perkins</dc:creator>
				<category><![CDATA[Life at Splunk]]></category>
		<category><![CDATA[Tips & Tricks]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7889</guid>
		<description><![CDATA[<p>This week in &#8220;That happened: notes from #splunk&#8221;, a blog about the goings-on in the Splunk IRC channel: slow learners, how not to get dizzy when configuring props and transforms, bureaucracy in action, and <a href="http://i.imgur.com/esZ3e.jpg">Good Guy Splunk</a>:</p>
<h2>If you build it, they will (eventually) come</h2>
<p>(But you might have to disable their ssh access to the production hosts first):</p>
<p>&#60;<strong>mlanghor</strong>&#62; ahh, the joy in your co-worker coming by with advanced Splunk questions, &#8220;how can I use that <a href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Rex">rex</a> command you talked about a few weeks ago to extract something?&#8221;<br />
&#60;<strong>troj</strong>&#62; mlanghor: I don&#8217;t get those kind of questions <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /><br />
&#60;<strong>troj</strong>&#62; I get more of the &#8220;I want to see just regular old log&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>This week in &#8220;That happened: notes from #splunk&#8221;, a blog about the goings-on in the Splunk IRC channel: slow learners, how not to get dizzy when configuring props and transforms, bureaucracy in action, and <a href="http://i.imgur.com/esZ3e.jpg">Good Guy Splunk</a>:</p>
<h2>If you build it, they will (eventually) come</h2>
<p>(But you might have to disable their ssh access to the production hosts first):</p>
<p>&lt;<strong>mlanghor</strong>&gt; ahh, the joy in your co-worker coming by with advanced Splunk questions, &#8220;how can I use that <a href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Rex">rex</a> command you talked about a few weeks ago to extract something?&#8221;<br />
&lt;<strong>troj</strong>&gt; mlanghor: I don&#8217;t get those kind of questions <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /><br />
&lt;<strong>troj</strong>&gt; I get more of the &#8220;I want to see just regular old log files, so how do I do that?&#8221;<br />
&lt;<strong>mlanghor</strong>&gt; oh I still get those.  of &#8216;course I still struggle with the &#8220;I&#8217;ve got ssh access to the host, why would I use that?&#8221;<br />
&lt;<strong>mlanghor</strong>&gt; since management still hasn&#8217;t cracked down on user accounts<br />
&lt;<strong>troj</strong>&gt; In test and prod we have cracked down, so Splunk is all they get to see of their logs<br />
* <strong>troj </strong>cheers!<br />
&lt;<strong>troj</strong>&gt; They get over the PlainOldLogFiles attachment when they discover, as I have repeatedly stated to them, that they can search for stuff using Splunk<br />
&lt;<strong>troj</strong>&gt; And at that point I say nice things to them when I want to say mean things <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /><br />
&lt;<strong>mlanghor</strong>&gt; ahha</p>
<h2>It might not be pretty but it works</h2>
<p>New Support Splunker ^Brian^ explains how <a href="http://docs.splunk.com/Documentation/Splunk/latest/admin/Propsconf">props.conf</a> and <a href="http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf">transforms.conf</a> work together in the face of some mild heckling:</p>
<p>&lt;<strong>wrench_</strong>&gt; Can someone help me understand the relationship between props.conf and transforms.conf? I&#8217;m not sure what the difference is.<br />
&lt;^<strong>Brian</strong>^&gt; transforms defines things that modify results / events / extractions.  Props applies those transforms stanzas to <a href="http://docs.splunk.com/Splexicon:Source">sources </a>/ <a href="http://docs.splunk.com/Documentation/Splunk/latest/Data/Whysourcetypesmatter">sourcetypes</a><br />
&lt;<strong>wrench_</strong>&gt; So you define the source/sourcetype in props.conf and then reference it in a stanza inside transforms.conf to make modifications?<br />
&lt;^<strong>Brian</strong>^&gt; so, say you set up a transforms stanza.  Call it [my_awesome_stanza].<br />
&lt;<strong>wrench_</strong>&gt; k<br />
&lt;^<strong>Brian</strong>^&gt; and in that stanza, lets say you define some extractions for IIS log<br />
* <strong>puercomal </strong>finds multiple layers of redirection delightfully intuitive<br />
&lt;^<strong>Brian</strong>^&gt; in props.conf, you would set up a stanza like this:  [my_awesome_iis_sourcetype]<br />
&lt;^<strong>Brian</strong>^&gt; and under that you would apply the [my_awesome_stanza] by a line like this:  REPORT-myreport = my_awesome_stanza<br />
&lt;<strong>wrench_</strong>&gt; ^Brian^: ah gotcha &#8212; thanks for the example<br />
&lt;<strong>puercomal</strong>&gt; props &#8212; DO_THING-mything = thing_that_is_mine. transforms &#8212; thing_that_is_mine &#8220;code&#8221;&#8230; regular expressions, mainly, but could also be a lookup referral as in things_lookup.csv</p>
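<p>The indirection ^Brian^ and puercomal describe can be sketched in a few lines of Python. This is only an illustration of the relationship, not how Splunk is implemented: the two dictionaries stand in for transforms.conf and props.conf, the stanza names come from the chat above, and the regular expression, field names, and sample event are made up for the example.</p>

```python
import re

# Hypothetical stand-in for transforms.conf: stanza name -> what to extract.
TRANSFORMS = {
    "my_awesome_stanza": {
        "regex": re.compile(r"(GET|POST) (\S+) (\d{3})"),
        "fields": ["method", "uri", "status"],
    },
}

# Hypothetical stand-in for props.conf: sourcetype -> settings that point
# at transforms stanzas (REPORT-myreport = my_awesome_stanza).
PROPS = {
    "my_awesome_iis_sourcetype": {"REPORT-myreport": "my_awesome_stanza"},
}


def extract_fields(sourcetype, raw_event):
    """Apply every REPORT-* transform that props assigns to the sourcetype."""
    fields = {}
    for setting, stanza_name in PROPS.get(sourcetype, {}).items():
        if not setting.startswith("REPORT-"):
            continue
        transform = TRANSFORMS[stanza_name]
        match = transform["regex"].search(raw_event)
        if match:
            fields.update(zip(transform["fields"], match.groups()))
    return fields


# Made-up sample event, loosely shaped like a web server log line.
event = "2012-05-10 12:00:01 GET /default.htm 200"
print(extract_fields("my_awesome_iis_sourcetype", event))
```

<p>The point of the split is that the extraction is defined once, by name, and any number of sources or sourcetypes can then refer to it.</p>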
<h2>Don&#8217;t forget</h2>
<p>Hassling your boss makes the world go &#8217;round (check that .conf link for ways to justify a trip to <a href="http://www.splunk.com/view/reasons-to-attend/SP-CAAAFHP">Splunk&#8217;s Worldwide User Conference</a>):</p>
<p>* <strong>troj </strong>makes progress on <a href="http://www.splunk.com/view/reasons-to-attend/SP-CAAAFHP">.conf</a> request<br />
&lt;<strong>troj</strong>&gt; Supervisor says OK, 665 layers of bureaucracy to go! <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<h2>Splunk sees if you&#8217;ve been bad or good</h2>
<p>But your coworkers don&#8217;t have to:</p>
<p>&lt;<strong>Nerf</strong>&gt; Sooo, if I see &#8220;Sending email&#8221; in python.log does that mean that it was successfully sent?  I just want to make sure there weren&#8217;t any local errors before I start bugging the email admins<br />
&lt;<strong>Nerf</strong>&gt; NEVERMIND! NOTHING TO SEE HERE!  IT CERTAINLY WASN&#8217;T A FAT-FINGERD EMAIL ADDRESS!<br />
&lt;<strong>ftk</strong>&gt; haha<br />
&lt;^<strong>Brian</strong>^&gt; Nerf: <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
&lt;^<strong>Brian</strong>^&gt; Nerf: i had that issue earlier<br />
&lt;<strong>Nerf</strong>&gt; On the plus side I was able to snoop the logs via Splunk without bothering the email admins <img src='http://blogs.splunk.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /><br />
&lt;^<strong>Brian</strong>^&gt; Nerf: i set up my new indexer, was trying to get it to register as a slave of the license master.<br />
&lt;^<strong>Brian</strong>^&gt; It kept failing and I&#8217;m like wtf..i fire off an email to our network admins saying I need these ports opened between our Springfield and Wilmington data centers<br />
&lt;^<strong>Brian</strong>^&gt; they said it&#8217;s already done..so i&#8217;m looking at what I&#8217;m typing, can&#8217;t see anthing wrong..then it dawned on me..i wasn&#8217;t pointing to the license master<br />
&lt;<strong>Nerf</strong>&gt; Yeah, I was bringing up a new indexer and at one point was trying to figure out why I couldn&#8217;t reach it.  I had switched ports 8089 and 9997 and who need to get there</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/10/that-happened-episode-9/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quantifying the Benefits of Splunk with SSDs</title>
		<link>http://blogs.splunk.com/2012/05/10/quantifying-the-benefits-of-splunk-with-ssds/</link>
		<comments>http://blogs.splunk.com/2012/05/10/quantifying-the-benefits-of-splunk-with-ssds/#comments</comments>
		<pubDate>Thu, 10 May 2012 14:19:29 +0000</pubDate>
		<dc:creator>Patrick Ogdin</dc:creator>
				<category><![CDATA[Tips & Tricks]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7820</guid>
		<description><![CDATA[<p>We’ve had the question posed to us several times over the years:  “What impact would the addition of an SSD have to my Splunk environment?”  Referencing Splunk Answers:</p>
<p><a href="http://splunk-base.splunk.com/answers/10417/splunk-on-solid-state-disk">http://splunk-base.splunk.com/answers/10417/splunk-on-solid-state-disk</a></p>
<p>Raitz is dead-on in his reply.  As data flows into a Splunk indexer, we are write-I/O heavy.  Sequential write performance on SSD and SAS is pretty similar, so there is no real benefit to running Splunk on an SSD here.  These benchmarks illustrate this.</p>
<p><a href="http://www.tomshardware.com/reviews/sas-6gb-raid-controller,3028-16.html">RAID0 w/SSD</a></p>
<p><a href="http://www.tomshardware.com/reviews/sas-6gb-raid-controller,3028-14.html">RAID0 w/SAS</a></p>
<p>(These are RAID controller benchmarks but they still demonstrate the point)</p>
<p>Since a Splunk indexing server pulls dual duty and responds to search requests as well as performs indexing, what is the impact of an SSD on search performance?  Splunk searches can be categorized in two&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>We’ve had the question posed to us several times over the years:  “What impact would the addition of an SSD have to my Splunk environment?”  Referencing Splunk Answers:</p>
<p><a href="http://splunk-base.splunk.com/answers/10417/splunk-on-solid-state-disk">http://splunk-base.splunk.com/answers/10417/splunk-on-solid-state-disk</a></p>
<p>Raitz is dead-on in his reply.  As data flows into a Splunk indexer, we are write-I/O heavy.  Sequential write performance on SSD and SAS is pretty similar, so there is no real benefit to running Splunk on an SSD here.  These benchmarks illustrate this.</p>
<p><a href="http://www.tomshardware.com/reviews/sas-6gb-raid-controller,3028-16.html">RAID0 w/SSD</a></p>
<p><a href="http://www.tomshardware.com/reviews/sas-6gb-raid-controller,3028-14.html">RAID0 w/SAS</a></p>
<p>(These are RAID controller benchmarks but they still demonstrate the point)</p>
<p>Since a Splunk indexing server pulls dual duty and responds to search requests as well as performs indexing, what is the impact of an SSD on search performance?  Splunk searches can be categorized in two ways, sparse and dense.  Dense reporting searches may request the average response time of a particular application over the last 24 hours for example.  Sparse searches are the “needle in a haystack” searches. A sparse search may look something like &#8220;find me this user ID in all of my data over the last year&#8221;.  For dense searching, Splunk’s I/O footprint can be characterized as a lot of sequential reads.  Referring to our benchmarks above, sequential reads on SSD are also about the same as on the SAS drives.  For sparse searching, the Splunk I/O behavior is full of random seeks.  This is where Splunk shines on SSD.</p>
<p>&nbsp;<br />
<strong>Hardware</strong></p>
<p>Three machines were used for this benchmark, giving four I/O setups (the SSD setup is the 7200 machine with a PCIe SSD).  We’ve classified them by their disk speed.  CPU and memory were not identical.</p>
<p>7200 – 2&#215;4 2.40GHz, 16GB, 12x2TB 7200 RPM SATA RAID 10<br />
10k – 2&#215;6 2.667GHz, 48GB, 4x900GB 10K RPM SAS RAID 10<br />
15k – 2&#215;6 2.667GHz, 12GB, 6x146GB 15K RPM SAS RAID 10<br />
SSD – 2&#215;4 2.40GHz, 16GB, 1x240GB (same as 7200 w PCIe SSD)</p>
<p>&nbsp;<br />
<strong>Load Generation</strong></p>
<p>We’re using a script that runs searches against the Splunk instances above for a 5-minute period.  Each search looks for a randomly generated user ID between 1 and 1 million.  We can control the number of searches executing concurrently and have tested at increasing concurrency from 1 to 32.  In a real-world Splunk setup, the workload with 1 concurrent search would look like an individual submitting 1 search at a time, waiting for the results, and then submitting another search.  A test with 32 concurrent searches would look like 32 Splunk users each submitting 1 search at the same time, each waiting for a result, then each submitting another search.</p>
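<p>The original script is not shown here, but a closed-loop load generator of the kind described can be sketched roughly as follows. Everything in it is an assumption made for illustration: <code>run_search</code> is a placeholder that merely sleeps instead of talking to Splunk, and the durations are fractions of a second rather than the 5 minutes used in the benchmark.</p>

```python
import random
import threading
import time


def run_search(user_id):
    """Placeholder for submitting one search and waiting for its results."""
    time.sleep(0.01)  # pretend the search took 10 ms


def generate_load(concurrency, duration_seconds, search=run_search):
    """Run `concurrency` closed-loop workers; return the completed search count."""
    deadline = time.monotonic() + duration_seconds
    completed = [0] * concurrency  # one counter per worker, so no lock is needed

    def worker(slot):
        # Each simulated user submits a search, waits for it, then submits another.
        while deadline > time.monotonic():
            search(random.randint(1, 1_000_000))
            completed[slot] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(concurrency)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(completed)


# Short demonstration run; the benchmark itself stepped concurrency from 1 to 32.
for concurrency in (1, 4):
    total = generate_load(concurrency, duration_seconds=0.2)
    print(concurrency, "concurrent:", total, "searches completed")
```

<p>Because every worker waits for its own search to finish before issuing the next one, the number of searches in flight never exceeds the chosen concurrency, which matches the "submit, wait, submit again" behavior described above.</p>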
<p>&nbsp;<br />
<strong>Results</strong></p>
<p>The chart below represents how many distinct searches were able to complete in a 1-minute time frame for each of these I/O setups.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/ssd_results1.png"><img class="alignleft size-full wp-image-7827" src="http://blogs.splunk.com/wp-content/uploads/2012/05/ssd_results1.png" alt="" width="770" height="593" /></a></p>
<p>&nbsp;<br />
So, for example, with 1 concurrent user, the 7200 I/O setup was able to execute 9 searches in a 1-minute span for an average search execution time of around 6.5 seconds.  This is not bad at all, and it is helped along by bloom filters, a feature we released in Splunk 4.3 that reduces the time searches spend looking for rare terms: </p>
<p><a href="http://docs.splunk.com/Documentation/Splunk/latest/Admin/Bloomfilters">http://docs.splunk.com/Documentation/Splunk/latest/Admin/Bloomfilters</a></p>
<p>But holy crap, look at the SSD results!  At 32 concurrent searches we are able to complete almost 2000 searches per minute.  This is a manifestation of SSDs having superior random read performance over a traditional hard disk drive.</p>
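<p>To relate searches per minute back to time per search, a rough closed-loop estimate is enough: if each simulated user always has exactly one search in flight, the average search time is the concurrency multiplied by 60 and divided by the searches completed per minute. The sketch below applies that to the two figures quoted in the text, both of which are rounded readings of the chart, so the results are approximate.</p>

```python
def average_search_seconds(concurrency, searches_per_minute):
    """Closed-loop estimate: every user always has exactly one search in flight."""
    return concurrency * 60.0 / searches_per_minute


# 7200 RPM setup with 1 concurrent user, about 9 searches per minute.
print(round(average_search_seconds(1, 9), 1))
# SSD setup with 32 concurrent users, taking "almost 2000" as 2000.
print(round(average_search_seconds(32, 2000), 2))
```

<p>With these rounded inputs, the single-user figure lands in the same range as the roughly 6.5 seconds quoted above, while the SSD at 32 concurrent searches works out to about a second per search.</p>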
<p>&nbsp;<br />
<strong>Conclusion</strong></p>
<p>As the $/GB of SSDs continues to improve versus traditional hard disk drives, it makes sense to evaluate them for Splunk environments where you might reap an order-of-magnitude or greater return on search throughput.  In fact, you could even make the argument that since other workloads are nearly at parity and sparse searches in Splunk have such huge upside on SSD, you should consider putting your hot and warm Splunk indices on SSD with cold perhaps on spinning media.  I’m not saying that there aren’t other factors you should weigh when deploying enterprise SSDs, but with performance like this, it should definitely be on your radar.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/10/quantifying-the-benefits-of-splunk-with-ssds/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Identifying Phishing Sites in Your Events</title>
		<link>http://blogs.splunk.com/2012/05/07/identifying-phishing-sites-in-your-events/</link>
		<comments>http://blogs.splunk.com/2012/05/07/identifying-phishing-sites-in-your-events/#comments</comments>
		<pubDate>Mon, 07 May 2012 22:08:16 +0000</pubDate>
		<dc:creator>Nimish Doshi</dc:creator>
				<category><![CDATA[Security]]></category>
		<category><![CDATA[Tips & Tricks]]></category>
		<category><![CDATA[phishing]]></category>
		<category><![CDATA[phishtank]]></category>
		<category><![CDATA[scripted input]]></category>

		<guid isPermaLink="false">http://blogs.splunk.com/?p=7806</guid>
		<description><![CDATA[<p>Recently, I thought I was caught in a phishing scheme: I created an account on an e-commerce site to check out, and as soon as I clicked the checkout button, it asked me to log on to a well-known site. It turned out that the original site was badly implemented; it should have told users that it is affiliated with the other site. Nevertheless, I went to <a href="http://www.phishtank.com/">Phishtank</a> to make sure that no one had complained about the original e-commerce site.</p>
<p>This got me thinking that since phishing occurs all too often, there must be a way for corporations to verify that their users are not going to phishing sites and if they are to know&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Recently, I thought I was caught in a phishing scheme: I created an account on an e-commerce site to check out, and as soon as I clicked the checkout button, it asked me to log on to a well-known site. It turned out that the original site was badly implemented; it should have told users that it is affiliated with the other site. Nevertheless, I went to <a href="http://www.phishtank.com/">Phishtank</a> to make sure that no one had complained about the original e-commerce site.</p>
<p>This got me thinking that since phishing occurs all too often, there must be a way for corporations to verify that their users are not going to phishing sites and, if they are, to find out through alerts when it happens. What I ended up doing was building a simple app, called <a href="http://splunk-base.splunk.com/apps/47440/phishing-lookup">Phishing Lookup</a>, available on <a href="http://splunk-base.splunk.com/apps/">Splunkbase</a>, that can be used to automate this exercise using the data from Phishtank.</p>
<p>Once a day (or once an hour, if so configured), the app downloads the latest list of verified phishing sites as a CSV file through Splunk&#8217;s scripted input. I provide two ways to do the correlation to see if your events contain any web addresses that are known phishing sites. First, I provide a simple form search dashboard where you input one of your event sourcetype names, the field in your sourcetype that represents a URL, and a time range. After the search returns, if you get no results, that&#8217;s a good thing. If you do get results, you may want to investigate why your applications or browsers have been surfing known phishing sites.</p>
<p><a href="http://blogs.splunk.com/wp-content/uploads/2012/05/phishing_lookup.jpg"><img src="http://blogs.splunk.com/wp-content/uploads/2012/05/phishing_lookup.jpg" alt="" width="942" height="720" class="aligncenter size-full wp-image-7807" /></a></p>
<p>The other way to use this is to set up a Splunk alert by calling the included macro phishing(sourcetype name, name of URL field) on a schedule. If the number of events returned is greater than zero, the alert action should be executed. This automates the check so that you do not have to run it manually from the dashboard.</p>
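<p>Outside of Splunk, the correlation described above boils down to a set lookup, which the following Python sketch illustrates. It is not the app&#8217;s code: the column name <code>url</code>, the field name <code>cs_url</code>, and the sample feed and events are all assumptions made up for the example, and the match is a plain exact-string comparison.</p>

```python
import csv
import io


def load_phishing_urls(csv_text, url_column="url"):
    """Return the set of URLs listed in a phishing feed supplied as CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[url_column].strip() for row in reader if row.get(url_column)}


def find_phishing_events(events, phishing_urls, url_field):
    """Return the events whose URL field exactly matches a listed URL."""
    return [event for event in events if event.get(url_field) in phishing_urls]


# Made-up feed and proxy events; a real feed would be downloaded on a schedule.
feed = "phish_id,url\n1,http://bad.example/login\n2,http://evil.example/pay\n"
events = [
    {"user": "alice", "cs_url": "http://www.example.com/"},
    {"user": "bob", "cs_url": "http://bad.example/login"},
]

hits = find_phishing_events(events, load_phishing_urls(feed), "cs_url")
if len(hits) > 0:
    # This mirrors the alert condition: act only when something matched.
    print("ALERT:", hits)
```

<p>An empty result is the good outcome here, just as with the dashboard search.</p>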
<h3>Real World Usage</h3>
<p>This by itself sounds theoretical, so how would you use it in the real world? One data source that comes to mind is your proxy logs, as they have definite evidence that your user or application attempted to contact a site. Even if you have network software in place to block the eventual connection, it would be worth knowing that the attempt was made. If you are using Bluecoat proxy logs, there is <a href="http://splunk-base.splunk.com/apps/22335/splunk-for-bluecoat">already an app to report on Bluecoat events</a>, which you could then correlate with phishing data, but the correlation with any set of proxy events should be possible with my simple phishing lookup app.</p>
<p>We should not stop there, as many phishing attacks originate with email and often have patterns in subjects that make identifying them a little easier. If you use Exchange, you could install the <a href="http://splunk-base.splunk.com/apps/28976/splunk-app-for-microsoft-exchange">Exchange App</a> on Splunkbase to monitor these devious subjects. Also, mail that contains only one-line links and no subject may be suspicious.</p>
<p>Often the goal of a phishing attack is to make you log into some site that you think is legitimate to steal credentials and other forms of identity. Some attacks may have a different purpose, where simply clicking on the link in an email or a web site may initiate the installation of malware, which may go unnoticed for a long time. In this situation, not only would anti-virus software, anti-virus logs, and endpoint protection be valuable, but an inventory of installed desktop apps may also help in an investigation of unapproved software. For instance, on Splunkbase, the <a href="http://splunk-base.splunk.com/apps/47372/splunk-app-for-citrix-xendesktop">Splunk App for Citrix Xen Desktop</a> could be used to take an inventory of all virtual and physical desktops to see where else suspicious malware may be installed.</p>
<p>Finally, if you have been using Splunk for some time with these various sources, you may want to use all your apps along with their event data to see if the same phishing attack occurred months ago, using the same investigative approaches of looking at proxy events, web access logs, email subjects, and desktop inventories. This would help identify the Advanced Persistent Threat, something which may not be possible with traditional SIEM vendors that do not store events for as long as you need them for forensic search and alerts. In summary, I hope my simple app to correlate phishing sites with your data and the points in this article are useful in maintaining your network&#8217;s security.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.splunk.com/2012/05/07/identifying-phishing-sites-in-your-events/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

