Splunk at Yahoo!: Big Data at Scale
Big Data is a term that’s thrown around a lot by vendors, thought leaders and the press—so much so that it’s nearly lost all meaning. In fact, most people skip “big” and immediately discuss how it’s about more than just the amount of data (and it is). That said, we should take a moment to recognize what true big data scale means.
Today we announced that Yahoo is using Hunk to analyze 600 petabytes (yes, that’s a “p”) of data in Hadoop and is analyzing over 150 terabytes per day with Splunk Enterprise. That’s real scale, and Yahoo is using the Splunk platform to get there. But while the amount is interesting, what’s really compelling is how the company is using the data.
With Hunk, the company is tracking and improving the performance and stability of its grid system, and monitoring the system metrics of all of its clusters. Yahoo uses analytics on Hadoop to visually browse complex tables, meet SLAs and gain insights into historical resources. By using Hunk, the company is saving millions of dollars per year in hardware provisioning alone.
Yahoo has also deployed Splunk Enterprise as its platform for machine data. Teams across IT operations, infrastructure, products and security are using Splunk Enterprise to maximize revenue by understanding customer preferences, advertising and marketing campaign popularity, and click-through rates, while also addressing IT workflow issues.
By implementing Hunk and Splunk Enterprise, Yahoo is pushing the boundaries of data analytics on the “big” side of big data, but in a way that’s also addressing core needs of the business. That’s the full power of big data—and the power of the Splunk platform.