From big data to a 360 degree customer view with Hunk and Hortonworks

You can’t really escape the fact that we’re in the age of the customer. From CRM to the “long tail” to multi-channel to social media brand sentiment to Net Promoter Scores – it is all about customer experience. Big Data has an important part to play – no great revelation there – but how do you actually do it? An awful lot of questions come up when it comes to Big Data and the customer view:

What should my architecture be? How do I put together the right data strategy for the short and long term? How do I get the value from the data? How do I build customer analytics on top of my data? How do I …

» Continue reading

Get Value Out of Your Data in Hadoop, Starting Today

For years we’ve been working with thousands of companies using Splunk for big data solutions that range from security to business analytics and everything in between. The best part is our customers often discover exciting ways to use Splunk and teach us what the product can really do. As you can imagine, all of the customer conversations, product implementations and ROI stories have given Splunk a treasure trove of experience with big data and big data solutions.

So when our customers let us know that getting large amounts of data into Hadoop is straightforward, but getting analytics out is the challenge, we knew there had to be a better way. Customers asked us to make it faster and easier for …

» Continue reading

Hunk Setup using Hortonworks Hadoop Sandbox

Hortonworks Sandbox is a personal, portable Hadoop environment that comes with a dozen interactive Hadoop examples. Recently, Hortonworks and Splunk released a tutorial and video showing how to install and connect Hunk with the Hortonworks Hadoop Sandbox version 1.3.

This blog summarizes the configurations used as part of the Hunk setup.

Configurations for Hadoop Provider:

Java Home: /usr/jdk/jdk1.6.0_31
Hadoop Home: /usr/lib/hadoop
Hadoop Version: Hadoop 1.x (MR1)
Job Tracker: sandbox:50300
File System: hdfs://sandbox:8020
Splunk search recordreader: com.splunk.mr.input.SimpleCSVRecordReader, com.splunk.mr.input.ValueAvroRecordReader

Configurations for Hadoop Virtual Indexes:

Name: hadoop_sports
Path to data in HDFS: /user/hue/raanan/…
Whitelist: \.csv$
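
These provider and virtual index settings ultimately live in Hunk’s indexes.conf. Purely as a sketch built from the values above – the provider stanza name hortonworks-sandbox is made up for illustration, and the exact vix.* key names should be double-checked against the Hunk documentation for your version:

[provider:hortonworks-sandbox]
vix.family = hadoop
# Environment for the Hadoop client libraries (from the table above)
vix.env.JAVA_HOME = /usr/jdk/jdk1.6.0_31
vix.env.HADOOP_HOME = /usr/lib/hadoop
# Cluster endpoints for the Sandbox (Hadoop 1.x / MR1)
vix.fs.default.name = hdfs://sandbox:8020
vix.mapred.job.tracker = sandbox:50300
# Record readers Hunk uses when searching the data
vix.splunk.search.recordreader = com.splunk.mr.input.SimpleCSVRecordReader,com.splunk.mr.input.ValueAvroRecordReader

[hadoop_sports]
vix.provider = hortonworks-sandbox
# Where the data lives in HDFS, restricted to CSV files
vix.input.1.path = /user/hue/raanan/…
vix.input.1.accept = \.csv$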

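Once the provider and virtual index are configured, the virtual index is searchable with ordinary SPL, just like a native Splunk index. A minimal sketch (the split by source is only an example):

index=hadoop_sports | stats count by source
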
For more Hunk details and examples, go to the blog:

http://blogs.splunk.com/2013/11/08/hunk-intro-part-3/

Enjoy.


» Continue reading

Big data and financial services – an EMEA perspective

I was lucky enough to attend the first day of the “Big Data in Financial Services” event in London a few days ago. I know some people might not think of that as lucky, but I say it on the back of a surprisingly varied agenda, entertaining speakers and a lot of good debate and content on what big data means to FS companies and how they are using it.

The key point I took away was that, right now, FS companies are using big data to focus on operational issues – risk, efficiency, compliance, security and making better decisions. However, there is a growing trend of FS companies looking at how big data is going …

» Continue reading

Further Simplifying Big Data Analytics

In the past we’ve talked about simplifying big data analytics and the 80:20 rule for data analysis. Most organizations spend 80% of analytics efforts running and optimizing the business and 20% on advanced analytics, which includes advanced data mining, algorithm development and advanced predictive modeling.

Hadoop has seen very good adoption for big data analytics, specifically batch analytics over large datasets, and many organizations have initiatives to use it for advanced analytics and optimizing the business. Unfortunately, those organizations are struggling to derive value from their Hadoop implementations. They’re finding that analysis takes too long and requires specialized talent. Another issue is that getting data into Hadoop is difficult, and getting meaningful analysis out of it is even more challenging.

In the past few months, …

» Continue reading

Splunk Hadoop Connect 1.1 – Opening the door to MapR; now available on all Hadoop distributions

I am happy to announce that Splunk Hadoop Connect 1.1 is now available. This version of Hadoop Connect rounds out Splunk’s integration with the Hadoop distributions by becoming certified on MapR; Cloudera, Hortonworks, and Apache Hadoop users can likewise benefit from the power of Splunk.

Splunk Hadoop Connect provides bi-directional integration to easily and reliably move data between Splunk and Hadoop. It gives Hadoop users real-time analysis, visualization and role-based access control for streams of machine-generated data. It delivers three core capabilities: exporting data from Splunk to Hadoop, exploring Hadoop directories, and importing data from Hadoop to Splunk.
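
Hadoop Connect itself is configured and scheduled through Splunk Web, so there is no script to show for the app; purely as an illustration of the export flow it automates, here is a rough equivalent using the stock splunk and hadoop command-line tools (the index name and HDFS paths are hypothetical):

# Export yesterday's events from a Splunk index to CSV via the Splunk CLI
splunk search 'index=web_logs earliest=-1d@d latest=@d' -output csv -maxout 0 > web_logs.csv
# Land the file in HDFS for downstream batch processing
hadoop fs -mkdir -p /data/splunk/web_logs
hadoop fs -put web_logs.csv /data/splunk/web_logs/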

The most significant new feature added to version 1.1 is the …

» Continue reading

Shuttl – A New Year, a New Release

Data is the lifeblood of the modern business. Managing the flow of data, however, is as important as the data itself. That is why Shuttl was created. Through Shuttl, users can move (nay, shuttl!) buckets of data from Splunk to other systems and back again. This has proved immensely useful as people realize how data can be used and reused to drive business value. Happy New Year 2013!

The Elves have been busy at work bringing Shuttl users a bunch of goodies in the form of the new 0.7.2 release. Christmas came early when the code landed in master on GitHub 6 days before Santa’s big night, and now it’s available for download on Splunkbase!

Since Shuttl’s release last year, …

» Continue reading

Hadoop and Splunk Use Cases

Customer Examples – Using both Splunk and Hadoop

The Splunk and Hadoop communities can benefit from each other’s strengths. Below are several examples of customers that use both environments.

1 – Splunk then Hadoop: Splunk collects, visualizes, and analyzes the data, then passes it to Hadoop for ETL and other batch processing.
2 – Hadoop then Splunk: Hadoop collects the data, then passes the results to Splunk for visualization (sketched below).
3 – Data flows in both directions: Splunk and Hadoop collect different artifacts and share them – Hadoop gets the data it needs for ETL or batch analytics, and Splunk gets what it needs for real-time analysis and visualization.
4 – Side-by-Side: Both Splunk and Hadoop are used by the organization, but are used …
» Continue reading
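
As a minimal sketch of use case 2 above – Hadoop produces batch results and Splunk visualizes them – here is the flow using the stock hadoop and splunk command-line tools (the paths, index and sourcetype are hypothetical, and Splunk Hadoop Connect can automate this import):

# Pull a batch result set out of HDFS
hadoop fs -get /data/output/daily_summary.csv .
# Index it in Splunk for dashboards and ad hoc search
splunk add oneshot daily_summary.csv -index hadoop_results -sourcetype csv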

Simplifying Big Data Analytics

Most analytics and data teams have started thinking about investing in big data initiatives. With so much buzz about big data, organizations have started investing, or are thinking of investing, in Hadoop. While it is great to stay on top of trends, it often ends up being another investment whose full benefit and potential is simply not realized. The learning curve is too steep and the time to implement too high. Current analytics resources lack the strong programming skills required to conduct even simple analysis tasks and activities using Hadoop. In this post, I would like to focus on providing a better understanding of what types of analysis are better suited for Hadoop vs. non-Hadoop technologies in order to simplify …

» Continue reading

Building your big data reference architecture

With all of the value now being placed on data – and on the ability to use that data to improve customer experience, optimize revenue and enable business growth – the ability to ingest and store the data is critical. While there is a lot of advertising and press about many solutions’ ability to address any need of the enterprise, where does a CXO turn to figure it all out? In the past 2+ years I have evaluated solutions in the “big data” space to address all of the problems the IT and business users threw at me. In all of the evaluation, testing and validation of products, I found that there is no single solution now or …

» Continue reading