Splunk your Google Analytics

Gain more insight into site performance and user activity by correlating Google Analytics data within Splunk.

A customer of mine recently wanted to understand more about the journey that retail consumers take when they arrive at its website. The customer recognized that consumers who have previously bought from the site are more familiar with its design and layout than those visiting for the first time. In addition, consumers who go directly to the site are likely to have greater brand engagement than those referred from an affiliate site.

What we needed was a way to back up the data that gets submitted to Google Analytics by also sending it to the local Apache web server logs, and from there into Splunk.

By making the following change to the client-side Google Analytics JavaScript code block already implemented on their site, we were able to start sending the Google Analytics payload back to the local web server.



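// Standard analytics.js loader
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-XXXXX-YY', 'auto');

// START local backup of GA data request for Splunk
ga(function(tracker) {
  // Keep a reference to the default task so hits still reach Google Analytics
  var originalSendHitTask = tracker.get('sendHitTask');
  tracker.set('sendHitTask', function(model) {
    var payLoad = model.get('hitPayload');
    // Send the hit to Google Analytics as usual
    originalSendHitTask(model);
    var gifRequest = new XMLHttpRequest();
    // Send __ua.gif to the local server
    var gifPath = "/__ua.gif";
    gifRequest.open('get', gifPath + '?' + payLoad, true);
    gifRequest.send();
  });
});
// END local backup of GA data request for Splunk

ga('send', 'pageview');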


The code snippet sends an XMLHttpRequest, with the payload in the query string, for a 1×1 pixel .gif file uploaded to the local web server. The .gif file acts only as an endpoint to receive the requests so that they get logged locally.

This method captures all of the GA tracking information configured on a site, plus additional client-side information that is unavailable to standard server-side web logs, such as Screen Resolution, Viewable Screen Size, Screen Colour Depth and User Language.
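To make the logged data more concrete, here is a minimal sketch of pulling those client-side fields out of one access-log line. It assumes the Apache "combined" log format and the Measurement Protocol parameter names used by analytics.js (cid, t, sr, vp, sd, ul). The helper name parseGaLogLine and the sample line are made up for illustration; in practice Splunk would extract these fields itself at search time.

```javascript
// Hypothetical helper: pull the GA payload out of one Apache access-log line.
// Assumes the request is logged as "GET /__ua.gif?<payload> HTTP/1.1".
function parseGaLogLine(line) {
  const match = line.match(/"GET \/__ua\.gif\?(\S+) HTTP\/[\d.]+"/);
  if (!match) return null; // not a GA backup request
  const params = new URLSearchParams(match[1]);
  return {
    clientId: params.get('cid'),         // GA Client ID
    hitType: params.get('t'),            // pageview, event, ...
    screenResolution: params.get('sr'),  // Screen Resolution
    viewportSize: params.get('vp'),      // Viewable Screen Size
    screenColors: params.get('sd'),      // Screen Colour Depth
    userLanguage: params.get('ul'),      // User Language
  };
}

// Made-up sample line, for illustration only.
const sampleLine = '203.0.113.7 - - [10/Oct/2016:13:55:36 +0000] ' +
  '"GET /__ua.gif?v=1&t=pageview&cid=1234567890.1476107736' +
  '&sr=1920x1080&vp=1280x720&sd=24-bit&ul=en-gb HTTP/1.1" ' +
  '200 43 "https://www.example.com/" "Mozilla/5.0"';

const parsedHit = parseGaLogLine(sampleLine);
console.log(parsedHit);
```

Lines that are not requests for /__ua.gif return null, so ordinary page and asset requests in the same log are ignored.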

Leveraging the Client ID generated by the Google Analytics library also makes it possible to identify users even before they log in to a site, easily providing previously unknown information about user behavior.
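As a rough illustration of why the Client ID is useful, the sketch below groups hits by their cid value to rebuild each anonymous visitor's journey. The records are invented, and the function name journeysByClientId is hypothetical; the field names cid, t and dp (document path) are Measurement Protocol parameters.

```javascript
// Made-up hit records, as might be extracted from the logged /__ua.gif requests.
const loggedHits = [
  { cid: '111.222', t: 'pageview', dp: '/' },
  { cid: '333.444', t: 'pageview', dp: '/' },
  { cid: '111.222', t: 'pageview', dp: '/products' },
  { cid: '111.222', t: 'event', dp: '/products' },
];

// Group hits by Client ID, preserving arrival order within each group.
function journeysByClientId(records) {
  const journeys = new Map();
  for (const record of records) {
    if (!journeys.has(record.cid)) journeys.set(record.cid, []);
    journeys.get(record.cid).push(`${record.t}:${record.dp}`);
  }
  return journeys;
}

const journeys = journeysByClientId(loggedHits);
for (const [cid, steps] of journeys) {
  console.log(cid, steps.join(' -> '));
}
```

No login is involved at any point: the grouping key is only the cookie-based Client ID that analytics.js assigns to the browser.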

Although this gathers the same data as Google Analytics, there was a discrepancy between the numbers returned by Splunk and those in the Google Analytics ad hoc dashboards. Further research revealed that Google Analytics performs data sampling to provide satisfactory performance for ad hoc reporting.
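A toy example shows how sampling alone can produce this kind of discrepancy. This is only an illustration with invented data and a deliberately naive "every Nth hit" sample; it is not Google Analytics' actual sampling algorithm.

```javascript
// Ten made-up pageview hits; three of them are for /checkout.
const pages = ['/', '/checkout', '/', '/products', '/',
               '/checkout', '/products', '/', '/', '/checkout'];

// Exact count over every hit.
function exactCount(allPages, page) {
  return allPages.filter((p) => p === page).length;
}

// Estimate from every Nth hit, scaled back up by N.
function sampledEstimate(allPages, page, every) {
  const sample = allPages.filter((_, index) => index % every === 0);
  return exactCount(sample, page) * every;
}

console.log(exactCount(pages, '/checkout'));         // 3
console.log(sampledEstimate(pages, '/checkout', 2)); // 0
```

Here the sample happens to miss every /checkout hit, so the estimate is 0 while the exact count is 3. Real samples are far larger and the error far smaller, but a sampled estimate and an exact count should not be expected to match.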



Splunk does not have to sample (unless you want it to) and gives unsampled statistics on visitor activity. Additionally, with this extra tracking information, Splunk gives a more complete view of user interaction, whether for a single user across multiple devices or for multiple users behind a proxy.

So what are you waiting for? Splunk your Google Analytics data to enrich and correlate data from your users’ interactions with your website!