Configuring Nginx Load Balancer For The HTTP Event Collector

The HTTP Event Collector (HEC) is the perfect way to send data to Splunk, at scale, without a forwarder. If you’re a developer looking to push logs into Splunk over HTTP, or you have an IoT use case, then the HEC is for you. We cover multiple deployment scenarios in our docs. I want to focus on a single piece of a distributed deployment for high availability, throughput and scale: the load balancer.

You can use any load balancer in front of the HEC, but this article focuses on using Nginx to distribute the load. I’m also going to focus on using HTTPS, as I’m assuming you care about the security of your data in flight.

You’re going to need to build or install a version of Nginx that enables HTTPS support for an HTTP server.

./configure --with-http_ssl_module

If you install from source and don’t change the prefix then you’ll have everything installed in /usr/local/nginx. The rest of the article will assume this is the install path for Nginx.

Once you’ve got Nginx installed you’re going to need to configure a few key items. First is the SSL certificate. If you’re using the default certificate that ships with Splunk then you’ll need to copy $SPLUNK_HOME/etc/auth/server.pem and place that on your load balancer. I’d highly encourage you to generate your own SSL certificate and use this in place of the default certificate. See the Splunk docs for configuring Splunk to use your own SSL certificate.

The following configuration assumes you’ve copied server.pem to /usr/local/nginx/conf.

    server {
        # Enable SSL for default HEC port 8088
        listen 8088 ssl;

        # Configure Default Splunk Certificate.
        # Private key is included in server.pem so use it in both settings.
        ssl_certificate     server.pem;
        ssl_certificate_key server.pem;

        location / {
            # HEC supports HTTP Keepalive so let's use it
            # Default is HTTP/1.0, keepalive is only enabled in HTTP/1.1
            proxy_http_version 1.1;

            # Remove the Connection header if the client sends it,
            # it could be "close" to close a keepalive connection
            proxy_set_header Connection "";

            # Proxy requests to HEC
            proxy_pass https://hec/services/collector;
        }
    }
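
One thing to be aware of in the location block above: because proxy_pass is given a URI, Nginx replaces the part of the request path that matched the location (here just /) with /services/collector. A client posting to / on the load balancer reaches /services/collector on HEC, but a client posting to the full /services/collector/event path would not be forwarded to that same path. If your clients use the standard HEC paths, a minimal alternative sketch (an assumption on my part about what such clients need, not part of the config above) is to match the HEC prefix and pass the request path through unchanged:

    location /services/collector {
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # No URI after the upstream name, so the original
        # request path is forwarded to HEC as-is
        proxy_pass https://hec;
    }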

Next we’ll configure the upstream servers. This is the group of servers that run the HTTP Event Collector and automatically load balance data to your indexers. Please note that you must use a heavy forwarder, as the HEC does not run on a Universal Forwarder.

    
    upstream hec {
        # Keep up to 32 idle keepalive connections to the
        # HEC servers open per worker process
        keepalive 32;

        # Splunk instances running the HTTP Event Collector
        server splunk1:8088;
        server splunk2:8088;
    }

Now let’s put it all together in a working nginx.conf:

# Tune this depending on your resources
# See the Nginx docs
worker_processes  auto;

events {
    # Tune this depending on your resources
    # See the Nginx docs
    worker_connections  1024;
}


http {
    upstream hec {
        # Keep up to 32 idle keepalive connections to the
        # HEC servers open per worker process
        keepalive 32;

        # Splunk instances running the HTTP Event Collector
        server splunk1:8088;
        server splunk2:8088;
    }

    server {
        # Enable SSL for default HEC port 8088
        listen 8088 ssl;

        # Configure Default Splunk Certificate.
        # Private key is included in server.pem so use it in both settings.
        ssl_certificate     server.pem;
        ssl_certificate_key server.pem;

        location / {
            # HEC supports HTTP Keepalive so let's use it
            # Default is HTTP/1.0, keepalive is only enabled in HTTP/1.1
            proxy_http_version 1.1;

            # Remove the Connection header if the client sends it,
            # it could be "close" to close a keepalive connection
            proxy_set_header Connection "";

            # Proxy requests to HEC
            proxy_pass https://hec/services/collector;
        }
    }
}

When you start Nginx you will be prompted to enter the PEM passphrase for the SSL certificate. The passphrase for the default Splunk SSL certificate is "password".
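
If you’d rather not type the passphrase on every start, Nginx 1.7.3 and later can read it from a file with the ssl_password_file directive. A minimal sketch, assuming a hypothetical file /usr/local/nginx/conf/server.pass that contains the passphrase on a single line and is readable only by the user that starts Nginx:

    server {
        listen 8088 ssl;

        ssl_certificate     server.pem;
        ssl_certificate_key server.pem;

        # Hypothetical path: file holding the key passphrase, one per line
        ssl_password_file   /usr/local/nginx/conf/server.pass;

        # ... location block as shown above ...
    }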

There are a bunch of settings you may want to tweak including HTTPS Server Optimization, load balancing method, session persistence, weighted load balancing and health checks.

I’ll leave those settings for you to research and implement as I’m not an expert on them all and everyone’s deployment will differ in complexity and underlying resources.
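
As a rough starting point only, here is a sketch of where some of those settings go. The values are placeholders I picked for illustration, not tuned recommendations, and the health checking shown is the passive kind (max_fails and fail_timeout) built into open source Nginx.

    upstream hec {
        # Load balancing method: send each request to the server with
        # the fewest active connections (default is round-robin).
        # For session persistence by client IP, use ip_hash instead.
        least_conn;

        # Must come after the load balancing method
        keepalive 32;

        # weight: splunk1 is favoured roughly 2:1 over splunk2
        # max_fails/fail_timeout: after 3 failed attempts within 30s,
        # skip the server for 30s (passive health check)
        server splunk1:8088 weight=2 max_fails=3 fail_timeout=30s;
        server splunk2:8088 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 8088 ssl;

        # HTTPS server optimization: reuse TLS sessions across workers
        ssl_session_cache   shared:SSL:10m;
        ssl_session_timeout 10m;

        # ... certificate settings and location block as shown above ...
    }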

Hopefully this gives you the foundation for a reliable load balancer for your distributed HTTP Event Collector deployment.
