Splunk and Microsoft Azure – Intro and Resource Roundup
Note: the below article was written back in Dec 2014, but still gets a ton of hits and questions. Be sure to check out the Azure tag here on Splunk Blogs for the latest news.
Customers often ask us how Splunk can integrate with, or run in, Microsoft’s Azure cloud platform. There’s actually a fair bit of information about this broad topic on splunk.com and elsewhere, but it can be a bit hard to find. This post will serve as an introduction to a few Azure terms, and a round-up of available resources. Subsequent posts will cover some of these concepts in more detail; just look for the posts tagged “Azure”! You might also want to check out the Microsoft tag for other resources related to Splunk and the overall Microsoft ecosystem.
First, let’s be clear: this is a HUGE topic. Cloud platforms are very complex these days, and Azure is no exception. If you walk up to a Splunker and ask, “can Splunk run in Azure?”, or “can Splunk integrate with Azure?”, well, the answer is “yes”. If you actually want a helpful answer, be prepared for us to ask for just a bit more information!
Second, let’s set a baseline of understanding with some simple definitions and statements for those new to Azure:
- Azure is PaaS — Platform as a Service. This means that you can develop custom applications on top of it. In the PaaS world, you (typically) don’t work with servers, instead you have services. Very important to understand.
- Azure is also IaaS — Infrastructure as a Service. This means they also have servers, which are going to be virtual machines running on top of a customized version of Hyper-V. Besides compute, they offer other services which one might consider “infrastructure”, but generally when someone’s using that word in the cloud, they mean VMs.
- Azure has a crapton of services. At this moment, looking at the Azure Management Portal, I am counting (including the ones in preview):
- 5 compute services, including Website, Virtual Machine
- 7 data services, including SQL Database, HDInsight
- 11 app services, including Service Bus, and Active Directory
- 2 network services: Virtual Network, and Traffic Manager
- And a Marketplace where there’s a bunch of cool stuff from third-parties
- Some Compute resources are suitable for running Splunk, some are not. The three services that are key to understand are described below. Also check out this article which explains the differences.
- Websites are packaged and fully integrated web applications that are hosted in IIS.
- Virtual Machines are multi-purpose VMs which can run anything you can run on their supported guest operating systems, which includes many versions of Windows and Linux.
- Cloud Services are for running multi-tier web applications in Azure on stateless virtual machines, where the management of the OS and VM is abstracted away. Two components are important to understand for running Splunk in Azure:
- Web role is a dedicated IIS web server instance. It’s not intended to host arbitrary executables, but you could run a Splunk Forwarder in one with some scripting. Note that there are pros and cons to this approach which will be discussed in a future article.
- Worker role is a more general purpose resource, and can run applications which are not hosted in IIS. But because it is stateless, your applications may require some level of integration and automation to work as expected. You can run a forwarder here, and doing so likely makes more sense than in a web role, but it all depends on your requirements.
- All of these services have machine data. Getting that data into Splunk may require techniques specific to each service, but there are only a handful of storage services for data in an Azure cloud, and you need to understand them. I’ll just paste directly from the Intro to Azure Storage docs:
- Blob storage stores file data. A blob can be any type of text or binary data, such as a document, media file, or application installer.
- Table storage stores structured datasets. Table storage is a NoSQL key-attribute data store, which allows for rapid development and fast access to large quantities of data.
- Queue storage provides reliable messaging for workflow processing and for communication between components of cloud services.
- File storage (Preview) offers shared storage for legacy applications using the standard SMB 2.1 protocol.
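To make the Table storage description a bit more concrete from a Splunk point of view: a table entity is essentially a bag of named properties, so one simple way to make it indexable is to flatten it into a single key="value" line. Here is a minimal Python sketch of that idea. The entity is a plain dict standing in for whatever a storage client would return, and apart from PartitionKey, RowKey, and Timestamp (which every table entity carries) the property names and values are made up for illustration.

```python
def entity_to_event(entity):
    """Flatten a table entity (a dict) into one key="value" log line."""
    # Put the timestamp first so it is easy to find as the event time.
    parts = [str(entity.get("Timestamp", ""))]
    for key in sorted(entity):
        if key == "Timestamp":
            continue
        # Escape embedded quotes so the value stays one quoted token.
        value = str(entity[key]).replace('"', '\\"')
        parts.append('{}="{}"'.format(key, value))
    return " ".join(parts).strip()


# Hypothetical entity, shaped like a row a diagnostics table might hold.
sample = {
    "PartitionKey": "0635544000000000000",
    "RowKey": "web_IN_0___0000000001",
    "Timestamp": "2014-12-19T10:15:00Z",
    "Level": 3,
    "Message": 'Request failed: "timeout"',
}
print(entity_to_event(sample))
```

Sorting the keys keeps the output stable from run to run, which makes the events easier to compare. Whether this flat format suits your data is something to test against your own field extractions.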
Ok, enough preamble, here’s what I’ve found for resources related to Splunk + Azure that should get you started down that path:
Apps like these are VERY important. The topic of getting data out of Azure and into Splunk deserves its own blog post, if not several. Why? The answer is simple: Splunk doesn’t natively know how to read data from a blob container, an Azure table, or an Azure queue. But no worries, Splunk is a platform!
- Splunk Addon for Microsoft Azure — This app can read from both table and blob storage, and send the data within to Splunk. This app only runs on Windows, and can be used with Splunk Enterprise, or run on a Forwarder. Written by Splunk, but not supported at this time.
- Azure Diagnostics — This app reads from blob storage, and is written in Python, so ought to be cross-platform. This is a community app, so also does not have official support.
- AMQP Messaging Modular Input — This community-supported app was written by a Splunker. It can read from many kinds of message buses, including the Azure Service Bus.
- Azure by Splunk Monitoring — The description is a bit cryptic, but this appears to be a framework for monitoring Azure services using Splunk, and perhaps sending data as well. C#. Includes VS 2012 project.
- splunk-azure-website-logs — Splunk App for downloading Azure Website diagnostic data into Splunk. Written by a Splunker in Python.
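For a feel of what apps like these have to do under the hood, here is a rough Python sketch of the general pattern, not the actual code of any app listed above. It assumes a Splunk scripted input, where Splunk indexes whatever the script writes to stdout, and it keeps a small checkpoint so that data already sent is not sent again on the next run. The function names and the checkpoint format are hypothetical, and the blob contents are injected as plain (name, text) pairs rather than fetched with a real storage client.

```python
import json
import os
import sys


def load_checkpoint(path):
    """Return {blob_name: characters_already_emitted}, or {} on first run."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_checkpoint(path, checkpoint):
    """Persist the checkpoint so the next run can resume from it."""
    with open(path, "w") as f:
        json.dump(checkpoint, f)


def emit_new_data(blobs, checkpoint, out=sys.stdout):
    """Write only the unseen, complete lines of each blob to `out`.

    `blobs` is an iterable of (name, text) pairs. In a real input these
    would come from a storage client; here they are injected (hypothetical).
    """
    for name, text in blobs:
        offset = checkpoint.get(name, 0)
        new = text[offset:]
        # Hold back a trailing partial line until it has been completed.
        cut = new.rfind("\n") + 1
        if cut > 0:
            out.write(new[:cut])
            checkpoint[name] = offset + cut
    return checkpoint


if __name__ == "__main__":
    # Stand-in data; a real input would list and download blobs here.
    fake_blobs = [("logs/app.log", "line 1\nline 2\npartial")]
    state = emit_new_data(fake_blobs, {})
    # state now records how far into each blob we have read.
```

A real input would call load_checkpoint before and save_checkpoint after each run. Holding back the trailing partial line avoids splitting an event in two when a blob is still being appended to.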
Not many things to mention yet, hopefully this list will grow!
- If you go to the Azure Marketplace and search for Splunk, you will find two items: Azure AD support for Splunk Enterprise, and for Splunk Storm. These integrations enable you to use Azure Active Directory to authenticate your end users into Splunk Enterprise running on-premises, or our free machine data analysis product in the cloud, Splunk Storm.
- Here’s the Azure tag on the Splunk Answers site. Not many Q&A there as of yet, but there are some good ones in there, such as:
- Found on Stack Overflow: How to setup Splunk Universal Forwarder on Windows Azure web role?
Searching on the .conf website, I was able to find five slide decks! You can browse all of the past sessions by going to the 2013 sessions or 2014 sessions pages. Video recordings are available for most .conf2014 sessions.
- .conf2013 — Best Practices: Deploying Splunk on Physical, Virtual, and Cloud Infrastructure
- .conf2013 — Windows Inputs and Microsoft Apps Strategy
- .conf2013 — Customize and Extend with the Splunk Developer Platform
- .conf2013 — Splunk Apps for Monitoring Microsoft Based Infrastructure
- .conf2014 — Using Telemetry to Understand the Customer Experience at Microsoft (download recording)
- Presenter: Simon Warrington, Sr. Program Manager, Microsoft.
- Description: “Are you a gamer? Did you know that Splunk helps ensure an excellent gaming experience for Xbox One? This session will detail how Xbox went from a homegrown solution to a solution based on Splunk, gaining near real-time visibility into individual user experience as well as population and partner trends and system performance.”
- 12/19/14: clarified web & worker roles and teased some UF suitability concerns that will be covered later in detail