thebaumblog: Homepage

How Much More Free Can Free Get?


Well if you ever wanted to integrate Splunk into your own product or service, free is now really, well … free. We’ve always had a free Splunk license for end users. But now we have the same for software, hardware and service provider partners. Now as a Splunk Powered Associate you can distribute Splunk with the free license key as part of your offering. You can also link to the Splunk free license download and earn referral credits if the download leads to a purchase. Pretty cool, eh? The free license is still limited to 500MB of daily uncompressed indexing volume, but hey, that’s a lot of data for free.

A few of our Splunk Powered partners have picked up on the real potential here. F5 Networks, for example, has created a Splunk App that prepackages searches, alerts, reports and dashboards for F5’s ASM and FirePass products. Now F5 customers get real-time search, alerting, reporting and analytics for free with Splunk for use with F5 Networks. Support for F5 LTM and BIG-IP is coming soon.

And the folks over at RightScale are taking Splunk into the clouds. RightScale is a great cloud computing management platform that lets you control your cloud resources across several different providers from one interface. We use RightScale at Splunk to control our demo instances on Amazon EC2/S3. Each demo instance consists of one or more servers running in the cloud that recreate a live IT environment like a J2EE-based e-commerce application, a converged network or a rack of Microsoft Windows servers. It’s important that we are able to scale these instances up and down dynamically, and RightScale comes to the rescue. The integration of Splunk and RightScale gives us the IT control and visibility we need in the cloud.

Every piece of software, hardware and service on the planet generates IT data. And now you can bring Splunk to your community by integrating it into your solution at no cost to you, your channel or your customers. To join the Splunk Powered Associate program just sign up to be a Splunk Powered partner and we’ll take it from there.

Happy Splunking!

Splunk Live San Francisco. It’s about time.

Last night we hosted more than 100 people at our first ever Splunk Live in San Francisco. It was about time. In May 2007 we started our first series of Splunk Live events. We’ve traveled all around the world: Santa Clara, Los Angeles, Phoenix, San Diego, Dallas, Chicago, New York, Washington DC, Atlanta, London, Zurich, Singapore, Taipei, Shanghai, Beijing, Bangkok and Hong Kong. But never have we had an event in our own backyard. Congratulations to Steve Sommer and our Marketing Team for pulling it off.

The event took place in our new offices at 2nd and Brannan Street.

It’s a little-known fact that for the first two years at Splunk we never actually had an office of our own but squatted in the offices of venture capitalists and other start-up companies like Six Apart. Having a conference room called “BIG” where we can actually fit more than 100 people still takes some getting used to.

The best part of every Splunk Live, of course, is the customer presentations. Last night we were honored to have three local customers show everyone how they are using IT Search.

  • Mashery, the leading provider of API management services enabling companies to easily leverage web services as a distribution channel, discussed how they use Splunk to power self-service reporting for their customers on activity within their hosted, cloud-based services.
  • Lawrence Livermore National Labs (LLNL), a US Dept of Energy national lab, talked about their Splunk deployments in multiple groups and data centers addressing a wide range of needs, from application availability to meeting FISMA security regulations. They drive a range of initiatives from high performance computing to nuclear weapons development to running particle accelerators.
  • Visa International, the world’s largest retail electronic payments network and one of the most recognized global financial services brands, shared how they use Splunk for network security monitoring and incident response.

Stay tuned to our events page for more upcoming Splunk Live events next year. We plan to visit several cities each quarter and will likely be in your neighborhood at some point in the near future.





Human and Machine Language Mashups at Splunk Live Zurich, Switzerland

At Splunk Live in Zurich this week an interesting discussion erupted about human and machine languages. Before I continue with the story, I want to thank everyone that attended the event. Despite the fact that Raffy Marty is a resident celebrity, this was our first formal customer and partner event in Switzerland. We had more than 50 people attend for several hours to talk about Splunk and data center management challenges. The event was co-hosted by T-Systems.

Thank you Meno Schnapauff for your great presentation on how T-Systems and the Swiss National Railway are using Splunk!

Other attendees included folks from Swisscom, Unicom Consulting, Rothschild Bank, Genossenschaft Migros, LeShop, Netcetera, Cablecom GmbH, TBK-Patent Munich, On Line Video 46, Skyguide, PostFinance and the University of Fribourg. Brian Haynes, Tim Thorpe, Julie Duncan and Hash Basu-Choudhuri from our London office participated too.

Now part of the reason I mention all these names (in addition to thanking folks) goes to the point of this post. In the room we had an American (me), several native English speakers from different areas of England, Swiss German speakers from Switzerland and German speakers from Germany. What I noticed is how two people think they speak the same language but can’t always understand each other. It turns out there are a lot of American (some West Coast) colloquialisms I use that my “Queen’s English” counterparts don’t understand. And of course most of the time when I try to make a joke the Swiss and Germans just look at me like I’m from outer space, even though if you asked them they’d say they speak fluent English. During the event the Swiss Germans had trouble understanding the Germans and the Germans had trouble understanding the Swiss Germans. The folks from the UK who spoke German didn’t understand either the Swiss German or the German German, although they all claim to speak German.

What does all this have to do with IT you ask? Well it turns out that mashing up languages and attempting to understand each other even though we don’t speak exactly the same language is one of the biggest problems we have in trying to understand our IT systems as well.

“One of the questions posed at the event was how can I modify my system and application logging to some standard in order to follow what my systems are doing? Do we need a logging standard?”

I have long been telling people that logging standards are a waste of time. IBM’s Common Base Events (CBE) has been around for decades and has very little traction in the real world. Data Center Mark-up Language (DCML) was pushed by Opsware and lots of smart people. It got nowhere. Logs exist. Instrumentation exists. Our IT systems already have tremendous amounts of data. Trying to retrofit that data to some standard is impossible. Attempting to organize a multi-vendor logging standard will never happen. Getting developers to log consistently sounds great but I’ve never seen it done before.

What we need is a mashup of machine languages and logging formats. That’s exactly what IT Search is!

Humans need to stop thinking about how we can format data to make it easier for machines to work with it. There is too much data. The real value is being able to work with massive amounts of data without any human intervention. This is exactly what Google does for the web. Sure you can reformat your HTML to get better search results. But even if you do nothing Google will index your site. You don’t even have to tell Google to do it!
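
To make the idea a bit more concrete, here is a minimal sketch in Python of indexing log lines exactly as they arrive, with no common format imposed first. It is an illustration only, not a description of how Splunk works internally. The sample lines and the names `tokenize`, `build_index` and `search` are all made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample lines in three unrelated formats (invented for illustration).
RAW_LINES = [
    "Oct 12 10:01:07 web01 sshd[411]: Failed password for root from 10.0.0.5",
    '10.0.0.5 - - [12/Oct/2008:10:01:09 -0700] "GET /cart HTTP/1.1" 500 312',
    "2008-10-12 10:01:10,221 ERROR OrderService - timeout calling 10.0.0.9",
]

def tokenize(line):
    """Split a raw line into lowercase terms, keeping dotted values such as IPs whole."""
    return set(re.findall(r"[a-z0-9_.]+", line.lower()))

def build_index(lines):
    """Map each term to the set of line numbers that contain it."""
    index = defaultdict(set)
    for number, line in enumerate(lines):
        for term in tokenize(line):
            index[term].add(number)
    return index

def search(index, lines, *terms):
    """Return the raw lines that contain every one of the search terms."""
    hits = set(range(len(lines)))
    for term in terms:
        hits &= index.get(term.lower(), set())
    return [lines[n] for n in sorted(hits)]

INDEX = build_index(RAW_LINES)

# One query finds the same IP address in a syslog line and a web access log line.
for line in search(INDEX, RAW_LINES, "10.0.0.5"):
    print(line)
```

Nobody had to agree on a logging standard for the syslog line and the web access line to turn up in the same search. The raw text is the common language.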

I’m going to start sharing more of our experiences helping people see the connections that already exist in their logging data. While the connections are not always obvious to the naked eye and human linear thinking, machines are great at teasing out non-obvious relationships. This is perhaps the most compelling thing we work on at Splunk, and we continue to push the bleeding edge of what’s possible.

Splunk Voted Fastest Growing Company in Silicon Valley

I’ve just returned from the Deloitte Technology Fast 50 awards dinner where Splunk was selected as the fastest growing company in Silicon Valley. Deloitte, Silicon Valley Bank, Korn Ferry International, Cornish & Carey, Cooley Godward Kronish and adb Insurance Services were the sponsors of this year’s competition and we thank them all for the award.

I was joined at the awards dinner by my two co-founders Erik Swan and Rob Das. What a great ride it has been over the past four and a half years. The time has flown by so quickly and it seems like we still have so much more to do. But it was nice at least for one evening to take a breather and enjoy what we have accomplished.

Since I graduated from college with a degree in computer science I have dreamed of creating a technology and a company that had the potential to achieve what Splunk has. Seems unreal that we are now here living that dream.

The award ceremony was held at the Computer History Museum in Mountain View, CA. What a cool place. When the Boston Computer Museum closed in 1999 the museum in Silicon Valley became the keeper of computer technology history. Wandering through the museum I spotted an exhibit on chess software competition and was reminded, by one of the long job outputs hanging from the ceiling, of my own chess-playing Pascal program that performed a pretty good six-level look-ahead algorithm.

But it was entering the hardware history wing that really sent me down memory lane.

PDP8s, PDP11s, original IBM PC, Osborne, Apple Lisa, Apple IIc, Mac 128k, Compaq luggable, Apple PowerBook 170 and 230 with that cool ejectable enclosure that hooked up all your cables for you. Wow!

I even saw an IBM 5100. Perhaps the most bizarre machine I ever programmed. It has a switch that moves the shared program and memory space from APL to Basic - two worlds that should never co-exist.

When I was at IBM in Boca Raton I wrote an inventory management system on a 5120 the predecessor with a 9 inch screen!

If you’ve never been to the museum you really should go. Take your kids. Show them the progress technology has made during your adult lifetime and let them dream about the next 25 years.

Where else can you sit on the built in sofa of a Cray 1 supercomputer and see a PDP1 still working to play the world’s first video game?

Thanks to all the sponsors for hosting the event and selecting Splunk as the fastest growing company in Silicon Valley!

The Award - Where’s the cash?

Splunk Founders - Erik, Michael, Rob

How Many Can You Remember?

PDP8

PDP11

Cray 1

Splunk Lab in Asia Launches to Develop New IT Search Apps

The last two weeks I’ve been traveling throughout Asia with our new partners at Systex and the Splunk Asia team. In Singapore, Hong Kong, China and Taiwan we met with government agency, high tech manufacturing, insurance, online gaming and managed service provider customers who told us how critical Splunk is to their IT organizations, especially as budgets get even tighter.

Systex is now our master distributor covering Taiwan, China, Hong Kong, Singapore, Thailand and Malaysia. Systex is an amazing company fueled by Taiwanese entrepreneurship, creativity and innovation. The company is part distributor, part reseller, part system integrator and part independent software developer. The 2,900 Systex employees are led by CEO Hilo Chen and COO Frank Lin. Hilo did a stint at Yahoo! Asia before joining Systex as CEO. He is a very friendly, engaging and good-natured executive who commands the passion of his team. Frank is detail-oriented and intense, and he has an ability to focus on what seems to be the impossible and get it done.

I’m not used to people pushing faster than I do, but the Systex team are reminding me what start-up speed is all about.

The Systex system integration and software business is fueled by more than 1,400 engineers with deep domain expertise in financial trading and banking systems, network security, database administration, storage, virtualization, disaster recovery, IT service management, telecommunications OSS/BSS, unified communications, business intelligence and more. This past week we unleashed the creativity of more than 400 of those engineers, product managers, sales personnel and business unit heads. We met at a three day kickoff event for the launch of a joint Splunk Lab designed to come up with new areas to apply IT Search and new Splunk Apps for a variety of use cases.

It is our hope that our joint work together will result in lots of new Apps available for download by Splunk users all over the world.

The event started Thursday with a press conference at the Westin in Taipei. We were joined at the press conference by more than three dozen press covering innovation in Asia. We discussed the design of the partnership, the Splunk Lab and some of the joint customers including Allianz Insurance, IAH Games, and The Malaysian Prime Minister’s Office. Allianz is using Splunk to report on F5 Big IP load balancer activities. IAH is mining their online multi-player game events and logs for insight into user patterns and activities including market basket analysis across different game properties. The Malaysian PM’s office uses Splunk to secure their email messaging system.

The press asked some very good questions about various use cases and our strategy for accelerating activities in Asia with Systex. Richard Tang and Johnny Lin attended the event from Systex as well and provided a great overview of how the Splunk Lab is coming together and what kind of solutions Systex is creating around Splunk. Richard has been very patient with me and has taught me enough Mandarin to completely embarrass myself during my last few visits.

On Friday 260 engineers and product managers attended an all day Splunk Boot Camp at the Systex UCOM training center in downtown Taipei. The day was divided into two three and a half hour sessions. Each session covered using, administering and deploying Splunk. There was a brief section on developing Splunk Apps including building of a network management application.

One of the product managers commented to me at the end of the day, “My mind is broken on Splunk, there is so much you can do with it.”

Saturday’s session was the Splunk Lab kickoff event and creative activity attended by 300 business unit heads, sales people, product managers and field sales engineers. I was amazed. We went from 8:30am to 6:30pm on a Saturday. The level of energy was unlike anything I’d ever experienced before. Taking the long trip back from Taipei by way of Tokyo, I am just in awe at how two organizations half a world apart have so tightly bonded in just six months. I’m very impressed by the Taiwanese work ethic and dedication.

Kord Campbell, director of Splunk’s Developer/ISV program, gave a great talk on developing Splunk Apps to start the working round tables. Each business unit (twelve in all) spent three hours coming up with ideas for Splunk in their unit, including what Splunk Apps they were going to create and which customers they were targeting. The areas included:

  • Financial Trading Platforms
  • Banking and ATM Systems
  • Database Services
  • Information and Security
  • Business Continuity and Disaster Recovery
  • Customer Service
  • Data Management & Integration
  • Unified Communications
  • IT Service Management
  • Education & Training

Teams were judged on several factors including creativity, feasibility, significance to current business and target customer profiles.

The winning team didn’t use slides but instead acted out their presentation in a 15 minute skit. It was wild and reminded me of how dysfunctional most IT organizations are today. Not that we needed reminding :-)

The Financial Services Business Unit was judged the winner. This team has developed market trading platform software in a joint venture with Reuters and explored using Splunk with their quotes and trading solutions and for market compliance. The first scenario involved monitoring TAIFEX, TWSE and OTC trades and examining patterns indicating potentially fraudulent activities.

The second scenario showed how IT Search can be applied to troubleshooting the electronic system including buy side, sell side, cash position, web interfaces, trading systems and risk management. Actors in the scenario ranged from investors, web infrastructure managers, dealer groups, trading managers, CRM users and back office personnel. The team called their solution “A Lighthouse in the Dark.”

Perhaps the most interesting integration of Splunk though was the mining of data from the web application platform to determine which features users tapped into and which ones they tried once but never went back to. By examining page views for new functions and correlating those with trade volume deltas the team can continuously monitor the revenue effects of application and site changes.
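
Here is a rough sketch of what that kind of correlation could look like in Python. The numbers are invented for illustration and the names (`page_views_new_feature`, `trade_volume`, `deltas`, `pearson`) are hypothetical. The team’s actual analysis was not shown in this form.

```python
from math import sqrt

# Hypothetical daily figures, invented for illustration only.
page_views_new_feature = [120, 150, 90, 200, 170]
# One extra day of trade volume so a day-over-day delta exists for each day of views.
trade_volume = [1000, 1030, 1075, 1085, 1165, 1225]

def deltas(series):
    """Day-over-day change for each consecutive pair of values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

volume_deltas = deltas(trade_volume)
r = pearson(page_views_new_feature, volume_deltas)
print(f"correlation between feature page views and trade volume change: {r:.2f}")
```

A coefficient near 1 would suggest that days with more use of the new function are also days with larger jumps in trade volume. Correlation alone does not prove the feature caused the change, so a result like this is a prompt for a closer look rather than a conclusion.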

The Splunk Lab launch has us thinking about how to get other people collaborating to build new applications for IT Search. We’re planning to launch a public site soon that will allow domain experts from all over the world to work together and create great Splunk Apps. So we decided to take the elevator to the top floor of Taipei 101, the world’s tallest building, to look for more…


Top Floor at Taipei 101


View to the East of Taipei

Press Conference


Frank Lin, COO, Systex


Me


Robert Lau - Splunk & Emy - Systex


Hilo Chen, CEO, Systex


UCOM Technical Training Center

Kord Campbell - Splunk


Splunk Lab Team Competition


Winning financial services App


A little bit of fun

Taipei 101 - World’s Tallest Building

Splunking Across the Pond. Welcome Brian Haynes VP EMEA.

It’s kinda a funny story and although it seems so long ago it was just 18 months ago. I was traveling in Europe starting to talk with potential customers who had downloaded and installed Splunk (3.0 variety). My very first meeting was with a guy named Scott Davies, VP of E-commerce Trading Platforms at Royal Bank of Scotland in London’s Bishopsgate. I had the opening slide to our presentation up when Scott walked in the room. He was very polite, asked us if we wanted some still or sparkling water and wanted to know how our trip was progressing thus far. Finished with the pleasantries he then quipped, “I love your product, but when are you going to change your name?”


Seems “Splunk” didn’t quite translate all that well in the UK. Although Colin Barker and Steven Arnold didn’t seem to mind. Fast forward to October 2008 and here we are with more than 60 customers in Europe including several major banks, telecommunication providers and large enterprises. And now we have a big shot head of EMEA and an incredible team on the ground in London. Welcome Brian Haynes!

I first met Brian about three months ago at the Berkeley Hotel in London. We hit it off immediately. Brian was incredibly excited about our free download model as he had experienced similar success with companies like Legato that initially followed a similar model. The difference, he said, was, “Splunk really believes in fostering a global community of users around its product, something Legato never had.” As our new Vice President Sales for EMEA, Brian will no doubt help us really accelerate our growth in the European market. He joins us at a great time. Last week we attended the IP 08 show and our booth was mobbed with folks anxious to learn how they can Splunk their infrastructures.

As the global economy continues to crumble it’s amazing to see that we’re able to keep bringing value to customers around the world and grow our user and customer base by helping IT organizations do a lot more with less. The notion of a single universal platform that breaks down the silos between operations, security and compliance will certainly continue to thrive.

Splunking VMware virtualization at VMworld

This week things were rocking and we were splunking at VMworld. VMware launched their road map for their Virtual Data Center Operating System (VDC-OS). VDC-OS is VMware’s vision to aggregate virtualized servers, storage and network resources into a common platform that manages resources for guest operating systems and applications. And we launched Splunk for VMware. It’s an application built on top of Splunk that gathers data from different levels of the VMware virtual stack including the hypervisor configuration, metrics and events, the host operating system, underlying network and guest OS and applications. The application also gives you predefined searches, alerts and reports to troubleshoot and secure your VMware environment. It’s free and you can download it here.

VMware VDC and Splunk for VMware

VDC-OS represents a big leap forward in managing the complexity virtualization foists upon us. Finally vendors like VMware and Microsoft (which will soon ship its own System Center Virtual Machine Manager) admit managing complex combinations of virtual resources is difficult and important. This is great for monitoring the hypervisor and virtual guest sessions, but what about the resident guest operating systems or applications? It’s still impossible to correlate activity and performance at an application level with resource utilization and performance down to the bare metal.

While these vendors are focused on deploying and tracking the resources themselves, Splunk focuses on providing visibility into the complex interactions and dependencies within a virtual infrastructure. Splunk finds, collects and persists the otherwise perishable log, event and configuration data from dynamic virtual instances as they come and go. Splunk correlates data across tiers in the virtual stack, both inside and outside the hypervisor and guests, including the physical servers, hypervisor, VMs and deployed applications.

When you point your web browser to the Splunk for VMware application you’ll notice several dashboards already created.

  • VM Metrics Dashboard - a view of the last hour’s memory and CPU utilization across all running VMs so you can pinpoint hot spots.
  • VM Status Dashboard - current configuration, available storage and other key status indicators from different tiers including hypervisor; access & WebLogic logs from deployed applications within the guest OS; perfmon, ps and top from the guest OSes.
  • VM Searches Dashboard - all searches, alerts and reports included with Splunk for VMware.

You’ll see on the searches dashboard a number of investigation searches that correlate the VMware API data with OS data from within the guests to perform complex investigations in a single step. This dashboard also shows you the details of predefined alerts like looking for guests with heartbeats, looking for storage capacity problems, and other common issues.
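
As a rough illustration of what correlating hypervisor data with guest data means, here is a small Python sketch. It is not the Splunk for VMware implementation. The records, the field names (`vm`, `cpu_pct`, `message`) and the `correlate` function are assumptions invented for this example.

```python
from datetime import datetime, timedelta

# Hypothetical hypervisor-level samples (invented for illustration).
HYPERVISOR_METRICS = [
    {"time": datetime(2008, 9, 18, 10, 0), "vm": "web-01", "cpu_pct": 35},
    {"time": datetime(2008, 9, 18, 10, 5), "vm": "web-01", "cpu_pct": 97},
    {"time": datetime(2008, 9, 18, 10, 5), "vm": "db-01", "cpu_pct": 40},
]

# Hypothetical events collected from inside the guests (invented for illustration).
GUEST_EVENTS = [
    {"time": datetime(2008, 9, 18, 10, 4), "vm": "web-01",
     "message": "weblogic: stuck thread detected"},
    {"time": datetime(2008, 9, 18, 10, 6), "vm": "db-01",
     "message": "cron: backup started"},
    {"time": datetime(2008, 9, 18, 11, 0), "vm": "web-01",
     "message": "sshd: session opened"},
]

def correlate(metrics, events, cpu_threshold=90, window=timedelta(minutes=2)):
    """Pair each hot-CPU sample with guest events from the same VM close in time."""
    pairs = []
    for sample in metrics:
        if sample["cpu_pct"] < cpu_threshold:
            continue  # only investigate samples at or above the threshold
        for event in events:
            same_vm = event["vm"] == sample["vm"]
            close_in_time = abs(event["time"] - sample["time"]) <= window
            if same_vm and close_in_time:
                pairs.append((sample, event))
    return pairs

for sample, event in correlate(HYPERVISOR_METRICS, GUEST_EVENTS):
    print(sample["vm"], sample["cpu_pct"], "->", event["message"])
```

The point of the sketch is the join itself: the hypervisor knows a VM is hot, the guest knows why, and the answer only appears when both are looked at together.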

As concepts like VMware’s VDC-OS become reality (some time in 2009 according to VMware) having the ability to trace transactions through a virtual infrastructure will become even more important. Every layer of management and abstraction (and yes that’s what virtualization is) means more complexity to manage. Just as with previous VMware products, VDC-OS will not manage physical hardware that has not been virtualized. And understanding how the virtual infrastructure is interacting with non-virtualized servers, storage and networks will remain a critical requirement.

Check out Splunk for VMware and let us know what you think and how we can continue to build on it together.

Splunk in the fast lane. Welcome Godfrey!

Things are moving pretty fast at Splunk and I wanted to comment on the exciting news we announced last week.

In 2004, Erik Swan, Rob Das and I started Splunk with a vision to battle IT complexity by embracing it. We were thinking of things a bit differently. A different way to address the management of IT by applying search to millions of data center artifacts. Traditionally these artifacts were summarized, filtered and reduced and then forgotten - leaving us humans in a pickle when we needed to figure out what’s really going on. For us Splunk was also about a different way to interact with the market, taking an approach of utter transparency. Our public product road maps, freely downloadable software and straightforward marketing had even our early stage venture capital investors thinking we were crazy.

By start-up standards, we seem to have succeeded. Splunk now has more than 250,000 user downloads, more than 750 enterprises, service providers and government agencies worldwide as paying customers and a growing list of partners who embed Splunk into their software, hardware and managed services including companies like Cisco and British Telecom. According to my venture capital friends, very few start-ups make it to where we are today. But, fueled by a love for innovation and so many passionate users we’ve challenged ourselves to see beyond achieving success as a start-up. We believe Splunk can be a company that gets the IT industry thinking differently.

Creating change isn’t easy and we’ll need all the help we can get. Fortunately, we’ve been blessed with an ability to attract top talent at all levels. But our most recent success tops them all. Godfrey Sullivan has joined us as our new President and CEO. When you meet him you’ll realize the incredible passion he has for building great companies. Most recently he was President and CEO of Hyperion Solutions. He took Hyperion over a period of six years to $1B in revenues. Hyperion was acquired by Oracle in 2007 for $3.3B. Godfrey also serves on the board of directors of Citrix Systems, Inc., and Informatica Corporation. Just as important as his business and leadership abilities, Godfrey has the cultural DNA that fits right in at Splunk.

Here’s the yin and yang that is Godfrey. He owns one of only 4,038 2005-2006 Ford GTs. Now this thing is fast, really fast.

  • 0–60 mph (0–96 km/h): 3.3 seconds
  • 0–100 mph (0–160 km/h): 7.3 seconds
  • Standing 1/4 mile: 11.2 seconds @ 134.2 mph
  • Top speed: 212 mph

And his other car is a Toyota Prius. Enough said.

Godfrey couldn’t join us at a better time. We’re scaling all aspects of the business and need the leadership of someone who’s been through this type of explosive growth before. For me personally, it’s pretty cool to work beside someone of his experience, talent and steady-as-she-goes outlook on life.

And I get to continue to do what I do - build things. I’m now leading the team building our partner ecosystem working with Developers, MSPs, Resellers, Technology Partners and System Integrators around the world.

Of course this hyper growth wouldn’t be possible without your passion and support. Thank you all for that.

Happy Splunking!

Life after SIEM. Situational Awareness is next.

We’ve been hearing a lot lately about the death of SIEM technologies. But isn’t the question less about a legacy technology dying and more about the dimensions on which the next mass adopted security capability will be born? Clayton Christensen first described a model for disruptive technology in his book The Innovator’s Dilemma and his follow-on The Innovator’s Solution. Christensen describes a theory about how disruptive technologies overtake sustaining technologies by delivering value on new dimensions that established vendors overlook as unimportant, low end or just don’t think about because they’re too busy improving their legacy. Christensen’s work offers an interesting framework to think about what’s taking place in the market for SIEM security management solutions.

Any enterprise trying to secure their IT infrastructures knows the state of the art in SIEM security approaches falls short. And trends like virtualization are making things even more difficult. System and security administrators and analysts are inundated with too many potential incidents and it’s too difficult and time consuming to investigate even a fraction of them. Achieving a greater comprehension of the meaning of potential incidents and the projection of their status in the near future is the real goal. The idea, called “situational awareness,” is often, however, impossible to achieve. We are so dependent on pre-programmed rules in our SIEM solutions that we lack the ability to perform our own analysis because the original raw data has been filtered out, thrown away or we have no practical way to make sense of it.

Observation: If the technology is sufficiently complex as to allow the vulnerability to exist, can we really build complex technology to catch all the possible issues or scenarios?

As a reference point see David Hazekamp, Security Architect at Motorola, talk about the importance of retaining all security data across the Motorola global SOC infrastructure and integrating access to all this data into existing SIEM solutions.

Of course reaching this understanding requires that one suspend disbelief about the effectiveness of current SIEM security technologies. Usually this means you’re not a vendor or you’re a vendor with little or no vested interest in current approaches. So with this let’s examine the typical enterprise deployment of security technologies.

Defense in Depth

This is where every good enterprise security architecture starts. In order to begin securing your environment you’ve got to have data, raw data. In most data centers this takes the form of syslog from network devices and servers, SNMP traps, OPSEC or LEA interfaces for firewall events, WMI for Windows desktop and server events, IDS and IPS signature scans and application level firewall examination of common services like FTP, HTTP, SFTP, SCP etc. The thinking is you need to look at everything. Perhaps you’ll even want to pull in information from physical security systems like badge readers.

Security Information Management (SIM)

The next step in the process is to manage all this raw data and filter it down to a manageable number of events, traps and alerts. Collecting, storing and providing some basic analysis on all this data is the job of a SIM. Typically, as Raffy points out, the data is parsed, normalized and stored in a structured RDBMS. Parsing, normalizing and structuring all this data is great if the data doesn’t change or you don’t have too much of it. But if you’re dealing with data formats that aren’t static or you’re trying to store terabytes of this data an RDBMS won’t be your friend.
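
A small Python sketch shows why the parse-and-normalize step is fragile. It is illustrative only: the regular expression, the field names and the `normalize` function are assumptions, not taken from any particular SIM product.

```python
import re

# A rigid parser of the kind described above: one fixed pattern, one fixed schema.
# The pattern and field names are assumptions made for this example.
SYSLOG_PATTERN = re.compile(
    r"^(?P<month>\w{3}) +(?P<day>\d+) (?P<time>[\d:]{8}) "
    r"(?P<host>\S+) (?P<process>[\w\-/]+)(\[\d+\])?: (?P<message>.*)$"
)

def normalize(line):
    """Return a row for the structured store, or None if the line does not fit."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None  # the event is lost to the structured store
    return match.groupdict()

# The same event twice; the second device was upgraded and now emits ISO timestamps.
OLD_FORMAT = "Oct 12 10:01:07 web01 sshd[411]: Failed password for root from 10.0.0.5"
NEW_FORMAT = "2008-10-12T10:01:07Z web01 sshd[411]: Failed password for root from 10.0.0.5"

print(normalize(OLD_FORMAT))
print(normalize(NEW_FORMAT))
```

The first line lands neatly in columns. The second carries exactly the same security information but falls through the parser, and nothing downstream ever sees it until someone notices and rewrites the pattern.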

Security Event Management (SEM)

Once a SIM has done its job you’re ready to aggregate, correlate and start reporting on potential incidents using a SEM to do the job. SEMs usually consist of lots of rules that look for combinations and patterns of events indicating that a possible attack or breach may be underway. Essentially the SEM rules attempt to codify what we humans know about vulnerabilities in our IT systems and possible ways to exploit them. The goal is to provide some real-time information, usually in the form of reports, dashboards and visualizations, to operations and security analysts who work to keep the infrastructure secure.
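
For readers who have not seen one, here is a minimal Python sketch of a single correlation rule of this kind: flag a source that logs in successfully right after several failures. The event tuples, thresholds and the `brute_force_rule` function are hypothetical and written for illustration. Real SEM products express rules in their own languages.

```python
from collections import defaultdict, deque

# Hypothetical, already-normalized events: (seconds since start, source IP, outcome).
EVENTS = [
    (0, "10.0.0.5", "fail"),
    (5, "10.0.0.5", "fail"),
    (9, "10.0.0.5", "fail"),
    (12, "10.0.0.5", "success"),
    (20, "10.0.0.7", "fail"),
    (25, "10.0.0.7", "success"),
]

def brute_force_rule(events, min_failures=3, window_seconds=60):
    """Flag a source when a success follows at least min_failures recent failures."""
    recent_failures = defaultdict(deque)
    alerts = []
    for timestamp, source, outcome in sorted(events):
        failures = recent_failures[source]
        # Forget failures that have fallen outside the time window.
        while failures and timestamp - failures[0] > window_seconds:
            failures.popleft()
        if outcome == "fail":
            failures.append(timestamp)
        elif outcome == "success":
            if len(failures) >= min_failures:
                alerts.append((timestamp, source))
            failures.clear()
    return alerts

print(brute_force_rule(EVENTS))
```

The rule catches the pattern its author had in mind and nothing else. The same attack spread out so the failures are more than a minute apart slips straight past it.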

Situational Awareness (SA)

SIEM correlation can be interesting for discovering a pattern or related event but the ability to work an issue outside of these “canned” rules and events becomes the real problem. Unfortunately, what all too often happens is that there are so many possible attacks that operations and security staff are overwhelmed with potential incidents to investigate, and not every event or pattern of interest is going to be discovered via the pre-built rules. Situational awareness is the attempt to perceive environmental elements within a volume of space and time. Comprehension cannot be achieved if the data being bubbled up is filtered according to a set of rules and the technology does not allow a human to perform their own analysis of the raw data as generated by the environment itself. All technologies have their weaknesses and those that perform correlation are no different.

Thus whilst canned SIEM correlation provides value in bubbling things up, we still need the ability to dig into the raw data to fully perceive and comprehend what is taking place. Mind you, SA is not a new concept. It has been applied rather robustly by decision-makers in complex, dynamic areas such as aviation, air traffic control, power plant operations and military command and control, as well as in more ordinary but nevertheless complex tasks such as driving an automobile or motorcycle. And yes it has been mentioned before in security operations, particularly in government agencies.

Man Versus Machine: Part One

Recently I gave a talk at the BT annual technology gathering. The setting was a really beautiful estate called The Grove just north of London in Hertfordshire, England. A couple hundred of BT’s smartest technology managers were in attendance and I was supposed to think of something to hold their interest for an hour. I got to thinking about all the technology and infrastructure BT must have and how in the world they manage it. I started gathering data. With internal growth, new projects like BT’s 21st Century Network and acquisitions over the past decade through BT Global Services outsourcing contracts, the company has a lot of IT infrastructure.

  • 74 data centers,
  • 163 countries,
  • 3,000 applications,
  • 6,000 different types of systems/devices and
  • 17,000 IT staff (6,000 BT and 11,000 outsourced).

I also spent a few hours with some of BT’s brightest architects who are working on attempts to virtualize every layer of their infrastructure: network, storage, database, application, web servers, VoIP, collaboration, ordering, billing, provisioning, monitoring etc. What’s their biggest problem, I asked. Resoundingly it was “our customers are still often the ones that tell us stuff is broken.” This was so reminiscent of my time at places like Yahoo! where we’d have these 7×24 war rooms during key outages and the daily conference calls with 30-40 people on the line all emailing logs and configurations to each other.

As our IT infrastructures become incredibly complex, dynamic, service oriented, virtualized and mission critical we’re confronted with this battle raging in our data centers. And it appears the machines are winning and the humans are losing.

Our biggest problem is figuring out: did something go wrong? Why? Where does truth lie? According to market researcher IDC, more than $140B was spent in 2007 managing the world’s data centers. IT OPEX is growing at 2.5 times the rate of hardware spend and 1/3-1/2 of TCO is spent recovering from problems. The cost of availability now dwarfs the purchase and maintenance cost of technology.

So what have we as an IT industry done to address the problem?

We’ve created concepts like ITIL and CMDBs. While there are some good process improvements here for sure, these top down modeling approaches and pre-determined rules only tell us what we already know. In my experience it is not the things we already know about that bite us in the ass and take our systems down for prolonged periods of time. It’s the multitude of unanticipated and unavoidable dependencies and interactions that take place in a complex system. And it’s impossible to know what set of dependencies and interactions will cause downtime until it occurs. Our infrastructures are just too indeterminate. That’s the point after all. Tier it, load balance it, virtualize it. So we don’t have to worry about the dependencies and interactions among all the different components. Well guess what? We do have to care. Because we have to fix it when it goes wrong.

Take the analogy of a complex air traffic control system. Sure the air traffic controllers feel really great when they arrive at work in the morning. They’ve got their coffee, flight plans and a good handle on the early morning inbound and outbound traffic.

flightplan

Then the day gets a bit more challenging. Weather conditions over Chicago back up landings at O’Hare. A baggage handler and mechanic strike slows down JFK departures. A pilot radios he’s three degrees north over Pennsylvania but where is he really? Now you need radar. Throw the flight plans out the window. You need to know what’s actually happening now.

radar

So how do we establish the equivalent of radar for a complex IT infrastructure? Component monitoring doesn’t work any more. If the problem is a single component failure, we already know about it. We’ve already automated the swapping in of a new machine or device. And we can reboot software components automatically. IBM has its own marketing play on this called “Autonomic Computing” but that too seems to only focus on the simple single component issues, not the indeterminate chaos that ensues in a real running system. And it seems like more slideware than real solutions.

In my next post I’ll tackle the issue of how we might look at things differently.

Stay tuned.

Splunk Live Southwest 2008

This week we’ve been moseying through the Southwestern part of the US with our Splunk Live show. We changed up the format a bit with Splunk technical workshops in the morning and customer round tables in the afternoon. The technical workshops were a big hit with more than 200 people registered to engage with our Splunk Experts. During the workshop you were able to download, install, configure and start using Splunk on your laptop or server with remote access. The best part about Splunk Live events though is sharing ideas with other Splunk fanatics.

Ryan Peterson from Infusionsoft, a marketing automation company, gave a great talk in Scottsdale about his Splunk deployment for the company’s email infrastructure. Ryan is tasked with keeping more than 12M emails a week flowing out of the system to support Infusionsoft’s Automated Follow-up Technology (AFT). Ryan has multiple servers in different geographies in addition to PCI Compliance requirements. He demonstrated using Splunk to troubleshoot problems spread across the messaging infrastructure, address reporting inaccuracies and deliver PCI reports to auditors. He’s even indexing the content of email with Splunk using a scripted LDAP data input. Cool stuff.

In San Diego Tony Doan of the Genomics Institute of the Novartis Research Foundation (GNF) and Eric Van Johnson from Sony Consumer Electronics joined us. Tony is a security engineer and former pen tester. He also confesses to being a recovering Unix sysadmin. GNF has 600 Windows desktops and several hundred Windows and Linux servers supporting the discovery of new biological processes and improved human therapeutics. Tony discussed how they splunk Cisco CSC, Bluecoat, Symantec AV, Arpwatch, Cisco switches and Wifi access points to find what he calls “previously unknowns” to improve operational availability and security. He says they’re finding new uses every day but Tony’s favorite is splunking Cisco IPS and Cisco MARS events looking for odd behaviors. Next up for GNF is eating Windows Event Logs and Windows Registry inputs together with summary indexing for consolidated reporting.

Eric Van Johnson is the eServices Hosting and Operations Manager at Sony Consumer Electronics. He led a great discussion on splunking IBM Websphere and MQ Series events including how Sony has integrated operations and development environments to identify problems with complex apps more quickly and avoid unnecessary escalations to the development team. He shared with us Sony’s roll out of Splunk to their Business Intelligence Group. The idea is to complement aggregated WebMethods data reporting for business activity monitoring. Next up he wants to feed Splunk data back and forth with Verizon’s hosting operations since some of the Sony servers are hosted at Verizon and Verizon is also using Splunk.

In LA Rich Horace, Director of Systems Engineering and Operations at Fox Interactive Media, demonstrated how Fox uses Splunk in the Fox Audience Network. Basically these are the guys that serve web advertisements across all the Fox properties including MySpace, Rotten Tomatoes, Fox Sports and IGN. He’s challenged with launching new monetization platforms and keeping the existing ones running. Rich gave a fantastic overview of his Splunk installation which consolidates/aggregates data from disparate systems in order to protect against hackers and meet PCI and SOX requirements. He currently runs an environment with ~600 Linux servers, load balancers, servers, NetApps and network switches. So far he’s indexed 1.5B events. We engaged with everyone in a lively discussion about securing production sites from developers and controlling and auditing access to data using Splunk’s access controls and search filters. Rich also discussed how Fox is using Splunk to integrate with various Citrix products including Netscaler and XenApp.

Thanks to everyone who shared their stories with us this week. It was really awesome.

Splunk Developer Camp 2008

It’s Sunday night before the start of our first ever Splunk Developer Camp. Never before have we invited developers from our community at large to participate in sharing their ideas about building Splunk Apps and learning about all the cool stuff in our upcoming releases. I think I can speak for everyone at Splunk when I say we are truly amazed with the level of interest and participation. We’ve had to move the venue three times now to accommodate the growing list of participants and while we initially expected the mix would be mostly existing customers, we’re really pleased with the mix of developers coming tomorrow.

  • 125 Developers
  • 91 Organizations
  • 26 Industries
  • 9 Countries

Only a third of the developers showing up are customers. The rest are system integrators, MSPs, OEMs, ISVs and VARs.


Post Camp Update

We organized the day into a combination of an un-conference format with developer round tables, sneak peeks of future versions of Splunk, demos, demos, demos from customers and partners and training on the Splunk API and SDKs. Our goal for the day was to both educate campers on how to effectively build Splunk apps and to get everyone jacked up about the possibilities. We broadcast the sessions live on Splunk TV.

The day started with a quick intro by me. I gave everyone a brief Splunk history lesson of the past five years and demos of the Splunk for PCI and Splunk for Server Virtualization applications. I wrapped with a discussion of our strategy to seed Splunk everywhere and to enable developers to distribute their applications to Splunk installations around the world in the near future. More on this in a future post.

Erik Swan and Rob Das, my two co-founders, followed with a more in-depth evolution of Splunk chat which mainly focused on all the weird prototypes and company names we thought of before the real Splunk. Some of it is funny and some downright scary. Amazing what guys out of a job can come up with.

Konfabulator Follow Along

Next up Kord Campbell, Director of our Developer Program, gave an overview of the agenda for the day and reviewed how to register with the Konfabulator and follow along with the many demos up on our SplunkLabs EC2 server at Amazon Web Services. This worked great as everyone could build and run the demos on their own EC2 instance. Kord also showed off the new Splunk Wiki for developers and application users. We’re in the process of moving all our documentation to the wiki as a one stop shop for information on using, administering, deploying and developing for Splunk. A few other Kord matters included the review of our new Developer Program additions including a 2GB Developer Enterprise License for registered developers.

Splunk Apps

Jef Bekes, our Head Designer, and Raffy Marty, our Application Product Manager, then gave a very inspiring talk about the future of Splunk and Splunk Apps. The basic point is that in Splunk 3.3 today there is no sense of application context. This means the same default user interface for all applications and that all knowledge (saved searches, alerts, reports etc.) is shared across all installed apps. It’s also impossible to “switch” from one app to another. Splunk 4.0 attempts to address this whole problem by making applications first class objects that can be containers for collections of other objects at the interface, knowledge and configuration layers. As more and more Splunk applications arrive on the scene this encapsulation becomes increasingly important. Jef and Raffy showed a sample Splunk 4.0 Help Desk application that included custom branding, restricted task-based navigation and structured search user interfaces and results views. Other Splunk 4.0 features were reviewed too: Splunk Web gadgets, the Application Builder, improved charting and content grouping.

Developer Platform and API

The Splunk Developer Platform futures session was up next with Tom Donahoe, Splunk Product Manager, and Johnvey Hwang, Lead UI Developer. Topics included Splunk 4.0 improvements like the Application Builder, REST API additions, UI extensibility and SDK support. The Application Builder eases application creation and packaging, dramatically improving the experience beyond where Splunk 3.3 currently stands. The Application Builder will be available in both command-line and GUI forms to provide application configuration isolation and leverage file system security controls. Johnvey reviewed with us planned REST API additions for 4.0 like:

  • Alerting: history, status, improved generation
  • Notifications: email, RSS
  • Search scheduling management
  • Knowledge management
  • Authentication: users, roles, single sign-on
  • Distributed: topology data, server metrics

Splunk Ninja

The Splunk Ninja (aka Michael Wilde) graced us with a visit and showed off his demo Godness with a Zero-to-Lightspeed set-up and data eating with the new Splunk Crawl feature in 3.3. Sweet!

Search Language

David Carasso, a Senior Developer and Alex Raitz one of our Solution Architects did a fantastic overview of the Splunk search language and ran through some really cool examples of powerful stuff like

  • What’s the most important hard disk error on each of my hosts?
  • Who sent me the most email?
  • How long do users stay on my website?

David showed us how to create our own search commands too. Awesome stuff.

Large Scale Reporting and Summary Indexing

Steven Sorkin, Head Indexing Geek led a wonderful talk on large scale reporting using great examples like finding violations in security data on application layer firewalls and routers. He covered how we use map/reduce models to summarize batches of events - what we call summary indexing. It turns Splunk into a sort-a time slinky.
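
For anyone who wasn’t in the room, the general map/reduce idea can be sketched in a few lines of Python. This is only an illustration of the technique, not how Splunk implements summary indexing. The event layout (epoch seconds, host, action), the hourly bucket size and the function names are all assumptions.

```python
from collections import Counter


def summarize_batch(events, bucket_seconds=3600):
    """Map step: collapse one batch of raw events into per-bucket counts."""
    summary = Counter()
    for epoch, host, action in events:
        bucket = epoch // bucket_seconds * bucket_seconds  # start of the time bucket
        summary[(bucket, host, action)] += 1
    return summary


def merge_summaries(summaries):
    """Reduce step: add up the small summaries instead of rescanning raw events."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total


batch1 = [(0, "fw1", "deny"), (10, "fw1", "deny"), (3700, "fw1", "deny")]
batch2 = [(20, "fw1", "deny"), (30, "fw2", "allow")]
summary = merge_summaries([summarize_batch(batch1), summarize_batch(batch2)])

# A long-range report now reads only the summary rows.
denies_fw1 = sum(n for (_, host, action), n in summary.items()
                 if host == "fw1" and action == "deny")
print(denies_fw1)
```

The point of the exercise is that each batch gets summarized once, when it arrives, so a report over months of data adds up a handful of small rows rather than searching every raw event again.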

REST/ATOM API and Splunk Gadgets

Ode to Log Management

I love “log management.” I hate log management.

I love log management because years ago it was the impetus for IT to move beyond simple SNMP monitoring to collecting and trying to understand a much richer set of data about complex environments.

I hate log management because over the years it has been co-opted by vendors and analysts who’ve pigeonholed it into yet another IT management silo. These vendors and analysts have narrowly defined log management as the collection and storage of logs in some locked repository used to generate static reports to satisfy regulators, auditors and IT governance boards.

Why am I so bitter?

First it turns out logs are critical to many other stakeholders in the enterprise. Operations needs real time access to logs in order to find and fix problems and improve mean time to recovery (MTTR). Security needs logs to catch bad guys. Business people need logs to understand customer and service behavior and provide service level measurements. So locking up logs in a static repository designed for one constituency severely limits their value and diminishes the return on investment not only in a log management solution but also the return on your IT assets overall.

Secondly, logs alone don’t provide any one of the IT stakeholders with a complete picture.

Let’s take a simple example right from the hottest compliance use case today — PCI. The Payment Card Industry (PCI) Security Standards Council founded by American Express, Discover Financial Services, JCB International, Mastercard and Visa has outlined requirements for security management, policies, procedures, network architecture and software design. If you are a merchant accepting credit or debit cards and you process more than 20,000 transactions per year there are twelve specific requirements. Failure to comply with the requirements is not an option. You can be fined heavily and you can lose your ability to accept credit and debit cards.

One of the twelve requirements is the commitment to monitoring and investigating changes to configuration and password files for any application, server or device involved in the processing of card holder information and transactions. In the case of file content, permissions or attribute changes, logs will only tell me part of the story. Yes a Windows, Linux or Unix log will tell me a file has been changed but it won’t tell me who changed it. It also won’t tell me if the change was authorized or not. To understand who changed a file I need to look at the other user processes running on that server at the same time the file was changed. What user processes were running and who owned them? In Unix or Linux this information is easily viewed with a simple “ps” or “top” command but doesn’t exist in any log. In order to understand if the change was authorized or not I need to compare the log and file change information with the user information and any tickets from the service desk authorizing this user to make this type of modification.
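
Here is a rough Python sketch of that three-way comparison. Everything in it is hypothetical: the record types, the field names, the 60-second slack and the sample data are made up to show the join, and a process owner who was active around the time of the change is only a candidate, not proof of who made it.

```python
from dataclasses import dataclass


@dataclass
class FileChange:      # from the file system or audit log
    time: int
    host: str
    path: str


@dataclass
class ProcessSample:   # from periodic "ps" output
    time: int
    host: str
    user: str
    command: str


@dataclass
class Ticket:          # from the service desk
    user: str
    host: str
    path: str
    start: int
    end: int


def classify_change(change, samples, tickets, slack=60):
    """List users with processes on the host around the change time and
    which of them held a ticket covering this file at that moment."""
    candidates = sorted({s.user for s in samples
                         if s.host == change.host
                         and abs(s.time - change.time) <= slack})
    authorized = [u for u in candidates
                  if any(t.user == u and t.host == change.host
                         and t.path == change.path
                         and t.start <= change.time <= t.end
                         for t in tickets)]
    return {"candidates": candidates, "authorized": authorized}


change = FileChange(1000, "web01", "/etc/passwd")
samples = [
    ProcessSample(990, "web01", "alice", "vi /etc/passwd"),
    ProcessSample(995, "web01", "bob", "top"),
    ProcessSample(500, "web01", "carol", "bash"),
    ProcessSample(1000, "db01", "dave", "vi"),
]
tickets = [Ticket("alice", "web01", "/etc/passwd", 900, 1100)]
result = classify_change(change, samples, tickets)
print(result)
```

None of the three data sources answers the auditor’s question alone. The answer only shows up once the log event, the process listing and the ticket are searchable side by side.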

The real reason I believe we need to move on from talking about log management is log management isn’t a market. It isn’t a solution. It is a feature in a much broader landscape of harnessing all the data being generated by our IT infrastructures.

Turning all that data into information for every stakeholder is important to the future of IT as environments grow more complex, dynamic, service oriented, virtualized and mission critical. Not just to report on compliance controls, but to improve our speed of root cause analysis, increase our ability to quickly and comprehensively investigate security attacks and develop more intimate relationships with our customers by better understanding their behavior and providing a transparent view of the services they are receiving in return.

The Consumerism of IT

Recently Matt Asay wrote a thoughtful piece about how some technology companies are consumerizing the computing experience. In the case of Apple, Business Week writer Peter Burrows has also recently written about The Mac in the Gray Flannel Suit, exploring how CIOs are testing the appetite for Macs in the enterprise. Michele Goins, CIO at Juniper Networks, recently ran a test among the company’s 6,000 employees, discovering that 25% wanted a Mac.

Consumerism of the enterprise computing experience is well underway with Apple, Google, SalesForce and even Cisco’s TelePresence and WebEx offerings. According to Matt, all of these products delight users with a positive user experience by focusing on adoption first and dollars second. “Simple, fast and useful,” is the key.

Could it be that the consumerization of IT is far behind? How many enterprise management vendors focus on adoption first and dollars second? Can you honestly say that any of your vendors put you and your users first? Do the words “simple, fast and useful” come to mind as you’re writing the check for your maintenance renewal every year?

customersatisfaction.png

We recently compiled the feedback from our Q1 customer survey. Each quarter we survey our customer base like most companies do. What’s perhaps different in our case is we focus intensely in our surveys on the user experience with our product. We ask about ease of use, administration, upgrade processes and documentation quality. What we continue to find is users and customers actually like using Splunk versus being compelled to use it by their organization.

Maybe we’re participating in the consumerization of IT. Perhaps we just like using the stuff we build. Regardless, we are constantly working to improve the Splunk user and administration experience. To us this is the #1 measurement of our own and our customers’ satisfaction. You may already know we post our product roadmap on our website including where we’re focused for the next several months. If you have your own ideas, send your feature and improvement suggestions directly to Splunk support.

New Splunk Apps Launch at Interop and MMS

This week we were rolling in Las Vegas with Interop at one end of the strip and the Microsoft Management Summit at the other end.

At Interop we launched the Splunk for Change Management app. And at MMS the Splunk for Windows Management app made its debut.

Both apps make use of the Splunk Platform which provides a common set of services and APIs making it easy to create and integrate applications that leverage vast amounts of IT data. These are the second and third applications in a series of new releases we’ll be doing this year.
Splunk for PCI was the first app launched last quarter.

Splunk for Change Management App

Splunk for Change Management takes advantage of the fact that we index not just logs but configurations and file system changes as well. It also leverages a little known (but I think soon to be much more popular) Splunk search command called diff. Diff lets you easily compare two search results and returns a single result that is the difference between the two. You can compare values of specific fields of results as well as every line of multi line events and files. This makes it really easy to compare configurations across lots of locations. Splunk for Change Management leverages these capabilities and brings integrated change audit, change detection and change validation.
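
If you haven’t played with line-by-line comparison before, here is the general idea sketched with Python’s standard difflib module. This is not the Splunk diff command and says nothing about its syntax. The sshd_config contents and the config_diff helper are made up for the example.

```python
import difflib


def config_diff(before, after, name="config"):
    """Return only the added and removed lines between two config snapshots."""
    lines = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile=name + "@before", tofile=name + "@after", lineterm=""))
    # Skip the two file header lines, then keep lines marked "+" or "-".
    return [line for line in lines[2:] if line.startswith(("+", "-"))]


before = "Port 22\nPermitRootLogin no\nMaxAuthTries 3"
after = "Port 22\nPermitRootLogin yes\nMaxAuthTries 3"

changes = config_diff(before, after, name="sshd_config")
for line in changes:
    print(line)
```

Two snapshots in, one result out that holds only what changed. Run the same comparison against the same file on a few hundred hosts and the odd one out becomes obvious quickly.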

Now you can detect unauthorized changes by indexing your trouble tickets and ticketing system logs together with your service, device and application events and configurations. We use Jira internally and find indexing our Jira tickets enables us to immediately know if a change was authorized or not. No more jumping between redundant and siloed consoles searching for the answer or writing all kinds of complicated data transformation scripts to compare the output of different management systems.

And for the first time we introduce to the industry the concept of Change Validation. Today many of us have the ability to blast out patches to hundreds of servers and devices automatically. But how do we know that the changes had the desired effect? By observing the state and events generated by the actual patched systems we can now compare the actual behavior before and after. Splunk brings change audit events and configuration data together with activity and error logs so you can connect change with actual system and user behavior.

The app includes:

  • Out-of-the-box dashboards with over 40 reports showing changes across all datacenter components including applications, servers and network devices.
  • Predefined alerts that detect unauthorized change on the basis of configuration variances and correlation with service desk systems.
  • Predefined searches to help identify service-impacting changes quickly.
  • Integration with service desk systems to close the loop on change management by validating the effect of change on system behavior.

Splunk for Windows Management App

This new app integrates Microsoft’s System Center Operations Manager’s command-and-control view of a Windows infrastructure with Splunk’s IT Search. The latest version of Splunk now indexes all IT data generated by Windows servers and applications — event logs, registry keys, performance metrics and application log files. Everything is searchable from a single place to resolve service-impacting incidents faster, enhance monitoring coverage, and validate service levels.

What’s really cool is Splunk searches can be launched through Tasks in the System Center Operations Manager Console on any aspect of the infrastructure being monitored, and can be expanded to include far-flung elements of the IT infrastructure for additional context, regardless of platform or technology. It’s super fast to identify information across the Windows Event Log, the Windows

Splunk and US Federal Government Agencies

This week we’re at FOSE 2008 demonstrating how we’re collaborating with US Federal Agencies. A number of agencies have already joined the Splunk community including:
  • Executive Office of the President
  • Federal Bureau of Investigation
  • NASA
  • Social Security Administration
  • US Department of Agriculture
  • US Department of Defense
  • US Department of Energy
  • US Department of Homeland Security
  • US Department of Interior
  • US Department of Justice
  • US Department of Labor
  • US Navy
  • US Department of State
  • US Department of Transportation

Many of these customers are applying Splunk to extreme applications with large data volumes from many different disparate sources. As you can imagine the complexity of security and compliance concerns, agency interactions and a sophisticated web of outsourcing to federal system integrators provides fertile ground for IT Search as a new way of solving all kinds of problems.

Typically our collaboration involves operations, security and compliance people from both the agency and system integrator sides. Agencies continue with their pursuit to cut costs and outsource while being driven by a host of new projects every year. And system integrators continue to search for new ways to bid more competitively by demonstrating new ways to more efficiently develop, deploy and manage technology. This means the business of managing our nation’s IT infrastructure is significantly more complex and dynamic than ever.

As an example, the current state of the world demands a serious risk management approach to Federal Government systems. All agencies have implemented some type of security in-depth strategy with firewalls, vulnerability and IDS scans. While these technologies are effective in their particular function they generate a tremendous amount of data, making it impossible to get a holistic view. These extreme customer environments generate more data and are more dynamic than traditional system and security management approaches can handle. Traditional database and SIEM approaches just don’t scale.

Our own Bill Hornish, who attempted for decades to implement these traditional approaches at several large agencies, has put together a really nice video explaining the challenges of risk management in Federal environments and how Splunk can help.

We’re learning a lot by working with these extreme customers and believe they can teach us a lot about what the rest of the Splunk community will eventually experience when applying IT Search to larger, more dynamic environments in the commercial sector as well.

The Splunk Platform Has Launched

Without a doubt the past week has been the most amazing week in Splunk history. The crazy coast to coast multi-city launch left us all exhausted and electrified. A few of the things that stick in my mind…

First, Splunk 3.2 including Splunk for Windows went live on our download page last Saturday and more than 40% of our downloads in the past week have been for our new Windows version. Then Nick Selby of 451 Group wrote an analyst brief on us. He said, “Splunk is awesome: it’s multiplatform, easy to install and easy to use. And with an abstraction layer of logs, configuration files and system messages, traps and alerts, it’s seriously useful.” 451 has a reputation for ripping vendors, so we’re flattered.

Dana Gardner, analyst with Interarbor wrote a very eloquent analysis of our platform launch on ZD Net. “Splunk has created the means to offer developers easy access to that data and the powerful inferences gleaned from comprehensive IT search. That means the data can go places no log file has gone before,” says Dana. Developers are certainly doing some way cool things with Splunk.

I’ve seen a couple of neat visualization applications including this one called Replay. It shows you a live or time lapsed view of your event streams. Here you can see the replay application hooked up to our internal wiki showing who’s doing what over a 24 hour period. Click on the image for the movie.

replay.png

As for our own applications, the Splunk for PCI app drew tremendous interest at our series of Splunk Live events this past week. It’s just one example of how a business person with domain knowledge can package their own Splunk configuration as an application. If you haven’t seen Raffy’s video on the PCI Application, check it out here.

pci.png

We also showed the Splunk for Change Management application. Seeing someone touch a file and watching the Splunk dashboard update instantaneously is an awesome display of how flexible Splunk has become. Check out the developer program for yourself and get your goods up on SplunkBase so we can all check em out.

changemgmt.png

What Do We See “Standing on Our Own Platform”?

Recently, Johnvey Hwang wrote a post called Standing on Our Own Platform. He was the first one at Splunk to break the ice and use the “P” word. Now it’s out there. What do we see when we stand on our own platform? While only you and the future will tell us — there are a few things we hope to see on the horizon.

First, it’s our belief there’s a lot of money out there wasted on point products for managing networks, servers, applications … even security. A lot of these systems redundantly collect, transmit and store much of the same machine generated data. Think of the network, storage and administration resources duplicated on all this stuff. By providing a platform where the same IT data can be managed once, resources can be freed for other projects.

Second, none of these products work together. If you’re running a network manager to collect and look at SNMP and netflow data you know it doesn’t integrate with your log management system and of course neither talks to your SIEM, SOA, virtualization or application framework monitoring consoles. Building a dense index of data from all of these tools enables correlation across all your silos of instrumentation.

Third, and perhaps most important, isn’t it frustrating to spend so much time getting a new tool running only to discover it doesn’t do what you need? Allowing the “intrepid” sysadmin, as Johnvey calls it, or the creative developer to build on top of our IT Search engine means you can make Splunk do exactly what you want and share it with others if you so desire.

We’re not just jumping on the bandwagon here. Sure everyone seems to have a platform play. It feels like Web 3.0. Google has the mobile phone thing. Facebook, MySpace and Ning have social networking. Salesforce.com has AppExchange and force.com. For interesting reading on the phenomenon check out Marc Andreessen’s post from a few months ago on the topic.

Everyone here hopes to convince you that the thoughtfulness by which we’re going about this will yield much more than a bunch of hype. Ultimately the goal is to allow anyone to unleash their creativity to devise their own way to use Splunk.

Much more to come for sure. If you have thoughts or want to get involved — let us know anytime.

Doom and Gloom Everywhere But Here

The US economy is heading into a recession and technology spending is in for a steep decline in 2008. So every major prognosticator and news outlet from the Wall Street Journal to the Financial Times would have us believe.

Are these people watching the same movie I am? There are two problems I have with this economic hyperbole. Yes that’s what it is. I guess it sells newspapers and gets people to watch things like CNBC. But boy is it misleading.

First of all, in macroeconomics, a recession is a decline in a country’s gross domestic product (GDP), or negative real economic growth, for two or more successive quarters. Yet nobody that I’ve read is forecasting negative growth. They’re forecasting a potential slowdown in growth from the current 3.5% per quarter to 1.5 to 2.5% per quarter. But the news outlets feel compelled to use the “R” word just to get attention. Totally irresponsible.

On to my second gripe. With regard to technology and IT spending, I believe, based on what I see, we are in the beginning of a long-term gradual increase in IT spending within large enterprises that started eighteen to twenty-four months ago.

Sure the current credit crisis may have a short-term impact on budgets within Financial Services companies, but I don’t see any slowdown yet. The major consumer, commercial and investment banks we work with have so many critical, revenue generating IT projects in backlog I fail to see how spending is going to slow at all. The telecommunication sector is finally back on the mend after the early 2000s bubble and its hangover.

Social media, online shopping and the always on dimension of the Internet have online services and large Internet sites like MySpace and Amazon accelerating software, hardware and services spending just to keep up. And security, privacy and compliance initiatives and mandates have companies, service providers and government agencies increasing spending on these items by some 20% or more in 2008 to try and limit their exposure and risk.

Just a month ago the Financial Times had a great piece entitled “What’s on CIO wishlists?” Here’s a quick summary.

1. Business alignment and strategy
2. Hiring and retaining the best staff
3. IT innovation/new methodologies
4. Security
5. Collaboration technologies
6. Controlling costs
7. Compliance and regulation
8. Virtualisation
9. Customer service
10. Mobility (Green issues came 11th)

Doesn’t look like a slowdown to me.

Venture Diaries: Part Three

I’ve written previously about our experience this year raising a $25M Series C round of venture financing. Venture Diaries: Part One discusses why you want to think before you act and investigate who to target as potential investor partners. Venture Diaries: Part Two looks at how to perform your investigation. In this third part, I look at how to handle the horse race that inevitably develops once you get a few term sheets.

For me it all started when the first term sheet came in. Funny how some VCs still use fax machines. I had to go figure out where ours was. In the current seller’s environment (yes that’s what you are, a seller of equity in your company) one thing to keep in mind is your first term sheet will just be a starting point. Expect that it will probably be lower (perhaps significantly lower) than where you want to end up. Also expect once the first term sheet comes in things will really start to heat up. Nobody wants to miss out on a good investment and VCs are just egotistical enough to really help your cause. However, you should realize each VC has their own style. Some will try to move first in hopes of stealing the deal from others. Others will try to wait till the end and trump any offer — figuring the last hand in has the best chance.

This is where the entrepreneur’s job gets difficult. You want to put everyone on notice that you have a term sheet. This way things really get moving and you can quickly figure out who is really interested and who is just playing along. But what process should you use? How do you maintain your integrity when everyone is asking you for information?

The analogy of selling a home comes to mind. Some sellers will run a sealed bid process. “All offers are due on Tuesday by 5pm and the top offer wins.” This tends to work better in real estate because you already have an asking price. Buyers know what minimum price you expect. In addition, most markets have an established bid/ask ratio where homes get sold (unless you’re in a rapidly declining or accelerating market, which isn’t often the case).

When you’re selling equity in your company to venture capitalists the number one rule is don’t, under any circumstances, signal an asking price.

You will get hammered by investors wanting to know what your expectation is for your company’s valuation. There is one and only one correct way to answer this question every time. “We believe we’ve made significant progress since the last round, but the market will price the deal.” This way you signal you’re expecting a nice increase over the last round price but you don’t set a ceiling on this round’s price. Trust me they will all ask you over and over and over again, but don’t give in!

Back to process. Sealed bidding doesn’t work. So what does? I call it the Road Runner strategy. Remember how the Road Runner used to always lead Wile E. Coyote to the edge of the cliff and then watch him fall off?

This is what you need to do with each of your potential investors. To maximize your terms and perhaps most importantly figure out what it will be like to work with each of the potential VCs you have to push them to the edge of their comfort zone. While sometimes uncomfortable the process will show you what your potential new board member and investor is really like. Chances are the way they handle a competitive negotiation is the same way they’ll handle themselves in difficult board meetings.

Start out by telegraphing the fact that you have a term sheet to the other investors looking at your company. Be careful not to disclose any of the terms, but tell them it is a competitive offer. If the terms are clean, telegraph that as well. In my case I found it helpful at this point to set a deadline a week or two out by which everyone must wrap up their due diligence and get you a term sheet. It’s actually a good idea to have a soft deadline communicated in your first meeting with each investor. This way nobody is surprised when you reinforce the deadline. Your deadline will be soft, but make it seem firm without being pushy.

This is the point where you need to be in constant communication with each interested investor. Return phone calls and emails within an hour. Make sure everyone knows you are available to get them any information they need.

Chances are the VCs will really start selling you at this point. Remember all those tricks Wile E. Coyote had? Most of them were some type of Rube Goldberg device manufactured by Acme Corporation. Like the Coyote’s tricks, most of the VCs’ points about why they’re the best are somewhat fictitious and sometimes totally outlandish. But nonetheless they’ll try. You’ll hear all sorts of stories about why you should take a lower offer and how each investor needs to own a certain portion of your company in order to dedicate the time to sitting on your board. Listen attentively, thank them all and then remind them of the deadline and ask them to make their best offer.

Venture Diaries: Part Two

According to the National Venture Capital Association (NVCA), there are 798 venture capital firms managing more than $235B in the United States. These are long-term, professional investors who specialize in funding and building new, innovative companies.

So how do you figure out who to approach for funding? This is the area where I find entrepreneurs make the biggest mistakes. Most of us approach investors we know. Perhaps you have a friend who knows a VC or you have a friend who is a VC. How do you know if your friend or the person you get introduced to is the right investor for you? Most likely they’re not. Not all VCs are alike. Some are geared for early stage and some are not. Some are suited for late stage investments while others just say they are.

You can’t always trust what an investor says their appetite is either. I’ve pitched to investors who say, “yeah we do Series A” only to be barraged by questions like, “how many paying customers do you have that we can talk to?” On the other hand, I’ve presented to wannabe later stage investors that were only prepared to pay an early stage price.

You need to do your own research. Venture capitalists are, for the most part, creatures of habit. They don’t change investment philosophies much. Often within a firm it will take a generation before new blood arrives and can effect major change. In addition to the succession challenges, VCs are bound by the structure and economics of their business. Venture funds are seven to ten year financial vehicles. VCs raise the money for their funds based on an investment strategy which takes several years to play out.

I suggest doing your own primary research. Identify eight to ten prospects with a track record of backing entrepreneurs like you. Look for a history of focusing on your market, the stage your company is at and the type of involvement you want. Suspend your judgment during your data gathering. Just get the data and avoid acting surprised or judgmental. Get specific data on the number of projects and stages of investment each firm has completed recently.

When we raised a Series C round earlier this year, I identified eight firms to approach based on their past investment history. Specifically, I was looking for firms and partners that had done a majority of their investments in late stage, infrastructure software companies over the past eighteen months. I wanted to focus on VCs who demonstrated a track record of paying a fair price to invest in revenue generating companies that need capital to accelerate growth. I gathered data on how many investments each VC made, how many of the investments were later stage and how many later stage investments they actually led versus just participated in. My goal was to focus on investors with the highest percentage of later stage deals led as a function of total investments made.

Of the VCs I researched the percentage of Series C or later deals led ranged from 15% to 95% of the total deals invested in during the prior 18 month period. Surprisingly the firm with the 15% invested in far more deals and far more later stage deals than anyone else. But the participation in later stage deals was mostly follow on investments in their existing portfolio. This was not the type of later stage investor I was looking for to lead our financing.

There were two VCs that approached us and pitched themselves as later stage investors. But the data just didn’t support their claims. One had a 19% rating and the other a 17% rating. Despite showing great interest both of these investors dropped out of the financing process when we had several term sheets and commented, “the price is too high for us, we can’t dedicate our time to the project unless we can own more of the company.” At which point the leopard really showed his spots.

The core set of later stage VCs I focused on had ratings ranging from 50% to 95% indicating they had led a significant number of later stage investments in the past 18 months. Every one of these investors delivered us a term sheet at a competitive price.
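
The rating I keep quoting is simple arithmetic: later stage deals led, divided by total deals in the window. Here is a minimal sketch in Python of how you might tabulate it once you have gathered the counts. The firm names, the deal counts and the 50% cutoff are all made up for illustration and are not the figures from my own research.

```python
from dataclasses import dataclass


@dataclass
class FirmRecord:
    """Deal counts for one firm over the research window (hypothetical data)."""
    name: str
    total_deals: int        # every investment made in the window
    later_stage_led: int    # Series C or later deals the firm actually led


def led_rating(firm: FirmRecord) -> float:
    """Percentage of all deals that were later stage deals the firm led."""
    if firm.total_deals == 0:
        return 0.0
    return 100.0 * firm.later_stage_led / firm.total_deals


def shortlist(firms, threshold=50.0):
    """Keep firms at or above the threshold, highest rating first."""
    keep = [f for f in firms if led_rating(f) >= threshold]
    return sorted(keep, key=led_rating, reverse=True)


# Illustrative numbers only.
firms = [
    FirmRecord("Firm A", total_deals=40, later_stage_led=6),
    FirmRecord("Firm B", total_deals=20, later_stage_led=19),
    FirmRecord("Firm C", total_deals=12, later_stage_led=6),
]

for firm in shortlist(firms):
    print(f"{firm.name}: {led_rating(firm):.0f}%")
```

Note that a busy firm like the hypothetical Firm A can do a lot of deals and still rate low, which is the pattern I described above.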

How do you find this information? The brute force way is to visit a number of firms’ websites and go through their portfolios. This takes a while but can yield the information you’re looking for if you put in the time. It is certainly a lot less time consuming (and less humiliating) than pitching investors that will never invest in a company with your profile. There are also a variety of venture capital databases that can make your research much faster and easier. If you have a friend that’s a VC they likely have access to one or more of these sources. If the answers about a particular firm are vague, drill down and get the real story. If you can’t figure it out, move on. You’ve got 798 firms to choose from.

Interop NYC 2007

Last week I was in NYC for Interop 2007. Interop in NY is a significantly smaller conference than the big brother Interop in Vegas. I’d say there were 7,500 to 8,000 people at Interop NYC this year, compared to 18,500 in Vegas back in May. Somehow though I always find the New York show more interesting. Perhaps it’s the lack of constant firefighting in the NOC that gives us all more time to have meaningful conversations about the latest networking technologies. Plus somehow New York just seems to have more substance than Vegas. Call me crazy but…

This was also the first Interop where we had a chance to apply the magic of Splunk 3.0. We had a record number of searches in the NOC (despite the smaller show). I’m not surprised. 3.0 is so cool the way it automatically extracts fields out of data streams from all kinds of networking gear.

Now there are lots of people who know more about networking and security than I do, but here’s a simple investigation I did with Splunk.

1. I started with a simple search for “failed password.” This picks up firewall and router hacking attempts (typically ssh) sent to Splunk using syslog forwarding.

2. I was then able to quickly see the top “source IP”. Because the source IP field automatically gets extracted with each search I’m able to quickly click and see the list of top source IPs for the time frame in question. A single click and I’ve added the top offender to my search parameters.

3. Just a click away and I can geolocate this IP. With field actions in Splunk I can now drive workflow items right from the search results. Here I just need to click on the menu next to any IP address and I can geolocate the address with any number of free web based services. It was interesting to watch the hackers and bots travel around the world, and with more time it would have been fun to write a little Flash application to call the Splunk API and map things in real-time.

4. Reporting on top source_IPs every hour was easy. Like any IT guy without a bunch of time, I went for the low road. I just clicked report on all source_IPs from the field action menu and I got a nice looking flash report. It was really easy to save the report and run it on a schedule every hour. Now anyone on the NOC team alert list can get it right in their email or log into Splunk and check out the dashboard with a few other useful security searches.

You can split the same report series by user and see how many of these hacker bots try the default usernames and passwords that ship with common software packages and open source configurations.
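
If you want to see the idea behind the investigation without Splunk in front of you, here is a rough sketch in Python. It is not how Splunk extracts fields and it does not call any Splunk API. It just filters for “failed password” events, pulls out the user and source IP with a regular expression, and counts them. The sample lines are invented, loosely modeled on OpenSSH syslog messages, and use documentation-only IP ranges.

```python
import re
from collections import Counter

# Assumed message shape: "Failed password for [invalid user ]NAME from A.B.C.D ..."
FAILED_LOGIN = re.compile(
    r"failed password for (?:invalid user )?(\S+) from (\d{1,3}(?:\.\d{1,3}){3})",
    re.IGNORECASE,
)


def count_failed_logins(lines):
    """Return (source IP counts, username counts) for failed password events."""
    by_ip = Counter()
    by_user = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match is None:
            continue  # not a failed password event, or an unexpected format
        user, source_ip = match.groups()
        by_ip[source_ip] += 1
        by_user[user] += 1
    return by_ip, by_user


# Invented sample events for illustration.
sample = [
    "Oct 24 10:01:02 gw1 sshd[411]: Failed password for root from 203.0.113.5 port 4022 ssh2",
    "Oct 24 10:01:05 gw1 sshd[411]: Failed password for invalid user admin from 203.0.113.5 port 4023 ssh2",
    "Oct 24 10:02:11 gw1 sshd[412]: Accepted password for noc from 198.51.100.7 port 5100 ssh2",
    "Oct 24 10:03:40 gw2 sshd[98]: Failed password for invalid user test from 192.0.2.44 port 3301 ssh2",
]

by_ip, by_user = count_failed_logins(sample)
print("Top source IPs:", by_ip.most_common(3))
print("Top usernames:", by_user.most_common(3))
```

The second counter is the “split by user” view: default account names like root and admin tend to float to the top.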

If you want to check it out yourself, send me mail and I’ll let you know where you can access the server. It’s kinda fun to search on your own machine name and see all the times you were on the network at the show. You can drill down into each DHCP transaction and see all the events.

Blowing Things Up

I’m not sure if it’s the start of a new quarter, the full moon or my two seven year old boys that have me thinking about this, but we seem to be blowing a lot of things up lately. A few examples…

1. We blew up our product development process
2. We blew up lots of our software
3. We blew up our business planning process

When I say we “blew ________ up” (enter your own thing here) I mean we decided to take another course of action, look in the other direction, put other people in charge or just plain start over from scratch. Combustibles are exciting for lots of reasons (especially to second graders) but as a new type of business tool?

I’ve written in previous posts about our move to an Agile product development process. This required us to literally discharge our old way of taking input from customers, scoping features, planning releases and testing. Of course it also meant we had to ignite our underlying workflow and tools supporting product development. It all made me a tad nervous : { For more than a month I couldn’t tell you what would appear in our next release or when the release might be available for download. If you use Splunk, you know that we live and die by our product road map and release schedule. During that month our engineering, QA and product management teams went through a metamorphosis. They moved from being top down, planning driven to bottom up, innovation driven. We had reached the point where we couldn’t plan or prioritize features. The old process of having a team set out a plan and working towards a release wasn’t working anymore. So we blew it up. Now we have a process whereby parallel scrum teams work on various facets of the product and they do the planning, constantly. It’s interesting how nobody, and yet everybody, is in charge. The initial results are just in. Splunk 3.1 will soon be available for download, a mere eight weeks after Splunk 3.0 was posted. And Splunk 3.2 will be released in beta eight weeks from now. That may not sound like much but when you look at the amount of innovation in each release, the speed with which we’re moving enhancement requests from the field into features and the improved quality of each release it appears remarkable from where I stand.

Detonating software is always dangerous. Will it ever come back together again? Were we right about the surface area becoming too large or the architecture verging on too complex? Stay tuned. We’re in the process of blowing up a lot of our software. For example, we’ve realized our past approach to administration just doesn’t scale. Early on we built a nice UI for editing lots of the configuration properties of a Splunk server. But over time our ability to quickly add features outstripped the surface area of the UI. So we’ve been making configuration parameters available in editable configuration files. Now that is all fine and good but it’s not very discoverable and it’s completely out of context with the task at hand when you’re using the product. Definitely a candidate for explosives. Sometime in the near future you’ll see the administrative side of Splunk blasted for a much more scalable, discoverable and in context design we call “search based administration.” This is one small example of how we’re constantly blowing up our software.

Recently we’ve also been lighting the fuse on our business planning process. It used to be we’d have a few days at the beginning of our quarter when each department in the company (sales, marketing, engineering, customer support, etc.) would get together and have their own planning process. As we’ve doubled in size since the beginning of the year our old way of planning wasn’t working. Despite our completely open work environment (we have no cubicles or offices) communication across groups had slowed to the point where it was causing a lack of effective planning. You guessed it. We blasted it. Started over. Asked everyone what would make for a better planning process. This quarter we started with a full day of conversations. Everyone was invited to run a one hour discussion forum on any topic they wanted. The only rule was you had to publish it a week ahead of time and provide a brief description of the topic on our internal wiki. We had 15 discussion forums run by people all over the company. That was it. Our Q4 planning. A bunch of conversations. We’ll see how far it gets us ; )

BTW, I heard someone at Splunk say in response to blowing things up,

“perhaps companies that don’t blow things up often enough end up blowing up themselves.”

Certainly food for thought. I’m keeping my dynamite close by.

SplunkBase Gets a Big Face Lift

Maybe you noticed, maybe you didn’t but SplunkBase got a big face lift last week. We have a really amazing team of people who have been taking all your input and revitalizing our community IT knowledge base over the last several months. Our goal is to keep plugging away and innovate different ways to enable the sharing of IT knowledge and cool ways to use Splunk. We’re also now eating our own dog food. Splunk support is now using SplunkBase to support our own products and services.

So what’s new?

  • Answers - a large and growing set of answers about Splunk, IT events and different types of technologies that generate a lot of IT data.
  • How-To’s - more in-depth recipes for everything from configuring syslog-ng to understanding Ruby on Rails logging
  • Events - a library of contributed event types including punctuation patterns, tags, descriptions and more
  • Add-Ons - the beginning of an architecture for sharing all kinds of Splunk goodies. You’ll find downloadable event types, searches, reports, custom data input scripts and configurations. My personal favorite is the OS Monitoring Add-on.

I say add-ons “architecture” because over time we’ll be extending the whole add-ons facility to make it really easy to create and share Splunk configurations and functionality.

I can’t thank enough the incredibly talented team driving SplunkBase forward including Patrick McGovern, Gareth Watts, Dee-Ann LeBlanc, Micah Delfino and Jef Bekes.

Stay tuned for even more SplunkBase goodness to come.

Chaos & Insanity

Last week Splunk sponsored ComputerWorld’s Infrastructure World conference along with HP and IBM. I needed to come up with a talk and I wanted to do something new.

I’ve been thinking about how to describe the challenges we have managing all this changing technology and innovation. Note this is seriously a work in progress. I’m developing a theory that there are three fundamental drivers to data center chaos.

  • expectations,
  • complexity and
  • accountability

Any new business or consumer technology can be quickly met with significant expectations if it becomes successful. Our dependence on everything from wireless email and online travel reservation systems to hosted software as a service dramatically increases the expectation that these technologies will always be available, fast and do everything we want. Examples of failed expectations are everywhere. On June 20th United Airlines canceled 24 flights and delayed another 286 flights due to a “computer gremlin.” Research in Motion recently experienced yet another 24 hour email outage and more than 2.5M users were without service in North America. Salesforce.com, a pioneer of Software as a Service (SaaS) as a more reliable alternative to running it yourself, continues to have outages as well.

Rising expectations, success and dependency force increased complexity in both scope and scale to meet demand. Scope complexity abounds as more and more features and capabilities are added to the services we depend on. I used an example of Citigroup’s internal SOA architecture that has five federated ESBs — one of every technology flavor. Scale complexity occurs as infrastructures grow so large they begin to stress under their own weight. Salesforce.com for example is now processing more than 90M transactions a day through their web interface and AppExchange platform. At a meager 10 messages per transaction that’s almost a billion messages a day going through the infrastructure. Wow. Imagine finding a needle in that haystack.
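
To put that haystack in perspective, here is the back-of-the-envelope math as a tiny Python sketch. The 90M transactions and 10 messages per transaction come from the paragraph above. The per-second figure is my own added assumption that traffic is spread evenly across 24 hours, which real traffic never is.

```python
# Figures from the text: transactions per day and a rough messages-per-transaction guess.
TRANSACTIONS_PER_DAY = 90_000_000
MESSAGES_PER_TRANSACTION = 10

# Assumption for illustration: load is spread evenly over a full day.
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = TRANSACTIONS_PER_DAY * MESSAGES_PER_TRANSACTION
messages_per_second = messages_per_day / SECONDS_PER_DAY

print(f"{messages_per_day:,} messages per day")
print(f"about {messages_per_second:,.0f} messages per second on average")
```

That works out to 900 million messages a day, which is where “almost a billion” comes from.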

Finally, once popularity rises and the technology becomes established, accountability arrives. Now we have to worry about how safe the technology is and in many cases monitor what people are doing with it. Everyone by now knows of the TJX situation where 45.7M credit and debit card numbers were stolen by hackers that somehow infiltrated its processing systems. The first card numbers were stolen three years ago and still there is no definitive explanation. Everything from cracked WEP keys and software tampered kiosks to an inside job has been offered as a possible cause. More recently TD Ameritrade and Monster.com have experienced similar breaches of user and account information totaling into the millions. And compliance is everywhere. SOX, PCI, ITIL, HIPAA, FFIEC, FISMA, ISO, CoBIT, COSO and other mandates mean IT staff have reduced access and visibility into the systems they’re trying to manage and keep running.

expectations + complexity + accountability = chaos

I’m interested in your thoughts on the direction this is taking. I’ll be sure to blog more later as the ideas develop.