thebaumblog: Security

Cisco CSIRT Presents at SplunkLive Raleigh

Last Thursday Dave Schwartzburg and a few other Cisco security mavens attended SplunkLive Raleigh. The Cisco Computer Security Incident Response Team (CSIRT) has been applying Splunk to corporate security investigations for more than two years now and Dave was generous enough to share their experiences with us all. Joining Cisco in presenting at the event was James Ervin of the University of North Carolina at Chapel Hill, a very knowledgeable Splunk customer. Patrick Ogden, Splunk Sales Engineer, gave a rocking good demo of transaction tracing in a telco provisioning environment and Will Hayes, Splunk Sr. Solution Architect, showed the latest Splunk for Cisco Security App being developed together with the Cisco CSIRT team.

Cisco CSIRT Team

Dave Schwartzburg

Dave Schwartzburg is an Information Security Investigator and runs the IDS infrastructure for Cisco Corporate and their internal networks and IT assets. He has an M.S. in Information Security from East Carolina University and a B.S. from the University of Wisconsin. Dave’s been with the Cisco CSIRT team for two years and prior to that was with AT&T Internet Investigations & Security Services. Cisco has more than 100,000 employees and contractors and more than 127,000 devices on their corporate network. That’s a lot to keep track of, which is why the CSIRT team utilizes Splunk.

The Cisco CSIRT works to reduce the risk of loss as a result of security incidents for Cisco-owned businesses. CSIRT regularly engages in proactive threat assessment, mitigation planning, incident trending with analysis, security architecture, incident detection and response. This happens in three phases: investigation, mitigation and prevention.

A Tier 1 Event Analysis Group is located in Costa Rica. They handle security threat monitoring. The Tier 2 Event Analysis Group in Bangalore handles the easier case investigations and mitigations. Dave is part of the Tier 3 Global Incident Response Team handling more difficult cases and longer term prevention through changes to the infrastructure and security systems.

Cisco Security Environment

Cisco regularly collects web proxy (Ironport WSA), anti-virus (Ironport ESA), host-based intrusion protection (Cisco Security Agent), syslog, VPN logs, authentication messages, network IDS signatures and Netflow records from critical subnets.

  • 3 million IDS events per day
  • 3-5 billion Netflow records per day
  • 300 malware-related cases a day

Some event sources send their data to a global network of collection servers and some event types are pulled from their sources directly to a centralized server. Splunk handles the collection and indexing of the data.

Correlation and Reporting with Splunk

The CSIRT team makes extensive use of scheduled reporting and alerting for proactive monitoring of problems.

In this example, the team is correlating host-based IDS with antivirus logs and running malware reports via cron, using the Splunk CLI. The report runs on a schedule and the results are emailed to the EA teams for processing and submission for remediation.
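To make the idea of correlating two log sources concrete, here is a minimal sketch in Python. It is not the CSIRT team's actual search or report. The record layout, host names and field names are all hypothetical, and real CSA and antivirus logs look quite different. It simply joins two event lists on the host name and keeps only the hosts that show up in both.

```python
from collections import defaultdict

# Hypothetical, heavily simplified records (assumed field names).
hids_events = [
    {"host": "laptop-01", "signature": "suspicious process launch"},
    {"host": "laptop-02", "signature": "registry change"},
    {"host": "laptop-01", "signature": "outbound connection blocked"},
]
av_events = [
    {"host": "laptop-01", "malware": "Koobface"},
    {"host": "laptop-03", "malware": "example-trojan"},
]


def correlate_by_host(hids, av):
    """Return hosts seen in both sources, with the evidence from each."""
    av_by_host = defaultdict(list)
    for event in av:
        av_by_host[event["host"]].append(event["malware"])

    report = {}
    for event in hids:
        host = event["host"]
        if host in av_by_host:
            entry = report.setdefault(
                host, {"signatures": [], "malware": av_by_host[host]}
            )
            entry["signatures"].append(event["signature"])
    return report


report = correlate_by_host(hids_events, av_events)
for host, evidence in sorted(report.items()):
    print(host, evidence["signatures"], evidence["malware"])
```

A script like this could be run from cron and its output mailed out, which is roughly the shape of the workflow described above, though the real reports are produced by Splunk searches rather than hand-written joins.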

“Red Carpet Reports” monitor executive systems to make sure they aren’t infected or compromised. Here we see an example of the Koobface worm found in CSA logs on an executive laptop.

Finally, the team has a way to make use of all the CSA data they receive. One of the most useful applications has been to pinpoint people disabling Cisco Security Agent itself, indicating the machine is now unmanaged.

Results for the Security Team

The resulting productivity from centralized access to multiple data sources has been dramatic. Not only is the team lowering the time to respond to incidents, but they are also allowing lower skilled workers to handle more complex cases. And surprisingly, 10% of cases are now from previously unused/underutilized sources. The value of substantially faster access to important data and correlation across numerous sources for reporting and ad-hoc investigations is incredible.

Splunk for Cisco Security App


University of North Carolina Chapel Hill

James Ervin

James has been doing system administration, network and security monitoring and application development with UNC since 1998, when he completed his MS in Computer Science at NC State University. As part of the Information Technology Services (ITS) team at UNC his projects have included work on the university’s original Active Directory deployment, Unix-based webmail systems and security and information event monitoring. Earlier this year he inherited a centralized logging project for the university. UNC was the nation’s first state university, serving North Carolina for more than two centuries with 29,000 students and 4,000+ faculty members. ITS is the largest IT organization on campus (~500 employees) looking after financials, admissions, centralized learning and centralized email. ITS frequently collaborates with other campus IT organizations, of which there are many.

ITS Environment

The ITS team manages a moderate size mixed application, server and networking environment consisting of the following major components.

  • Multiple Unix flavors (AIX, RHEL, Solaris)
  • Large Windows infrastructure
  • ~600 devices total
  • ~20 IPS/IDS/FW/LB devices
  • PDU, environment probe data
  • Apache, Tomcat, JBoss

This environment is constantly in flux as students and faculty come and go and non-managed desktops, laptops and mobile devices connect to the network.

“We needed to determine what is possible within our environment and adopt a flexible architecture.”
- James Ervin

Earlier this year, James and his team were facing an ever-growing list of requirements for their centralized log management project including:

  • Make syslog services more useful to the rest of the IT organizations
  • Collect and centralize Windows event logs
  • Alert on events of interest
  • Correlate security events
  • Provide NOC/SOC staff access to security logs
  • Give application developers access to application logs
  • Report on unplanned system changes
  • Satisfy the auditors

Splunk Live Washington DC 2009

Obama-nomics is highly visible in our nation’s capital these days. The DC economy is humming as our tax dollars are hard at work fueling all kinds of government spending. With more than 100 attendees at Splunk Live on Thursday we certainly were not disappointed in our quest to help make all this growth in government more efficient! Managing large networks and security forensics were the hot topics of conversation at Splunk Live Washington, DC where everyone was treated to a trio of incredible speakers.

Our first speaker was Andy Purdy, the Co-Director, International Cyber Center, George Mason University and the Former Acting Director, National Cyber Security Division (NCSD) and US-CERT Department of Homeland Security. Andy was a member of the White House staff team that drafted the U.S. National Strategy to Secure Cyberspace (2003) and served on the DHS tiger team that formed the National Cyber Security Division (NCSD). He spent 3 1/2 years at DHS, the last two heading the NCSD and US-CERT as the “Cyber Czar” of the U.S. Andy is also a Special Government Employee on the Defense Science Board Task Force on Mission Impact of Foreign Influence on DoD Software. He is also a partner with the law firm of Allenbaugh Samini Gosheh, LLP.

The Constantly Changing Threat Landscape

Andy talked with us about the changing threat landscape and lessons learned from past approaches to cyber security that can be applied in a forward looking approach to Risk Management and Compliance.

Since much of his experience has been spent preparing the country for what cyber threats are coming next, Andy thinks of IT security as a war fought in a constantly morphing theater with new technologies and vulnerabilities and new motivations and threats.

A Different Approach Moving Forward

For anyone serious about security this is a sound perspective whether you are a government agency, a major enterprise or a small business. But the balance between open networks and services and robust security remains one of the major challenges for IT organizations. Andy pointed us to lessons learned from his past, fueling a vibrant conversation during the customer and speaker roundtable. Perhaps the most important thing I heard was that it’s not enough to prepare for the last war, or the last successful attack. While perimeter defense and legacy standards for network security provide some measure of security, those measures are very often insufficient to deal with the new threats that seem to be gaining in sophistication at an accelerating pace. Andy encouraged us to focus on adopting new requirements and security infrastructure for situational awareness and control.

Greater sophistication, slower, lower-level attacks, greater knowledge about the targets (data, activity, vulnerabilities) are all contributing to the need for near-time visibility on a large-scale. This has become far more important than sub-second correlation of known attack vectors against discrete sets of network devices.

“NIST perspective: Continuing serious cyber attacks on federal information systems, large and small; targeting key federal operations and assets. Attacks are organized, disciplined, aggressive, and well resourced; many are extremely sophisticated. Adversaries are nation states, terrorist groups, criminals, hackers, and individuals or groups with intentions of compromising federal information systems.”

Andy went on to discuss how the effective deployment of malicious software causing significant exfiltration of sensitive information (including intellectual property) and potential for disruption of critical information systems/services has made detection of information and data leakage a key government and enterprise security requirement.

Bob Flores, former CTO and 31-year veteran of the CIA, was our next speaker. Bob retired from the CIA six months ago and is now President and CEO of Applicology, providing cyber security and IT strategy consulting services. In his 31 years at the CIA, he held various positions in the Directorate of Intelligence, Directorate of Support, and the National Clandestine Service. Most recently he was the CIA’s CTO where he was responsible for ensuring that the Agency’s technology investments matched the needs of its many missions. Bob has Bachelor and Master of Science degrees in Statistics from Virginia Tech.

Quis custodiet ipsos custodes?

Brush up on your Latin! “Who’s guarding the guards” was the topic of Bob’s talk. Insider threat in an ever-changing threat landscape was and remains our number one cyber security risk.

“Defense-in-depth isn’t just about putting adequate technology in place, it’s also about paying attention to your people and implementing policies and procedures to reduce the likelihood of an insider attack.”
- Dawn Cappelli, CERT

The simple but not so obvious model Bob pursued at the CIA was an extension of the seven-layer ISO/OSI stack to include the non-technical but motivational additions.


We need to worry about all levels of the stack including layers eight and nine because we all have people messing around at various layers with applications, scripts, communications etc. And their motivation is often very clear.

“Nemo repente fuit turpissimus! Or, no one ever became thoroughly bad in one step!”

The point is people don’t just wake up one day and decide to be bad. They are motivated over time by larger causes and in EVERY CASE leave a trail of clues behind that can’t entirely be covered up.

What to Do?

According to Mr. Flores the focus needs to be on real-time visibility. You need visibility into who (or what) is perturbing your enterprise right now and over time. You can tediously review the logs of each device and user as the CIA used to do or you can take advantage of Splunk.

“Splunk may not be the best thing since sliced bread, but it’s pretty darn close.”
- Bob Flores

Why Splunk?

Why did the CIA choose Splunk over so many other security forensic solutions? It all comes down to how easily and scalably Splunk can eat any logs, events and messages Bob’s organization throws at it. Combine that with the real-time search, alert and reporting and over time statistics and analysis on

Splunk Live New York 2009

This week we’re on the East Coast enjoying some fantastic customer presentations and roundtables at Splunk Live events in New York City, Princeton NJ and Washington DC. It’s Tuesday and we have more than 100 customers and Splunk users attending Splunk Live in midtown Manhattan. The vibe is electric as we’re being treated to awesome talks by IDT and New York Life. At lunch, long-term customers Bloomberg and AT&T joined the customer roundtable conversation.

Gabe Arnett, Senior Software Architect at Moody’s, demonstrated how Splunk is being used to monitor and troubleshoot the Moody’s Analytics platform. Gabe has more than 15 years of experience building web applications in financial services, investment banking and e-Commerce. At Moody’s he’s responsible for the global development team that develops and supports the newly re-designed client facing website – v3.moodys.com. Moody’s is a leading provider of research, data, analytic tools and related services to debt capital markets and credit risk management professionals. The company’s products and services provide the means to assess and manage the credit risk of individual exposures as well as portfolios; price and value holdings of debt instruments; analyze macroeconomic trends; and enhance customers’ risk management skills and practices.

Moody’s Splunk environment is utilized by 25 different users and runs on Windows 2003. Splunk provides Gabe’s developers secure access to the logs they need without touching the production devices, servers and applications. His team has built custom searches and a number of dashboards indicating the general health of their applications and service. Custom searches and alerts track errors and access – guaranteeing a good user experience. The team also uses Splunk to understand when and where new content isn’t flowing to the v3 platform. A large part of the Moody’s user experience is delivering email alerts and Splunk helps the team track GUIDs to ensure customers receive the alerts they’ve subscribed to.

The team recently migrated from Splunk 3 to Splunk 4 – taking 30 minutes to perform the upgrade. The Splunk for Windows App has been significantly revamped in Splunk 4 and the Moody’s team is making use of it to monitor through WMI local server resources (disk, memory, networking) and correlate this performance data with the Windows and Application event logs.

Shay Benjamin, CSO and SVP, Architecture at IDT, designs and implements network architectures and manages compliance, security and fraud initiatives. IDT Corporation (www.idt.net) is a holding company focused on the telecommunications and energy industries. Since 1995 they’ve been building hundreds of VOIP switches globally and assembling an international fiber optic network. IDT pioneered VOIP (Voice over Internet Protocol) to create Net2Phone, piloted the first commercial WiFi phone service in the US and has created a prepaid calling card business, which sells 12 million calling cards a month.

IDT uses Splunk primarily for VOIP Call Detail Records (CDRs). The company indexes more than 120 million CDRs per day with six mirrored Splunk server instances. CDRs are somewhat like logs, but with many fixed delimited fields. One or more CDRs are created at each switching or routing point for every VOIP call. CDRs vary between platform devices in number of fields and contents and, unlike logs, few CDR fields contain easy-to-read key=value pairs. Although CDRs are a key piece of maintaining service quality, billing, monitoring network quality and security forensics, working with them is labor intensive and any delay wastes labor, time and money.

IDT needs fast searches across all fields of the CDRs and quick data loading – to allow fast retrieval of call data and cross platform searches to unify results from different CDR formats. Historically IDT utilized a custom RDBMS solution with an application called Call Genius. In their RDBMS IDT was forced to limit the fields that get indexed because indexing of CDRs with an RDBMS is costly: it takes up a lot of space and slows load times. The RDBMS also only indexed fields common to multiple platforms’ CDRs. In the RDBMS solution much of the CDR data was put into BLOBs (actually CLOBs) – multiple CDR fields mapped into a single RDBMS field to try and achieve efficiency. But BLOBs can be very difficult to search and are difficult to index effectively. The legacy Call Genius application didn’t permit the search of CDR BLOBs.

Now IDT utilizes Splunk to index all CDR fields. There is no need to decide which fields to index, and cross platform searches are easy without losing platform-specific CDR format resolution. There is no longer a need to create BLOBs for efficiency. Engineers and support staff are able to quickly search for any combination of:

  • Phone Number
  • IP address
  • Trunk Group Name

Splunk naturally and easily links search terms across fields and the users just need to enter the phone number or IP and get back the CDR events and transactions.
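For readers who have never worked with CDRs, the sketch below shows the general idea of turning delimited records from different platforms into named fields and then searching across every field for a value. It is only an illustration in Python. The two field layouts, the `|` delimiter, the platform names and the sample numbers are all made up for this example, and it says nothing about how Splunk or IDT's systems do this internally.

```python
# Hypothetical field layouts for two platforms (assumed for illustration);
# real CDR formats differ per vendor and device.
LAYOUTS = {
    "switch_a": ["start_time", "caller", "callee", "duration_s", "trunk_group"],
    "switch_b": ["caller", "callee", "src_ip", "duration_s"],
}


def parse_cdr(platform, line, delimiter="|"):
    """Turn one delimited CDR line into a dict keyed by field name."""
    names = LAYOUTS[platform]
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} fields, got {len(values)}")
    record = dict(zip(names, values))
    record["platform"] = platform  # keep track of which format this came from
    return record


def search(records, term):
    """Return every record in which any field value equals the term."""
    return [r for r in records if term in r.values()]


records = [
    parse_cdr("switch_a", "2009-10-06T14:02:11|12125550100|442075550199|184|TG-NYC-01"),
    parse_cdr("switch_b", "12125550100|525555550123|192.0.2.10|62"),
]

# One phone number finds the call legs recorded by both platforms.
matches = search(records, "12125550100")
for record in matches:
    print(record["platform"], record)
```

The point of the example is the last step: once every field is searchable, a user only needs the phone number, IP address or trunk group name, not knowledge of which column it lives in on which platform.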

Comparing Splunk to the RDBMS solution, IDT found Splunk searches to be 50 to 100x faster than RDBMS searches on non-indexed data. Searches on indexed fields are also faster in Splunk than in the previous RDBMS solution. Splunk load times for a typical sample average 1 to 5 minutes versus 20 to 40 minutes for the RDBMS.

IDT is in the process of feeding firewall, security, router, IP network, and switch data into Splunk as well. They’re already discovering that Splunk is finding errors not captured by Network Management Consoles, and it has provided valuable troubleshooting during recent datacenter migrations.

Most of all IDT is looking forward to discovering new ways to use all the data in Splunk. Heuristic analysis and Business intelligence applications are on the top of their list including the use of Splunk to find human “Family and Friends” networks and drive the development of new commercial programs.

Splunk 4 Lands in the Southwest

Last week we continued our road show launching Splunk 4 through the Southwestern US in Phoenix, San Diego and Los Angeles. This was our second annual gathering of customers, partners and users and we had more than double the attendees at this year’s Splunk Live events. In the morning we held a three-hour hands-on technical workshop. Attendees had the opportunity to install and configure Splunk 4 on their laptops or remote server and get one-on-one assistance from the Splunk team. Afternoon sessions and dinner focused on customer presentations. We’re very grateful to all the presenters who took time out of their busy days to share with everyone how Splunk is transforming their IT environments. I captured some notes from the week and thought I’d share them with you.

Early Warning

In Phoenix we had a packed house at the Sanctuary conference center on the side of Camelback Mountain. At 109 degrees I decided against hiking up it in the early AM. Dave Bridgeman, Data Security Engineer at Early Warning, kept things cool showing the audience how his company uses Splunk in their security operations center. Early Warning collaborates with major financial services companies to facilitate fraud detection through shared information and knowledge in cross-institution environments. The company has an interesting history, having spun out of First Data, and is now primarily owned by Bank of America, BB&T, JPMorgan Chase and Wells Fargo.

Dave is a well-rounded IT professional who started as a developer then moved into network and security management. He currently leads the data security team for Early Warning. The environment he oversees includes a variety of platforms including AS400s, MP300s, AIX, Solaris, Linux and Windows. He uses a combination of Splunk forwarders and syslog forwarders to collect Java and Cobol application logs and FTP/SFTP networking logs.

The Early Warning Splunk installation is designed to track transactions and users from one bank to the next in cross-institution activities. Transaction ID tracing correlates events across applications and services and Splunk alerts the team when jobs fail so the operations and development teams can securely troubleshoot issues on the fly. And remote accessibility means no more driving into the office to access locked down servers in the middle of the night. On the security side of things Splunk helps Dave’s team track and monitor known fraudsters and bad user names, allowing them to stay vigilant when monitoring external attacks. They also use Splunk to deliver reports for customers, executive committee members and the Security Advisory Committee (with representatives from the founding banks).
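Transaction ID tracing is easier to picture with a small example. The Python sketch below is only an illustration under made-up assumptions: the event fields (`ts`, `source`, `txn_id`, `msg`), the source names and the convention that a failure message contains the word "failed" are all hypothetical, not Early Warning's data or Splunk's implementation. It groups events from several sources by a shared transaction ID, orders each group by time, and flags transactions that contain a failure.

```python
from collections import defaultdict

# Hypothetical events from three different log sources (assumed fields).
events = [
    {"ts": 3, "source": "ftp", "txn_id": "T-1001", "msg": "file transferred"},
    {"ts": 1, "source": "java_app", "txn_id": "T-1001", "msg": "request received"},
    {"ts": 2, "source": "cobol_batch", "txn_id": "T-1002", "msg": "job failed"},
    {"ts": 2, "source": "cobol_batch", "txn_id": "T-1001", "msg": "job completed"},
]


def trace_transactions(all_events):
    """Group events by transaction ID and sort each group by timestamp."""
    grouped = defaultdict(list)
    for event in all_events:
        grouped[event["txn_id"]].append(event)
    return {
        txn: sorted(group, key=lambda e: e["ts"])
        for txn, group in grouped.items()
    }


def failed_transactions(traces):
    """Return the IDs of transactions with at least one failure message."""
    return sorted(
        txn
        for txn, group in traces.items()
        if any("failed" in e["msg"] for e in group)
    )


traces = trace_transactions(events)
for txn, group in sorted(traces.items()):
    print(txn, " -> ".join(f"{e['source']}: {e['msg']}" for e in group))
print("needs attention:", failed_transactions(traces))
```

The useful property is that one ID pulls together the whole path of a transaction across applications, so a failed job can be investigated in context rather than one log file at a time.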

Amkor

Henry Grant of Amkor, a $2.1B provider of packaging/assembly and testing services for the semiconductor industry, also presented an overview of how his Corporate Data Center team uses Splunk. Henry oversees operations for the company’s SAP, PLM, Supply Chain, Hyperion and Oracle systems. Amkor has a heterogeneous environment of Sun Solaris, IBM iSeries, Cisco ASA firewalls, packaged and custom web and J2EE applications and TACACS/RADIUS accounting and access control technologies. With manufacturing locations in China, Japan, Korea, Taiwan, Singapore and the Philippines and headquarters in Chandler, AZ, the Amkor team is challenged with log and event data overload. GBs of data a day generated at multiple points make operational troubleshooting and security investigations extremely complex.

SOX Compliance

Proving SOX compliance has traditionally been handled by writing and maintaining scripts to collect and report on errors, access controls and log access activities. It was impossible to segregate duties given the lack of access control to the logs and events themselves. Splunk has taken the place of the awkward script writing and maintenance to collect iSeries, Unix and application events and logs and provide automated scheduled reports. The team is now expanding the Splunk footprint to handle network and Oracle logs as well.

Application and System Monitoring

Like most enterprise IT shops, Amkor has figured out that traditional point monitoring tools aren’t enough as they have a hard time scaling to all the modern day technologies, require intrusive agents and only work for known events but don’t handle anomalies and unknowns. Too many issues end up being reported by end users themselves rather than the monitoring systems. With Splunk Henry’s team detects event anomalies in real time and has dramatically cut their response time by hours per incident.

Tools for the Help Desk

Sometimes it’s the simple things that can cut your response time, escalations and IT budget. The Amkor team noticed a lot of calls and emails regarding VPN set-up and access across the company. With Splunk level 1 help desk agents are now able to resolve most of the VPN issues without creating an escalation. Henry’s team built a VPN dashboard driven by a series of searches and reports that gives entry level help desk personnel the insight they need to troubleshoot problems right away.

Henry’s Splunk Tips

The best part of Henry’s overview was the tips for a successful Splunk implementation. I’ve included the list here in hopes that these may help you as well.

  • Provide training that caters to each group’s need.
  • Utilize the Deployment Server.
  • Develop a Common Information Model.
  • Update and change as needed.
  • Use Tagging to Normalize Data.
  • Monitor Scheduled Compliance Reports by using the Audit Logs.
  • Splunk into your processes where possible.
  • Set up a Test/Dev Environment and a Test/Dev Index.

Intuit Consumer Group

The Intuit team of Jeff Ludwig, Chief Architect, and Larry Raab, Architect of the Consumer Group, joined us to share how they use Splunk in production support operations. Jeff leads the Consumer Group’s Connected Services Development for electronic and print tax and payroll filings for TurboTax, ProSeries, Lacerte and QuickBooks. Larry is a large-scale, highly available application and systems architect responsible for the consumer group applications and infrastructure.

While the original use for Splunk at Intuit was application management, Jeff and Larry covered three additional ways they have applied Splunk including reliable monitoring, improving user experience and large-scale reporting for compliance and business intelligence.

Splunk Live London - Awesome

I’m finally getting my head above water after a tireless run up to and hectic week launching Splunk 4. The highlight of the launch for me was Splunk Live London. IMHO Splunk Live London 2009 was unrivaled as the most outstanding Splunk event yet.

We came up with this idea of getting local customers together as a way to launch Splunk 2 in June 2007. Five of us Splunkers sprinted between eight different cities in two weeks to share what was new and encourage users to exchange stories of how searching their data centers was changing life for the better. It’s an exhausting way to launch a new product, but it worked so well we’ve integrated Splunk Live events into the mainstream way we do business and interact with our community. I’ve long since lost count of the number of Splunk Lives we’ve conducted all over the world including places like Cape Town, Johannesburg, Beijing, Tokyo, Singapore, Bangkok, Sao Paulo and yes once again in London.



This year’s London Splunk Live was really special. The event occurred during our launch of Splunk 4 and surpassed our expectations as the largest event we’ve ever held. More than 100 customers and users attended at the Cumberland Hotel and their swank conference facility, complete with a business canteen like breakfast experience, near Marble Arch in West London.

But the dominant reason to attend any Splunk Live is the presentations and round tables with forward-thinking IT professionals who are using Splunk to transform the way they manage IT. This year we were very fortunate to have three Splunk customers who took time out of their busy schedules to come to London and share their experiences with us.

Accenture - Alexander Strobl, Technical Consultant

Alexander has been a visionary inside Accenture, bringing the power of IT Search to enterprise clients in Germany where he works as a Technical Consultant in the Data Center Technology and Operations team. Alexander is responsible for the analysis, design and rollout of Splunk. His most recent Splunk project was with a large worldwide services company with more than 50,000 employees on three continents operating mail order, distribution, e-commerce and over-the-counter retail trade. Accenture implemented Splunk to transform the management of several technologies including Linux, virtualization and large-scale storage systems.

The project was part of an IT project to reduce the time to triage problems and improve quality of service. Challenges were:

  • no centralized access to logs and events,
  • critical IT data was stored on local file systems which were copied to central storage only once a day,
  • manual processes to locate errors,
  • no correlation between events on different services/servers and
  • development time was spent building workarounds rather than working on revenue generating applications.

All of this resulted in complex and time consuming analysis and, in the end, long MTTR.

The Accenture Splunk installation is currently indexing ~50GB/day including custom application files and events from 10+ integrated business critical applications and services. There are two Splunk indexes; one for testing and one for production environments and the team has established interfaces between Splunk and several other legacy data center tools.

Telenor - Henrik Strøm, Security Architect

Telenor is Norway’s largest ISP, Mobile Operator and Telco. It’s one of the largest mobile operators in the world, with 160+ million customers and was founded in 1855 - 154 years ago. The company has 13,000 employees in Norway and 26,000 abroad. Telenor has been rolling Splunk out for centralized log collection and management, using Syslog to forward data where it is already in place and using Splunk as a forwarder for new systems and for systems with complex multi-line and/or XML structures that Syslog can’t handle. Sources of data handled by Splunk include:

  • application logs (Web, Email, IPTV)
  • data center logs (server, network, storage and firewall)
  • IP backbone logs

Use cases include what Henrik refers to as digging, dashboards, baselines, alerting and reporting. One of the best “digging” examples Henrik mentioned was identifying Unix Kernel Errors over the last 30 days. This kind of information routinely went unnoticed prior to Splunk’s arrival.

Another powerful use case explained by Henrik was how to baseline what is normal in your environment. For example, how many errors do you have on average for a particular type of device (routers, servers, specific applications, etc.)? Splunk was used to baseline normal Linux kernel behavior and found roughly 20 kernel errors per running Linux instance every 15 minutes.

The baseline then allows the team to schedule simple searches to look for deviation from the baseline and send out alerts before downtime occurs from these hidden sways in behavior. In one case Splunk found thousands of errors occurring on a specific type of device, where the normal baseline was around 20!
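Here is a minimal Python sketch of the baseline-and-deviation idea, not Telenor's actual searches. It assumes a simple rule of "alert when the count exceeds the historical mean by more than three standard deviations", which is a common rule of thumb I have swapped in for illustration. The history values, the threshold of 3 and the function names are all hypothetical.

```python
from statistics import mean, stdev


def build_baseline(counts):
    """Summarize historical error counts per interval as (mean, std dev)."""
    return mean(counts), stdev(counts)


def is_anomalous(count, baseline, threshold=3.0):
    """Flag a count more than `threshold` std devs above the mean.

    The std dev is floored at 1.0 so a perfectly flat history does not
    turn every tiny fluctuation into an alert (an assumption of this sketch).
    """
    avg, sd = baseline
    return count > avg + threshold * max(sd, 1.0)


# Hypothetical kernel error counts per 15-minute interval for one device type.
history = [18, 22, 19, 21, 20, 23, 17, 20]
baseline = build_baseline(history)

for observed in (24, 3000):
    status = "ALERT" if is_anomalous(observed, baseline) else "ok"
    print(observed, status)
```

With a history hovering around 20, a reading of 24 stays inside the normal band while a reading in the thousands is flagged immediately, which mirrors the case described above.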

The Telenor team also uses Splunk to identify and report on security situations that may impact their customer facing network and services. Because they are able to easily compose dashboards showing, for example, which Web servers are under attack and who is attacking them, all in one place, the team saves Telenor from potential downtime, performance degradation or theft of data due to attacks they’ve not seen before and that are missed by existing security policies and technologies.

Vodafone - Paulo de Carvalho, Network Services Manager

Paulo de Carvalho has been using Splunk at Vodafone for almost two years now. His presentation titled “Freeing Information from Organizational Silos” lifted the idea of leveraging logs and IT data out of the realm of just system administration into a thirst for higher level intelligence that crosses not only IT but also business functions. Paulo started by describing the current service oriented architecture (SOA) at Vodafone and how attempts to objectize and re-use capabilities create incredible complexity among the services, technologies, processes, tools and people.

The Great Firewall of China: Internet Censorship Run Wild

The past couple of days I’ve been visiting China meeting with some of our technology and channel partners. It just so happens I was present in Beijing for the 20th anniversary of the 1989 Tiananmen Square Events. Yes it really did happen despite what the Chinese government says. Speaking on Saturday at the F5 APAC Sales Kickoff I found myself staying over the weekend with Sunday off to roam around Beijing like a tourist, something I rarely get a chance to do on business trips. It is amazing to me to see how the Chinese and Taiwanese work on Saturdays. In the US we rarely see that. Europeans chastise Americans for working too hard but I guess they should really see the work ethic in Asia and then we’d look more normal.

Watching the 2008 Beijing Olympics last summer things there certainly seemed more normal than 20 years ago, but being there in person with all the festivities gone things seemed really strange to me. It is very difficult to describe. Maybe I was jaded by all the newspapers I’d read on the way to Beijing. On a nice long 13 hour flight from Washington DC with plenty of reading material I consumed James Kynge’s piece in the Financial Times questioning whether the Western media really understood why the student demonstrators were protesting. He went on associating the word “democracy” with the students’ motivations and questioning whether we or they really knew what it meant, despite the fact that he spells out their desires in plain old English, which sounds like democracy to me.

“Almost everything fell within its scope: campaigns against corruption, nepotism, inflation, police brutality, bureaucracy, official privilege, media censorship, human rights abuses, cramped student dormitories and the smothering of democratic urges. But to say the demonstrations were to “demand democracy” is an oversimplification.”
James Kynge, Financial Times

It’s almost impossible to describe the strange feeling I got while walking through Tiananmen Square observing the soldiers and the huge portrait of Chairman Mao that dominates the landscape. Maybe part of it was due to the increased tension of the anniversary. Maybe not. Tiananmen has come to symbolize the unspoken and largely unrecognized tension between the economic progress driving modern China and the old-fashioned communist government still ruling there. The Chinese seem to have a foot in both camps. The eeriness I felt came not only from my surroundings and an understanding of the principles they stood for but also from the reaction of my Chinese and Taiwanese friends. Their usually jubilant outgoing personalities were completely subdued in the square. Was it a sign of respect and mourning that drove their thoughts? Perhaps to some extent. But in quiet whispers and conversations out of earshot of any “green” uniformed soldiers (versus the “blue” uniformed security guards) they confessed to being actually scared to speak for fear of someone or something listening. Challenging them I said, “surely you must be joking.” But it was no joke. Only when we crossed the street into the Forbidden City did their usual personalities return.

Of course this began a prolonged conversation over the next 24 hours as we visited the Great Wall, a new Beijing restaurant and departed through the impressive new Beijing airport. I kept asking and trying to understand. How can a country of so many people be controlled by the minds of so few? What are the real limitations on speaking out? And what effect will economic progress have on the political future of China? There was no shortage of stories supporting the fact that the government still does take a very heavy hand to those who disagree. But rather than discuss it, everyday Beijing seems to sweep the event of 20 years ago under the rug. As one of my Chinese friends said, “everyone is embarrassed and we just pretend it never happened.”

At the same time I was traveling throughout China, the articles started pouring in about Beijing’s efforts to step up Internet and IT censorship. Upon reading the perspectives on “Green Dam” I was reminded of the impact the technology industry is having on the whole situation. It was bad enough I couldn’t get to sites like Twitter and YouTube from my hotel room. Now the Chinese government is requiring that every PC sold in the country starting July 1st have special software blocking all sorts of things. The move is being presented as an attempt to protect children from online pornography but is obviously one more attempt by Beijing to take its censorship to a new level. China currently has the world’s most sophisticated and multi-layered system of Internet censorship. Objectionable content on domestic Web sites is deleted or prevented from being published, and access to a large number of overseas Web sites is blocked or “filtered.” Decisions about what to censor are based on the Chinese government’s attempts to control the minds of 1.2B Chinese. There is no transparency or accountability, no public consultation in developing block lists or censorship criteria, and no way to appeal the blockage or removal of Web content.

In a notice to PC makers, the Ministry of Industry and Information Technology (MIIT) said all PCs shipped in China needed to offer Green Dam/Youth Escort, identified as a “green internet filtering software”, either pre-installed or as part of basic software packages. In May 2008, the government picked Jinhui Technology and Dazheng Language Technology, two Chinese software companies, to develop the software, according to a contract award notice from the MIIT. These companies claim their software is only being used to block sites. Yet last year researchers discovered that a Chinese version of Skype contained the ability to block politically sensitive words in instant messaging chats, and to keep a record of the use of such words.

Conficker is Proof We Need to Log Broadly and Analyze Deeply

At RSA this week it’s easy to get lost in the menagerie of security technologies to conquer malware proliferation, stomp out spam and protect virtualized and cloud computing environments. But the most recent statistics show we are still losing the war on cybercrime. Symantec’s latest Internet Security Threat Report cited 1,656,227 malicious-code threats last year and 75,158 new active bot-infected computers per day. And yes, the United States is still the country most frequently targeted by denial-of-service attacks, accounting for 51% worldwide, and the top country for underground economy servers advertising stolen credit cards, accounting for 67% of all activity worldwide.

Why are we losing so badly? Not surprisingly, there was a lot of talk at RSA about the Conficker worm. Some of the chatter points to reasons why the security industry is falling behind. At first glance, the Conficker worm looks harmless. So far there are not too many significant reports of infected machines and hijacked data, but it may be too early to feel so smug about it. The worm’s real danger is its demonstrated ability to evade the expensive IDS technology enterprises have put into place and rely on today. Estimates are that 90% of the enterprise IDS implementations have failed to detect the worm’s presence and create some kind of actionable alert. How can this be?

Conficker’s properties are simple but different from the typical threat. First, Conficker affected systems outside of IDS coverage like USB keys and mobile user laptops. So if you’re looking for attacks from outside your network only, you won’t see it. It’s a “walk-in virus”. Second, it isn’t greedy like Code Red and other viruses of late. The Conficker worm has built-in sleep cycles. So where a typical worm might scan 1,000 or 10,000 IPs a minute, Conficker was happy to scan maybe 100 and evade the baseline trip wires. Third, Conficker is very selective with its payload delivery. It only delivers when it sees a vulnerability. All this helps Conficker evade IDS systems that want to witness the crime. But Conficker is the perfect crime in that it goes undetected. With no payload delivered and seemingly fewer IPs scanned there is no grossly abnormal behavior to witness. The evidence is circumstantial.

At a lunch on Wednesday, Tom Le of BT gave a good overview of how BT Managed Security Services detected Conficker for their customers. It was one of the first times I’ve really been sold on a managed security service beyond the value of cost and convenience.

First, as Tom explained it, they started by assuming IDS would miss the attack. They didn’t assume a payload had to be delivered and didn’t assume that a large number of scans was needed to indicate the presence of an intruder. Instead of depending on IDS, BT uses logs and events to baseline the natural behavior of even NetBIOS triggered scans (which Conficker happened to use) and was able to alert on small changes in scans that would be missed if you were only looking at things like NetFlow. As it turns out, most firewalls blocked the NetBIOS scans going out, so again most customers didn’t even know they had the Conficker worm present.
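To make the baselining idea concrete, here is a minimal Python sketch. It is not BT’s method, just an illustration under my own assumptions: the host names, the counts and the three-sigma cutoff are all made up, and each host is compared against its own history rather than a single network-wide trip wire.

```python
from statistics import mean, stdev

def flag_scan_anomalies(history, current, min_sigma=3.0):
    """Return (host, count, threshold) for hosts scanning above their own baseline.

    history: dict of host -> list of past per-interval scan counts (hypothetical data)
    current: dict of host -> scan count in the latest interval
    """
    flagged = []
    for host, count in current.items():
        past = history.get(host, [])
        if len(past) < 2:
            continue  # not enough history to form a baseline
        # Floor the spread at 1.0 so a very flat baseline does not alert on noise
        spread = max(stdev(past), 1.0)
        threshold = mean(past) + min_sigma * spread
        if count > threshold:
            flagged.append((host, count, round(threshold, 1)))
    return flagged

# Illustrative numbers only: a laptop that normally sends a handful of probes
history = {
    "laptop-17": [2, 3, 1, 2, 3, 2],
    "fileserver": [40, 38, 45, 42, 41, 39],
}
current = {"laptop-17": 30, "fileserver": 44}

print(flag_scan_anomalies(history, current))
```

Thirty scans in an interval would never trip a network-wide limit tuned for worms that scan thousands of addresses a minute, but it stands out clearly against that one laptop’s usual two or three.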

Second, Tom and his team assumed some type of command and control activity associated with Conficker. They followed the money, watching for things like Conficker trying to phone home in different ways. By having a broad set of logs and events from switches, routers, applications and IDS they were able to look for outlying behaviors like DNS lookups to obscure locations not typically seen in customer networks and aggregate this information across customers to identify common abnormalities. Tom estimates that BT sees roughly five billion messages a week across their customer base. That’s a lot of messages.
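The cross-customer aggregation can be sketched in a few lines of Python. Again this is only an illustration under assumptions of mine: the customer names, the domains and the idea of a simple “known domains” allow list are hypothetical, not something Tom described.

```python
from collections import defaultdict

def shared_rare_domains(dns_logs, known_domains, min_customers=2):
    """Find unfamiliar domains that were looked up from several customer networks.

    dns_logs: iterable of (customer, queried_domain) tuples (hypothetical format)
    known_domains: set of domains considered normal for these networks
    """
    customers_by_domain = defaultdict(set)
    for customer, domain in dns_logs:
        domain = domain.lower().rstrip(".")  # normalize case and trailing dot
        if domain not in known_domains:
            customers_by_domain[domain].add(customer)
    return sorted(
        (domain, sorted(customers))
        for domain, customers in customers_by_domain.items()
        if len(customers) >= min_customers
    )

dns_logs = [
    ("acme", "www.example.com"),
    ("acme", "qzxkvb.example.net"),
    ("globex", "QZXKVB.example.net."),
    ("globex", "mail.example.org"),
    ("initech", "lone-oddity.example.net"),
]
known = {"www.example.com", "mail.example.org"}

print(shared_rare_domains(dns_logs, known))
```

One odd lookup at one customer is noise. The same odd lookup showing up at two unrelated customers is the kind of common abnormality worth a closer look.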

After listening to all the chatter about Conficker and walking the show floor, it gets easier to understand how criminals continue to evade the security infrastructure enterprises put in place. There are just too many ways in which breaches can occur and there is just too much data scattered about to collect and correlate in order to find the anomalies. So the security industry continues down the path of specific solutions to specific vulnerabilities and criminals continue to create new threats that evade the industry’s point approaches. I say the industry as a whole needs to move to a more adaptable and flexible approach that can apply security to whatever threats arise, when they appear.

The best real world detectives are able to piece together seemingly circumstantial evidence and sift out the clues that lead to catching criminals. But every time it’s different. Perhaps we need to take the same approach in order to obtain more adaptable security solutions. Assume every time it’s different not the same.

Logging broadly and analyzing deeply is one of the best defenses. Without a broad swath of data you won’t have the pieces of the puzzle to put together at the moment you need to solve the crime.

Few criminals are caught in the act.

Life after SIEM. Situational Awareness is next.

We’ve been hearing a lot lately about the death of SIEM technologies. But isn’t the question less about a legacy technology dying and more about the dimensions on which the next mass adopted security capability will be born? Clayton Christensen first described a model for disruptive technology in his book The Innovator’s Dilemma and his follow-on The Innovator’s Solution. Christensen describes a theory about how disruptive technologies overtake sustaining technologies by delivering value on new dimensions that established vendors overlook as unimportant, low end or just don’t think about because they’re too busy improving their legacy. Christensen’s work offers an interesting framework to think about what’s taking place in the market for SIEM security management solutions.

Any enterprise trying to secure its IT infrastructure knows the state of the art in SIEM security approaches falls short. And trends like virtualization are making things even more difficult. System and security administrators and analysts are inundated with too many potential incidents and it’s too difficult and time consuming to investigate even a fraction of them. Achieving a greater comprehension of the meaning of potential incidents and the projection of their status in the near future is the real goal. The idea, called “situational awareness,” is often, however, impossible to achieve. We are so dependent on pre-programmed rules in our SIEM solutions that we lack the ability to perform our own analysis because the original raw data has been filtered out, thrown away or we have no practical way to make sense of it.

Observation: If the technology is sufficiently complex as to allow the vulnerability to exist, can we really build complex technology to catch all the possible issues or scenarios?

As a reference point see David Hazekamp, Security Architect at Motorola, talk about the importance of retaining all security data across the Motorola global SOC infrastructure and integrating access to all this data into existing SIEM solutions.

Of course reaching this understanding requires one suspends their disbelief about the effectiveness of current SIEM security technologies. Usually this means you’re not a vendor or you’re a vendor with little or no vested interest in current approaches. So with this let’s examine the typical enterprise deployment of security technologies.

Defense in Depth

This is where every good enterprise security architecture starts. In order to begin securing your environment you’ve got to have data, raw data. In most data centers this takes the form of syslog from network devices and servers, SNMP traps, OPSEC or LEA interfaces for firewall events, WMI for Windows desktop and server events, IDS and IPS signature scans and application level firewall examination of common services like FTP, HTTP, SFTP, SCP etc. The thinking is you need to look at everything. Perhaps you’ll even want to pull in information from physical security systems like badge readers.

Security Information Management (SIM)

The next step in the process is to manage all this raw data and filter it down to a manageable number of events, traps and alerts. Collecting, storing and providing some basic analysis on all this data is the job of a SIM. Typically, as Raffy points out, the data is parsed, normalized and stored in a structured RDBMS. Parsing, normalizing and structuring all this data is great if the data doesn’t change or you don’t have too much of it. But if you’re dealing with data formats that aren’t static or you’re trying to store terabytes of this data an RDBMS won’t be your friend.

Security Event Management (SEM)

Once a SIM has done its job you’re ready to aggregate, correlate and start reporting on potential incidents using a SEM. SEMs usually consist of lots of rules that look for combinations and patterns of events indicating that a possible attack or breach may be underway. Essentially the SEM rules attempt to codify what we humans know about vulnerabilities in our IT systems and possible ways to exploit them. The goal is to provide some real-time information, usually in the form of reports, dashboards and visualizations, to operations and security analysts who work to keep the infrastructure secure.
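For readers who haven’t written one, a SEM rule boils down to something like the Python sketch below. It is a made-up rule, not taken from any product: the event tuple format, the three-failure minimum and the 60 second window are all assumptions for illustration.

```python
from collections import defaultdict, deque

def brute_force_then_success(events, min_failures=3, window_seconds=60):
    """Alert when a source logs in successfully right after a burst of failures.

    events: iterable of (timestamp_seconds, source_ip, outcome) sorted by time,
            where outcome is "fail" or "success" (hypothetical normalized format)
    """
    recent_failures = defaultdict(deque)
    alerts = []
    for ts, src, outcome in events:
        failures = recent_failures[src]
        # Forget failures that have aged out of the correlation window
        while failures and ts - failures[0] > window_seconds:
            failures.popleft()
        if outcome == "fail":
            failures.append(ts)
        elif outcome == "success":
            if len(failures) >= min_failures:
                alerts.append((ts, src, len(failures)))
            failures.clear()
    return alerts

events = [
    (0, "10.0.0.5", "fail"),
    (10, "10.0.0.5", "fail"),
    (20, "10.0.0.5", "fail"),
    (25, "10.0.0.5", "success"),   # three failures then success: alert
    (30, "10.0.0.9", "fail"),
    (200, "10.0.0.9", "success"),  # lone failure, long ago: no alert
]

print(brute_force_then_success(events))
```

The rule catches exactly the pattern somebody thought of in advance. An attacker who spaces attempts 61 seconds apart sails right past it, and that is the limitation of canned rules.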

Situational Awareness (SA)

SIEM correlation can be interesting for discovering a pattern or related event, but the ability to work an issue outside of these “canned” rules and events becomes the real problem. Unfortunately, what all too often happens is that there are so many possible attacks that operations and security staff are overwhelmed with potential incidents to investigate, and not every event or pattern of interest is going to be discovered via the pre-built rules. Situational awareness is the attempt to perceive environmental elements within a volume of space and time. Comprehension cannot be achieved if the data being bubbled up is filtered according to a set of rules and the technology does not allow a human to perform their own analysis of the raw data as generated by the environment itself. All technologies have their weaknesses and those that perform correlation are no different.

Thus whilst canned SIEM correlation provides value in bubbling things up, we still need the ability to dig into the raw data to fully perceive and comprehend what is taking place. Mind you, SA is not a new concept. It has been applied rather robustly by decision-makers in complex, dynamic areas such as aviation, air traffic control, power plant operations and military command and control, as well as in more ordinary but nevertheless complex tasks such as driving an automobile or motorcycle. And yes, it has been mentioned before in security operations, particularly in government agencies.

Splunk Live Southwest 2008

This week we’ve been moseying through the Southwestern part of the US with our Splunk Live show. We changed up the format a bit with Splunk technical workshops in the morning and customer round tables in the afternoon. The technical workshops were a big hit with more than 200 people registered to engage with our Splunk Experts. During the workshop you were able to download, install, configure and start using Splunk on your laptop or server with remote access. The best part about Splunk Live events though is sharing ideas with other Splunk fanatics.

Ryan Peterson from Infusionsoft, a marketing automation company, gave a great talk in Scottsdale about his Splunk deployment for the company’s email infrastructure. Ryan is tasked with keeping more than 12M emails a week flowing out of the system to support Infusionsoft’s Automated Follow-up Technology (AFT). Ryan has multiple servers in different geographies in addition to PCI Compliance requirements. He demonstrated using Splunk to troubleshoot problems spread across the messaging infrastructure, address reporting inaccuracies and deliver PCI reports to auditors. He’s even indexing the content of email with Splunk using a scripted LDAP data input. Cool stuff.

In San Diego Tony Doan of the Genomics Institute of the Novartis Research Foundation (GNF) and Eric Van Johnson from Sony Consumer Electronics joined us. Tony is a security engineer and former pen tester. He also confesses to being a recovering Unix sysadmin. GNF has 600 Windows desktops and several hundred Windows and Linux servers supporting the discovery of new biological processes and improved human therapeutics. Tony discussed how they splunk Cisco CSC, Bluecoat, Symantec AV, Arpwatch, Cisco switches and Wifi access points to find what he calls “previously unknowns” to improve operational availability and security. He says they’re finding new uses every day, but Tony’s favorite is splunking Cisco IPS and Cisco MARS events looking for odd behaviors. Next up for GNF is eating Windows Event Logs and Windows Registry inputs together with summary indexing for consolidated reporting.

Eric Van Johnson is the eServices Hosting and Operations Manager at Sony Consumer Electronics. He led a great discussion on splunking IBM WebSphere and MQ Series events, including how Sony has integrated operations and development environments to identify problems with complex apps more quickly and avoid unnecessary escalations to the development team. He shared with us Sony’s roll out of Splunk to their Business Intelligence Group. The idea is to complement aggregated WebMethods data reporting for business activity monitoring. Next up he wants to feed Splunk data back and forth with Verizon’s hosting operations, since some of the Sony servers are hosted at Verizon and Verizon is also using Splunk.

In LA Rich Horace, Director of Systems Engineering and Operations at Fox Interactive Media, demonstrated how Fox uses Splunk in the Fox Audience Network. Basically these are the guys that serve web advertisements across all the Fox properties including MySpace, Rotten Tomatoes, Fox Sports and IGN. He’s challenged with launching new monetization platforms and keeping the existing ones running. Rich gave a fantastic overview of his Splunk installation, which consolidates and aggregates data from disparate systems in order to protect against hackers and meet PCI and SOX requirements. He currently runs an environment with ~600 Linux servers, load balancers, servers, NetApps and network switches. So far he’s indexed 1.5B events. We engaged with everyone in a lively discussion about securing production sites from developers and controlling and auditing access to data using Splunk’s access controls and search filters. Rich also discussed how Fox is using Splunk to integrate with various Citrix products including Netscaler and XenApp.

Thanks to everyone who shared their stories with us this week, it was really awesome.

Ode to Log Management

I love “log management.” I hate log management.

I love log management because years ago it was the impetus for IT to move beyond simple SNMP monitoring to collecting and trying to understand a much richer set of data about complex environments.

I hate log management because over the years it has been co-opted by vendors and analysts who’ve pigeonholed it into yet another IT management silo. These vendors and analysts have narrowly defined log management as the collection and storage of logs in some locked repository used to generate static reports to satisfy regulators, auditors and IT governance boards.

Why am I so bitter?

First it turns out logs are critical to many other stakeholders in the enterprise. Operations needs real time access to logs in order to find and fix problems and improve mean time to recovery (MTTR). Security needs logs to catch bad guys. Business people need logs to understand customer and service behavior and provide service level measurements. So locking up logs in a static repository designed for one constituency severely limits their value and diminishes the return on investment not only in a log management solution but also the return on your IT assets overall.

Secondly, logs alone don’t provide any one of the IT stakeholders with a complete picture.

Let’s take a simple example right from the hottest compliance use case today — PCI. The Payment Card Industry (PCI) Security Standards Council founded by American Express, Discover Financial Services, JCB International, Mastercard and Visa has outlined requirements for security management, policies, procedures, network architecture and software design. If you are a merchant accepting credit or debit cards and you process more than 20,000 transactions per year there are twelve specific requirements. Failure to comply with the requirements is not an option. You can be fined heavily and you can lose your ability to accept credit and debit cards.

One of the twelve requirements is the commitment to monitoring and investigating changes to configuration and password files for any application, server or device involved in the processing of card holder information and transactions. In the case of file content, permissions or attribute changes, logs will only tell me part of the story. Yes a Windows, Linux or Unix log will tell me a file has been changed but it won’t tell me who changed it. It also won’t tell me if the change was authorized or not. To understand who changed a file I need to look at the other user processes running on that server at the same time the file was changed. What user processes were running and who owned them? In Unix or Linux this information is easily viewed with a simple “ps” or “top” command but doesn’t exist in any log. In order to understand if the change was authorized or not I need to compare the log and file change information with the user information and any tickets from the service desk authorizing this user to make this type of modification.
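Here is a rough Python sketch of that three-way comparison. The record types, field names and the 30 second tolerance are hypothetical, invented for illustration. It assumes you have already captured periodic “ps” snapshots and exported service desk tickets somewhere you can read them.

```python
from dataclasses import dataclass

@dataclass
class FileChange:          # from the OS log or a file integrity monitor
    host: str
    path: str
    timestamp: int

@dataclass
class ProcessSnapshot:     # one row captured from `ps` output
    host: str
    user: str
    command: str
    timestamp: int

@dataclass
class Ticket:              # service desk authorization window
    user: str
    host: str
    start: int
    end: int

def review_change(change, snapshots, tickets, tolerance=30):
    """List users active on the host around the change and whether a ticket covers them."""
    candidates = sorted({
        s.user for s in snapshots
        if s.host == change.host and abs(s.timestamp - change.timestamp) <= tolerance
    })
    authorized = [
        u for u in candidates
        if any(t.user == u and t.host == change.host
               and t.start <= change.timestamp <= t.end for t in tickets)
    ]
    unexplained = [u for u in candidates if u not in authorized]
    return {"candidates": candidates, "authorized": authorized, "unexplained": unexplained}

change = FileChange("web01", "/etc/passwd", 1000)
snapshots = [
    ProcessSnapshot("web01", "alice", "vi /etc/passwd", 995),
    ProcessSnapshot("web01", "bob", "bash", 1010),
    ProcessSnapshot("web01", "carol", "top", 5000),   # too long after the change
    ProcessSnapshot("db01", "dave", "vi", 1000),      # different host
]
tickets = [Ticket("alice", "web01", 900, 1100)]

print(review_change(change, snapshots, tickets))
```

A process running near the time of the change is only circumstantial evidence of who made it, so treat the output as a short list for an investigator and not as a verdict.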

The real reason I believe we need to move on from talking about log management is log management isn’t a market. It isn’t a solution. It is a feature in a much broader landscape of harnessing all the data being generated by our IT infrastructures.

Turning all that data into information for every stakeholder is important to the future of IT as environments grow more complex, dynamic, service oriented, virtualized and mission critical. Not just to report on compliance controls, but to improve our speed of root cause analysis, increase our ability to quickly and comprehensively investigate security attacks and develop more intimate relationships with our customers by better understanding their behavior and providing a transparent view of the services they are receiving in return.

Splunk and US Federal Government Agencies

This week we’re at FOSE 2008 demonstrating how we’re collaborating with US Federal Agencies. A number of agencies have already joined the Splunk community including:
  • Executive Office of the President
  • Federal Bureau of Investigation
  • NASA
  • Social Security Administration
  • US Department of Agriculture
  • US Department of Defense
  • US Department of Energy
  • US Department of Homeland Security
  • US Department of Interior
  • US Department of Justice
  • US Department of Labor
  • US Navy
  • US Department of State
  • US Department of Transportation

Many of these customers are applying Splunk to extreme applications with large data volumes from many different disparate sources. As you can imagine the complexity of security and compliance concerns, agency interactions and a sophisticated web of outsourcing to federal system integrators provides fertile ground for IT Search as a new way of solving all kinds of problems.

Typically our collaboration involves operations, security and compliance people from both the agency and system integrator sides. Agencies continue their pursuit of cost cutting and outsourcing while being handed a host of new projects every year. And system integrators continue to search for new ways to bid more competitively by demonstrating new ways to more efficiently develop, deploy and manage technology. This means the business of managing our nation’s IT infrastructure is significantly more complex and dynamic than ever.

As an example, the current state of the world demands a serious risk management approach to Federal Government systems. All agencies have implemented some type of security in-depth strategy with firewalls, vulnerability and IDS scans. While these technologies are effective in their particular function, they generate a tremendous amount of data, making it impossible to get a holistic view. These extreme customer environments generate more data and are more dynamic than traditional system and security management approaches can handle. Traditional database and SIEM approaches just don’t scale.

Our own Bill Hornish, who attempted for decades to implement these traditional approaches at several large agencies, has put together a really nice video explaining the challenges of risk management in Federal environments and how Splunk can help.

We’re learning a lot by working with these extreme customers and believe they can teach us a lot about what the rest of the Splunk community will eventually experience when applying IT Search to larger, more dynamic environments in the commercial sector as well.

The Splunk Platform Has Launched

Without a doubt the past week has been the most amazing week in Splunk history. The crazy coast to coast multi-city launch left us all exhausted and electrified. A few of the things that stick in my mind…

First Splunk 3.2 including Splunk for Windows went live on our download page last Saturday and more than 40% of our downloads in the past week have been for our new Windows version. Then Nick Selby of 451 Group wrote an analyst brief on us. He said, “Splunk is awesome: it’s multiplatform, easy to install and easy to use. And with an abstraction layer of logs, configuration files and system messages, traps and alerts, it’s seriously useful.” 451 has a reputation for ripping vendors, so we’re flattered.

Dana Gardner, analyst with Interarbor wrote a very eloquent analysis of our platform launch on ZD Net. “Splunk has created the means to offer developers easy access to that data and the powerful inferences gleaned from comprehensive IT search. That means the data can go places no log file has gone before,” says Dana. Developers are certainly doing some way cool things with Splunk.

I’ve seen a couple of neat visualization applications including this one called Replay. It shows you a live or time lapsed view of your event streams. Here you can see the replay application hooked up to our internal wiki showing who’s doing what over a 24 hour period. Click on the image for the movie.

replay.png

As for our own applications, the Splunk for PCI app drew tremendous interest at our series of Splunk Live events this past week. It’s just one example of how a business person with domain knowledge can package their own Splunk configuration as an application. If you haven’t seen Raffy’s video on the PCI Application, check it out here.

pci.png

We also showed the Splunk for Change Management application as well. Seeing someone touch a file and watching the Splunk dashboard update instantaneously is an awesome display of how flexible Splunk has become. Check out the developer program for yourself and get your goods up on SplunkBase so we can all check em out.

changemgmt.png

Interop NYC 2007

Last week I was in NYC for Interop 2007. Interop in NY is a significantly smaller conference than the big brother Interop in Vegas. I’d say there were 7,500 to 8,000 people at Interop NYC this year, compared to 18,500 in Vegas back in May. Somehow though I always find the New York show more interesting. Perhaps it’s the lack of constant firefighting in the NOC that gives us all more time to have meaningful conversations about the latest networking technologies. Plus somehow New York just seems to have more substance than Vegas. Call me crazy but…

This was also the first Interop where we had a chance to apply the magic of Splunk genre 3.0. We had a record number of searches in the NOC (despite the smaller show). I’m not surprised. 3.0 is so cool the way it automatically extracts fields out of data streams from all kinds of networking gear.

Now there are lots of people who know more about networking and security than I do, but here’s a simple investigation I did with Splunk.

1. I started with a simple search for “failed password.” This picks up firewall and router hacking attempts (typically ssh) sent to Splunk using syslog forwarding.

2. I was then able to quickly see the top “source IP”. Because the source IP field automatically gets extracted with each search I’m able to quickly click and see the list of top source IPs for the time frame in question. A single click and I’ve added the top offender to my search parameters.

3. Just a click away and I can geolocate this IP. With field actions in Splunk I can now drive workflow items right from the search results. Here I just need to click on the menu next to any IP address and I can geolocate the address with any number of free web based services. It was interesting to watch the hackers and bots travel around the world and with more time would have been fun to write a little Flash application to call the Splunk API and map things in real-time.

4. Reporting on top source_IPs every hour was easy. Like any IT guy without a bunch of time, I went for the low road. I just clicked report on all source_IPs from the field action menu and I got a nice looking flash report. It was really easy to save the report and run it on a schedule every hour. Now anyone on the NOC team alert list can get it right in their email or log into Splunk and check out the dashboard with a few other useful security searches.
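If you want to try the same idea without Splunk, steps 1, 2 and 4 above can be roughly approximated in a few lines of Python. This is not how Splunk does it. It is a sketch that assumes OpenSSH-style syslog lines, and the sample lines and addresses below are made up.

```python
import re
from collections import Counter

# Assumes OpenSSH-style messages such as:
#   "Failed password for invalid user admin from 203.0.113.7 port 4023 ssh2"
FAILED_LOGIN = re.compile(
    r"failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<src>\d{1,3}(?:\.\d{1,3}){3})",
    re.IGNORECASE,
)

def top_failed_logins(lines, n=5):
    """Return (top source IPs, top usernames) for failed password events."""
    by_source, by_user = Counter(), Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            by_source[match["src"]] += 1
            by_user[match["user"]] += 1
    return by_source.most_common(n), by_user.most_common(n)

sample_lines = [
    "Oct 24 10:01:02 fw1 sshd[311]: Failed password for root from 203.0.113.7 port 4022 ssh2",
    "Oct 24 10:01:05 fw1 sshd[311]: Failed password for invalid user admin from 203.0.113.7 port 4023 ssh2",
    "Oct 24 10:02:11 rtr2 sshd[87]: Failed password for root from 198.51.100.23 port 5100 ssh2",
    "Oct 24 10:02:30 fw1 sshd[311]: Accepted password for noc from 192.0.2.10 port 6000 ssh2",
    "Oct 24 10:03:00 fw1 sshd[312]: Failed password for oracle from 203.0.113.7 port 4024 ssh2",
]

sources, users = top_failed_logins(sample_lines)
print(sources)
print(users)
```

Run it from cron every hour against the syslog file and mail the output, and you have a poor man’s version of the scheduled report. What you give up is the click-through, the geolocation menu and not having to write the regular expression yourself.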

You can split the same report series by user and see how a lot of these hacker bots try to use common software package and open source default configuration usernames and passwords.

If you want to check it out yourself, send me mail and I’ll let you know where you can access the server. It’s kinda fun to search on your own machine name and see all the times you were on the network at the show. You can drill down into each DHCP transaction and see all the events.

Chaos & Insanity

Last week Splunk sponsored ComputerWorld’s Infrastructure World conference along with HP and IBM. I needed to come up with a talk and I wanted to do something new.

I’ve been thinking about how to describe the challenges we have managing all this changing technology and innovation. Note this is seriously a work in progress. I’m developing a theory that there are three fundamental drivers to data center chaos.

  • expectations,
  • complexity and
  • accountability

Any new business or consumer technology can be quickly met with significant expectations if it becomes successful. Our dependence on everything from wireless email to online travel reservation systems and hosted software as a service dramatically increases the expectation that these technologies will always be available, fast and do everything we want. Examples of failed expectations are everywhere. On June 20th United Airlines canceled 24 flights and delayed another 286 flights due to a “computer gremlin.” Research in Motion recently experienced yet another 24 hour email outage and more than 2.5M users were without service in North America. Salesforce.com, pioneer of Software as a Service (SaaS), a more reliable alternative to running it yourself, continues to have outages as well.

Rising expectations, success and dependency force increased complexity in both scope and scale to meet demand. Scope complexity abounds as more and more features and capabilities are added to the services we depend on. I used the example of Citigroup’s internal SOA architecture, which has five federated ESBs, one of every technology flavor. Scale complexity occurs as infrastructures grow so large they begin to strain under their own weight. Salesforce.com, for example, is now processing more than 90M transactions a day through its web interface and AppExchange platform. At a meager 10 messages per transaction, that’s 900M, almost a billion, messages a day going through the infrastructure. Wow. Imagine finding a needle in that haystack.

Finally, once popularity rises and the technology becomes established, accountability arrives. Now we have to worry about how safe the technology is and, in many cases, monitor what people are doing with it. Everyone by now knows of the TJX situation, where 45.7M credit and debit card numbers were stolen by hackers who somehow infiltrated its processing systems. The first card numbers were stolen three years ago and there is still no definitive explanation. Everything from cracked WEP keys and software-tampered kiosks to an inside job has been offered as a possible cause. More recently TD Ameritrade and Monster.com have experienced similar breaches of user and account information totaling into the millions. And compliance is everywhere. SOX, PCI, ITIL, HIPAA, FFIEC, FISMA, ISO, CoBIT, COSO and other mandates mean IT staff have reduced access to and visibility into the systems they’re trying to manage and keep running.

expectations + complexity + accountability = chaos

I’m interested in your thoughts on the direction this is taking. I’ll be sure to blog more later as the ideas develop.

Welcome!

I’m Michael Baum. Welcome to my blog.

I hope to find time to write about some of my favorite topics including:

  • Splunk and IT Search.
  • Technology gadgets and software — the stuff we all like to use.
  • Datacenter applications, servers, networks and security — the stuff we all have to keep running.
  • Business, entrepreneurship and venture capital.
  • Wall Street and investing.

Comments are always welcome and you can also reach me via email at thebaum (at) splunk (dot) com.