Raffy: Archive for December, 2007

Common Event Expression

Common Event Expression (CEE) standardizes the way computer events are described, logged, and exchanged. The effort is hosted by MITRE, like so many other computer security standards such as CVE and OVAL. It is subdivided into four sub-efforts, each of which will publish its own set of requirements to guarantee seamless future interoperability of devices and applications:

  • Event Syntax
  • Event Taxonomy
  • Event Transport
  • Event Logging Recommendations

The order in which I listed these efforts is most likely the order in which CEE is going to address and standardize them. There is a real need to standardize all of these items if we want companies (mainly vendors) to focus on building meaningful and interesting analysis capabilities, instead of spending all their time on normalizing log files, building connectors, and trying to interpret the meaning of log messages.

I am posting this in lieu of the official launch of the CEE Web site!

Common Event Format - Add-on

The common event format (CEF) is a standard for the interoperability of event- or log-generating devices and applications. The standard defines a syntax for log records. It consists of a standard prefix and a variable extension that is formatted as key-value pairs. Unfortunately, the standards document is only available if you register on the Web site. I wish ArcSight would post a direct link to the document instead of making you register to download it. If you want more detailed information about CEF, check out an older post that I wrote when I was still working on CEF.

I just wrote a CEF add-on for Splunk. It defines field extractions for CEF-formatted messages. Just install the add-on, set your source type to cef, and you will be able to use the extracted fields from your CEF messages. Note that because the CEF extension consists entirely of key-value pairs, I did not have to write any special extractions for that part. I only had to implement extractions for the prefix. Very slick!
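To make the prefix/extension split concrete, here is a minimal Python sketch of the same idea. It is not the add-on's actual configuration, just an illustration under a few assumptions: the field names in CEF_PREFIX_FIELDS and the function parse_cef are my own, the sample vendor and product are made up, and escaping is only handled in its simplest form.

```python
import re

# Names for the seven pipe-delimited prefix fields (naming is illustrative).
CEF_PREFIX_FIELDS = [
    "cef_version", "device_vendor", "device_product", "device_version",
    "signature_id", "name", "severity",
]


def parse_cef(message):
    """Split a CEF message into its fixed prefix and its key-value extension."""
    start = message.find("CEF:")
    if start < 0:
        raise ValueError("not a CEF message")
    # Split on pipes that are not preceded by a backslash. Seven splits yield
    # the seven prefix fields plus the extension as an optional eighth part.
    parts = re.split(r"(?<!\\)\|", message[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF prefix")
    event = dict(zip(CEF_PREFIX_FIELDS,
                     (p.replace("\\|", "|") for p in parts[:7])))
    extension = parts[7] if len(parts) > 7 else ""
    # A value runs until the next " key=" or the end of the string,
    # so values may contain spaces.
    pair = re.compile(r"(\w+)=((?:\\=|[^=])*?)(?=\s+\w+=|\s*$)")
    for match in pair.finditer(extension):
        event[match.group(1)] = match.group(2).replace("\\=", "=")
    return event


sample = ("Dec 19 08:26:10 host CEF:0|ExampleVendor|ExampleProduct|1.0|100|"
          "Worm stopped|10|src=10.0.0.1 dst=2.1.2.2 msg=Worm successfully stopped")
event = parse_cef(sample)
print(event["device_vendor"], event["src"], event["msg"])
```

The prefix needs explicit handling because its fields are positional, while the extension can be read generically because every field carries its own name.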

IT Search vs. SIEM - Data Collection

I have had a lot of conversations lately about IT search versus SIEM (security information and event management), the more traditional way of doing security event management. People are asking me how Splunk’s technology is different from all the log management tools. With ArcSight (my former employer) going public, LogLogic going through some turmoil in its executive management, and Splunk having just received an amazing round of investment, people are very interested in understanding what the deal is.
The topics of SIEM and IT search are fairly similar. However, there are some very important differences that I want to start pointing out in a series of blog posts.

Let me start with the topic of data collection. In a SIEM system, you use a collector, a connector, or an agent (I don’t really care what you call it, but it’s some piece of code which reads the data and feeds it into the system) to process the data before you can use it in your SIEM for correlation, reporting, or forensic purposes. If you do not have a connector specifically written for your data source, you are out of luck. Just to be clear, I am not talking about having a connector for files, for ODBC, for SNMP traps, or for syslog over UDP/514. I am talking about a connector for each specific data source: Snort syslog, Snort database, CheckPoint OPSEC, CheckPoint syslog (do they have a syslog output?), PIX over syslog, Cisco router over syslog, etc.

What this means is that the SIEM either has to support your data source already, or you need the vendor to build you a connector, or you build it yourself. Most of the SIEM tools have some sort of SDK that you can use for this purpose. However, do you have the manpower and the skills in-house to do so? If not, does the SIEM company have the bandwidth to build your connector in an acceptable time?

What happens if the source data format changes? For example, Snort might slightly change its syslog format. Guess what has to happen. Yes! The connector needs to be updated to support the new format. This could mean a few days of downtime for your data source if you don’t plan accordingly and get an updated connector right away.
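A small Python sketch shows why a source-specific connector is so sensitive to format changes. Everything here is hypothetical: the log lines are invented for illustration and do not reproduce the real output of Snort or any other product, and connector_parse stands in for whatever parsing a real connector does.

```python
import re

# The one format this hypothetical connector was written against.
OLD_PATTERN = re.compile(
    r"^(?P<sensor>\S+): \[(?P<sig_id>\d+)\] (?P<msg>.+) "
    r"(?P<src>\S+) -> (?P<dst>\S+)$"
)


def connector_parse(line):
    """Return normalized fields, or None if the line does not match."""
    match = OLD_PATTERN.match(line)
    return match.groupdict() if match else None


# Invented sample lines: the "new" release writes the signature ID differently.
old_line = "sensor1: [1001] Port scan detected 10.0.0.5 -> 10.0.0.9"
new_line = "sensor1: [1:1001:3] Port scan detected 10.0.0.5 -> 10.0.0.9"

print(connector_parse(old_line))  # fields are extracted
print(connector_parse(new_line))  # None: the event never reaches the SIEM
```

The change in the second line is tiny, yet the connector drops the event entirely until someone ships an updated pattern.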

No connector - No data

Now, what is the deal in the IT search world? Well, you need some sort of connector as well. However, you only need one to transport the data from the data source into the search system. In other words, you need about a handful of connectors: one for ODBC, one for receiving syslog on UDP/514, one for text files, and one for databases other than ODBC. (Okay, okay, I will add one for CheckPoint’s OPSEC.) That’s it. You don’t need a specific connector for each data source. You also don’t have to update a connector every time the data source decides to slightly change its logging format. [And if you think that never happens, have a look at SiteProtector.]

What does this mean? It means that from the day you install your IT search technology, you are able to work with your logs. You don’t have to wait until the right connector is available.
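As a rough illustration of the difference, here is a toy Python sketch of transport-only collection. It is not how Splunk or any other product is implemented. The class RawEventStore and its methods are made up for this example, and the log lines are invented. The point is only that nothing is parsed at collection time, so a format change does not stop the data from arriving.

```python
import time


class RawEventStore:
    """Toy index: keeps every line verbatim with its receive time and source."""

    def __init__(self):
        self.events = []

    def ingest(self, source, line):
        # No source-specific parsing here, just transport and storage.
        self.events.append({
            "time": time.time(),
            "source": source,
            "raw": line.rstrip("\n"),
        })

    def search(self, *terms):
        # Case-insensitive match: every term must appear in the raw text.
        lowered = [term.lower() for term in terms]
        return [e for e in self.events
                if all(term in e["raw"].lower() for term in lowered)]


store = RawEventStore()
# Two invented lines from the same source, before and after a format change.
store.ingest("syslog", "sensor1: [1001] Port scan detected 10.0.0.5 -> 10.0.0.9")
store.ingest("syslog", "sensor1: [1:1001:3] Port scan detected 10.0.0.5 -> 10.0.0.9")

hits = store.search("port scan", "10.0.0.5")
print(len(hits))
```

Both lines are found, regardless of which format they arrived in. Any interpretation of the fields is deferred to search time.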

So much for now. In my next post I will talk about structured data.