Raffy: Archive for the 'Homepage' Tab

Maturity Scale for IT Data Management

The following blog post has turned into more than just a post. It’s more of a paper. In any case, in the post I am trying to capture a number of concepts that are defining the IT data management market.
When I am talking about IT data management, I am talking about the over-arching market that covers anything from log management to security information management and security event management.
Any company or IT department/operations can be placed along the maturity scale (see Figure 1). The further on the right, the more mature the operations with regard to IT data management. A company generally moves along the scale. A movement to the right does not just involve the purchase of new solutions or tools, but also needs to come with a new set of processes. Products are often needed, but they are not always a must.
The further one moves to the right, the fewer companies or IT operations can be found operating at that scale. Also note that the products that companies use are called log management tools for the ones located on the left side of the scale. In the middle, it is the security information and event management (SIEM) products that are being used, and on the right side, companies have to look at either in-house tools, scripts, or in some cases commercial tools in markets other than the security market. Some SIEM tools are offering basic advanced analytics capabilities, but they are very rudimentary. The reason why there are no security specific tools and products on the right side becomes clear when we understand a bit better what the scale encodes.

Figure 1: IT Data Management Maturity Scale.

The Maturity Scale

Let us have a quick look at each of the stages on the scale. (Skip over this if you are interested in the conclusions and not the details of the scale.)

  • Do nothing: I didn’t even explicitly place this stage on the scale. However, there are a great many companies out there that do exactly this. They don’t collect data at all.
  • Collecting logs: At this stage of the scale, companies are collecting some data from a few data sources for retention purposes. Sometimes compliance is the driver for this. You will mostly find things like authentication logs or maybe message logs (such as email transaction logs or proxy logs). The number of different data sources is generally very small. In addition, you mostly find log files here. You will not yet find more specific IT data, such as multi-line application logs or configurations.
  • Forensics / Troubleshooting: While companies in the previous stage simply collect logs for retention purposes, companies in this stage actually make use of the data. In the security arena they are conducting forensic investigations after something suspicious was noticed or a breach was reported. In IT operations, the use-case is troubleshooting. Take email logs, for example. A user wants to know why he did not receive a specific email. Was it eaten by the SPAM filter or is something else wrong?
  • Save searches: I don’t have a better name for this. In the simplest case, someone saves the search expression used with a grep command. In other cases, where a log management solution is used, users are saving their searches. At this stage, analysts can re-use their searches at a later point in time to find the same type of problems again, without having to reconstruct the searches every single time.
  • Share searches: If a search is good for one analyst, it might be good for another one as well. Analysts at some point start sharing their ways of identifying a certain threat or analyzing a specific IT problem. This greatly improves productivity.
  • Reporting: Analysts need reports. They need reports to communicate findings to management. Sometimes they need reports to communicate among each other or to communicate with other teams. Generally, the reporting capabilities of log management solutions are fairly limited. They are extended in the SEM products.
  • Alerting: This capability lives in somewhat of a gray-zone. Some log management solutions provide basic alerting, but generally, you will find this capability in a SEM. Alerting is used to automate some of the manual trouble-shooting that is done among companies on the left side of the scale. Instead of waiting for a user to complain that there is something wrong with his machine and then looking through the log files, analysts are setting up alerts that will notify them as soon as there are known signs of failures showing up. Things like monitoring free disk space are use-cases that are automated at this point. This can save a lot of manual labor and help drive IT towards a more automated and pro-active discipline.
  • Collecting more logs and IT data: More data means more insight, more visibility, broader coverage, and more uses. For some use-cases we now need new data sources. In some cases it’s the more exotic logs, such as multi-line application logs, instant messenger logs, or physical access logs. In addition more IT data is needed: configuration files, host status information, such as open ports or running processes, ticketing information, etc. These new data sources enable a new and broader set of use-cases, such as change validation.
  • Correlation: The manual analysis of all of these new data sources can get very expensive and too resource intensive. This is where SEM solutions can help automate a lot of the analysis. Typical uses are correlating trouble tickets with file changes, or correlating IDS data with operating system logs. (Note that I didn’t say IDS and firewall logs!) There is much much more to correlation, but that’s for another blog post.
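
To make the correlation bullet a bit more concrete, here is a minimal Python sketch of the "trouble tickets with file changes" use-case. It is only an illustration under my own assumptions: the record fields (host, path, time), the one-hour window, and the function name are hypothetical and not taken from any SEM product.

```python
from datetime import datetime, timedelta


def unticketed_changes(file_changes, tickets, window=timedelta(hours=1)):
    """Return file changes that have no ticket for the same host within `window`."""
    unmatched = []
    for change in file_changes:
        covered = any(
            ticket["host"] == change["host"]
            and abs(ticket["time"] - change["time"]) <= window
            for ticket in tickets
        )
        if not covered:
            unmatched.append(change)
    return unmatched


# Hypothetical sample records; the field names are illustrative only.
tickets = [{"host": "web01", "time": datetime(2008, 12, 1, 10, 0)}]
file_changes = [
    {"host": "web01", "path": "/etc/httpd/httpd.conf",
     "time": datetime(2008, 12, 1, 10, 20)},
    {"host": "db01", "path": "/etc/passwd",
     "time": datetime(2008, 12, 1, 3, 15)},
]

for change in unticketed_changes(file_changes, tickets):
    print(change["host"], change["path"])
```

With this sample data, only the change on db01 is reported, because no ticket exists for that host. A real deployment would need a much richer notion of what "covered by a ticket" means.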

Note the big gap between the last step and this one. It takes a lot for an organization to cross this chasm. Also note that the individual mile-stones on the right side are drawn fairly close to each other. In reality, think of this as a log scale. These mile-stones can be very very far apart. The distance here is not telling anymore.

  • Visual analysis: It is not very efficient to read through thousands of log messages and figure out trends or patterns, or even understand what the log entries are communicating. Visual analysis takes the textual information and packages it in an image that conveys the contents of the logs. For more information on the topic of security visualization see Applied Security Visualization.
  • Pattern detection: One could view this as advanced correlation. One wants to know about patterns. Is it normal that when the DNS server is doing a zone transfer that you will also find a number of IDS alerts along with some firewall log entries? If a user browses the Web, what is the pattern of log files that are normally seen? Pattern detection is the first step towards understanding an IT environment. The next step is to then figure out when something is an outlier and not part of a normal pattern. Note that this is not as simple as it sounds. There are various levels of maturity needed before this can happen. Just because something is different does not mean that it’s a “bad” anomaly or an outlier. Pattern detection engines need a lot of care and training.
  • Interactive visualization: Earlier we talked about simple, static visualization to better understand our IT data. The next step in the application of visualization is interactive visualization. This type of visualization follows the principle of: “overview first, zoom and filter, then details on demand.” This type of visualization along with dynamic queries (the next step) is incredibly important for advanced analysis of IT data.
  • Dynamic queries: The next step beyond interactive, single-view visualizations is multiple views of the same data. All of the views are linked together. If you select a property in one graph, the selection propagates to the others. This is also called dynamic queries. This is the gist of fast and efficient analysis of your IT data.
  • Anomaly detection: Various products are trying to implement anomaly detection algorithms in order to find outliers, or anomalous behavior in the IT environment. There are many approaches that people are trying to apply. So far, however, none of them has had broad success. Anomaly detection as it is known today is best understood for closed use-cases. For example, NBADs are using anomaly detection algorithms to flag interesting findings in network flows. As of today, nobody has successfully applied anomaly detection across heterogeneous data sources.
  • Sharing views, patterns, and outliers: The last step on my maturity scale is the sharing of advanced analytic findings. If I know that certain versions of the Bind DNS server tend to trigger a specific set of Snort IDS alerts, it is something that others should know as well. Why not share it? Unfortunately, there are no products that allow us to share this knowledge.
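
As a toy illustration of the step from pattern detection to outlier detection, here is a small Python sketch that builds a baseline from past observations and flags values far away from it. The z-score approach, the threshold of 3, and the sample counts are my own assumptions for illustration; this is not how any particular product works, and as noted above, a flagged value is merely different, not necessarily bad.

```python
from statistics import mean, stdev


def is_outlier(history, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical counts of IDS alerts seen during past DNS zone transfers.
baseline = [4, 5, 6, 5, 4, 6, 5]

print(is_outlier(baseline, 5))   # a count that matches the usual pattern
print(is_outlier(baseline, 40))  # a count far outside the usual pattern
```

Even this closed, single-source case needs a trustworthy history to compare against, which is part of why doing the same across heterogeneous data sources is so hard.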

While reading the maturity scale, note the gaps between the different stages. They signify how quickly after the previous step a new step sets in. If you were to look at the scale from a time-perspective, you would start an IT data management project on the left side and slowly move towards the right. Again, the gaps are fairly indicative of the relative time such a project would consume.

Related Quantities

The scale could be overlaid with lines showing some interesting, related properties. I decided to not do so in favor of legibility. Instead, have a look at Figure 2. It encodes a few properties: number of products on the market, number of customers / users, and number of data sources needed at that state of maturity.

Figure 2: The number of products, companies, and data sources that are used / available along the maturity scale.

Why are so few products on the right side of the scale? The most obvious reason is one of market size. There are not many companies on the right side. Hence there are not many products. It is sort of a chicken-and-egg problem. If there were more products, there might be more companies using them - maybe. However, there are more reasons. One of them is that in order to get to the right side, a company has to traverse the entire scale on the left. This means that the potential market for advanced analytics is the number of companies that linger just before the advanced analytics market itself. That market is a very small one. The next question would be why there are not more companies close to the advanced analytics stage. There are multiple reasons. Some of them are:

  • Not many environments manage to collect enough data to implement advanced analytics across heterogeneous data. Too many environments are stuck with just a few data sources. There are organizational, architectural, political, and technical reasons why this is so.
  • A lack of qualified people (engineers, architects, etc) is another reason. Not many companies have the staff that understands how to deal with all the data collected. Not many people understand how to interpret the vast amount of different data sources.

The effects of these phenomena play yet again into the availability of products for the advanced analytics side of the scale. Because there are not many environments that actually collect a diverse set of IT data, companies (or academia) cannot conduct research on the subject. And if they do, they mostly get it wrong or capture just a very narrow use-case.

What Else Does the Maturity Scale Tell Us?

Let us have a look at some of the other things that we can learn from/should know about the maturity scale:

  • What does it mean for a company to be on the far right of the scale?
    • In-depth understanding of the data
    • Understanding of how to apply advanced analytics (visualization theory, anomaly detection, etc.)

Security Predictions for 2009

It is the time of the year where everyone publishes their predictions for the upcoming year. In past years, I have refrained from publishing my own predictions. This year I am going to change that and I will take a stab. I don’t have any earth shattering things to say and I am covering quite a broad set of topics. Anyways, maybe you find one or two interesting things:

  • Security and IT spending: Security projects have never been the ones that were easy to fund (except right after a big worm outbreak, which we haven’t had in years). With the current economic situation, the security budgets for 2009 are not going to be any easier to justify. Therefore, we will see a convergence of projects. Security is going to piggy-back on other IT projects, for example, change management. CM is an integral part of a lot of security requirements, such as PCI. Visibility into the IT infrastructure is another project that will help fund security. SIM, SEM, SIEM, or ESLIM (no kidding, this exists! It wasn’t me. Blame the 451 group!) will need to extend their messages and capabilities to show how they can help provide visibility into the complete IT environment. IT search is going to be especially well situated for that.
  • Security ROI: Calculating an ROI for security is hard. It’s an often discussed topic among security experts. 2009 is not going to give us yet another formula to compute the ROI. However, as mentioned earlier, security will be used as an opportunity to optimize IT. Questions like: “How can you do more with less?” will be used to compute an ROI. A lot of companies have consolidation on their agendas for the new year. 75% of the solutions and tools will be eliminated. The tasks of those tools will have to be covered with the remaining 25%. A great opportunity for security monitoring tools to broaden their footprint.
  • Metrics: 2008 was supposed to be the year of risk management. I didn’t feel much of that. Or have you seen a push in risk management products? 2009 is going to be the year of metrics. People have to measure things. Not necessarily pure security metrics, but IT metrics, such as productivity, resources, MTTI, etc. Products will have to show actual, measurable benefits. It’s all about cost and how to reduce it. Without metrics you cannot assess how much a tool helps you save.
  • More visibility: It is amazing, but a lot of companies don’t even know what assets/machines they operate. How can you do anything without that information? And that is just the tip of the iceberg. IT needs more visibility. What is running where? How well are things running? How efficient? This plays into Green IT also, where you need to know how well servers are utilized, how much power they consume, and what the temperature is across the data center. Visibility also includes things like identity management. We need to know who executed a task or committed a transaction. It’s not of much use if we know that a certain machine attacked us. We want to know who is behind the activity. The question in 2009 is going to be how to integrate your asset management and IdM into your monitoring infrastructure.
  • Consolidation: We have seen acquisitions happen all through 2008. There will be much more. Just along the lines of the security initiatives being coupled closer and closer with IT initiatives, products/companies will be merging.
  • Visualization market/tools: What will be going on in the visualization market? Not too much. People are not ready. A lot of companies are still struggling with centralizing IT data. They are starting to use the data to troubleshoot problems. Beyond that, advanced analytics, such as visualization, are not commonly used yet. On the brighter side, new tools will enter the market. DAVIX will come out with a new release, hopefully early 2009. This will help make visualization available to the broader masses. The new release is going to have Splunk integrated, which should help manage all the IT data! In addition, a slew of new visualization tools will be available in the distro. Hopefully, this will help broaden the security visualization community.
  • Interoperability: This is a topic that I am fairly passionate about. I have been doing quite a lot of work on the topic of how to get machines to talk to each other through events, logs, and generic IT data. Recently a new syslog RFC was published. I was much too late to actually comment on it. It has good intentions, but it is definitely not what I would like it to be. CEE is still alive, despite the lack of new publications. 2009 will bring us at least one release of one of the sub-standards. If I had to take a guess it would be the syntax and accompanying dictionary. Well, maybe just the dictionary. And we will definitely start collecting log recommendations. That will happen very soon now!
  • No data sets: Over many years, we have been facing a huge problem in the research arena. Nobody has solved it yet. It’s the problem of data sets. Researchers need data sets to verify their algorithms and approaches. Guess what, 2009 will not solve this. Unless someone comes up with a really great way of anonymizing data, data sets will not be shared. People are not sharing their logs without being absolutely sure that there is no confidential data leaking. I have a feeling that we will be able to solve this only with cryptography. Something along the lines of secure voting schemes, where the analysis would happen on encrypted data. But how do you do that? I have no idea. Until then, people will keep doing verification and analysis on synthetic, old, and irrelevant data sets.

SIM is Dead - Unless

I feel like I should post a follow-up to my recent post about SIM being dead. Here are some points I would like to clarify:

  • If I talk about SIM or SIEM, I am talking about the way current SIM solutions are working and the way they are implemented. That means things like relational database, fixed schema, parsed and normalized data, or hierarchical scaling.
  • Do I really believe that SIM is not useful? No. And I am not just saying that because I own stock in a SIM company. Just like Alex says in a comment on my original blog entry: IDS is not dead. SIM is probably not dead either. I know of quite a few people that are very happy with their SIM implementation. However, there are many limitations with the way today’s SIMs are architected.
  • The architectural limits cripple the SIMs. They cannot deal with really large event volumes. With the current threat landscape this means that many use-cases cannot be implemented with a SIM. They simply can’t scale to that extent. Leverage IT search to do the heavy data lifting.
  • Network World published a review of recent SIEM technology. They note correctly that application data is becoming more and more important. SIMs have traditionally been built for firewalls, intrusion detection systems, and vulnerability scans and that’s what they are really good at. To be precise, that’s where some SIMs are really good. But as soon as you are dealing with other data sources, such as call detail records (CDRs) or other crazy application logs, you start overloading the existing schema, applying one hack after the other, and eventually crippling the entire system.
  • Some SIMs have done a great job of implementing features that are well-suited for security operations centers (SOCs). In these environments, analysts are working on a console 7×24. They need features like workflow, collaboration, ticketing, live channels, etc. In such an environment, a collaborative approach between a SIM and an IT search solution can be quite effective. IT search is dedicated to data management, data routing and collection, and forensic investigations, as well as reporting. The SIM can be dedicated to real-time correlation, collaboration, and providing a front-end for the analysts.

This should clarify some of my points.

Malicious Insider Holds SF Computer Systems Hostage

What do you do if your system administrator locks you out of your critical systems, changes the root password and then quits? If you haven’t thought about this, you are not the only one. San Francisco officials are facing exactly that question. A disgruntled employee locked out all the system administrators from some fairly critical systems, as you can read in the San Francisco Chronicle.

Insider crime is an area in computer security that still doesn’t get much attention. One of the problems is that the frequency of incidents is fairly low and therefore the problem rates low on a company’s charter. However, the big problem is that the average cost of such an incident is really high. In reality, companies are still struggling with protecting their perimeter. They are worried about outside attackers, script kiddies, about their competition breaking in, attacks of Chinese hackers, Russian crime rings, etc. They should balance their efforts to protect from these threats as well as from malicious insiders.

In this specific case, there were some very obvious signs that should have been noticed. The employee should have been on a watch-list and his activity should have been under review. He was about to be fired. This should have put him into a group of people that are monitored closely. Monitoring is not easy. It is all about people and processes and a little bit about technology. There is unfortunately no software or security tool out there that could detect an insider. And there will never be one.

As I point out in my book, you need to define a process that classifies employees. People on a watch list need to be monitored more closely.  Audit records need to be recorded, especially for privileged activities (such as the ones executed by system administrators). Those records then need to be stored in a place where nobody can tamper with them (for example in Splunk). The records then need to be reviewed on a regular basis. Hopefully by a separate team. Ideally the reviews are automated to ease the work load (for example through alerts in Splunk).

A second step has to be the implementation of proper security processes. Separation of duties, for example. The system administrator by himself should not be able to alter all the passwords necessary to access a system. In reality, this is really hard to enforce. However, if the preventative control cannot be enforced, a detective control should be put in place. Firstly, system logs should be centrally collected and analyzed, and secondly, the file systems should be monitored for changes. That way, all changes can be reviewed to see what the exact impact of Terry’s actions was.
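
To illustrate what monitoring file systems for changes can look like in its simplest form, here is a minimal Python sketch that takes two snapshots of a directory tree and reports what was added, removed, or modified. The function names and the choice of SHA-256 are my own assumptions for illustration; a real file integrity monitor also has to protect its own baseline from tampering and track ownership and permissions, which this sketch does not do.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under `root` (as a relative path) to the SHA-256
    digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = str(path.relative_to(root))
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(old, new):
    """Return (added, removed, modified) relative paths between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

The idea is to call snapshot() on a schedule, keep the results on a separate system, and feed the output of diff_snapshots() into the regular review process.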

Traditional computer security attacks are violating policy. Specialized sensors can be developed and deployed to monitor for signs of attacks. Insider crime is often executed without violating any policy. For example, a system administrator has the right to change passwords. However, as in San Francisco’s case, Terry abused that privilege to lock everybody out of the machines. The net is that one has to monitor not just violations or obvious attacks, but also regular and seemingly benign activity. This results in a huge amount of data from a lot of different sources. Make sure you have a solution that can deal with all of it.

An interesting side fact: The department of technology is worried about a third-party accessing the systems with Terry’s account. This is definitely the time where Splunk needs to be in place to monitor all the records to check for any account access. This information can then be used by law enforcement to take action.

This article, “San Francisco Hack: Where Was the Oversight?”, contains some of my comments about the case.

Security Information Management (SIM) is dead

Pretty much exactly 5 years ago, in June 2003, Gartner declared Intrusion Detection Systems to be dead. Before Gartner can do so, I will state that SIM is dead.

The crime landscape has shifted. We used to be worried about network layer attacks, TCP/IP attacks where funky flags were crashing your systems. This is gone. We really don’t worry about them anymore. We have systems to stop these attacks. The crime has shifted up to the application layer. There are attacks over instant messaging, there are SQL injections, there are application layer attacks. You have to start monitoring the application layer. Compliance requirements are shifting too. For example, the PCI DSS 1.1 requires the usage of application layer firewalls by June 2008. Applications need to be verified for vulnerabilities and not just the platform.

Some of the problems I see with Security Information Management are (the first four are adapted from the Gartner IDS press release):

  • False positives in correlation rules
  • Burden on the IS organization by requiring full-time monitoring
  • A taxing incident-response process
  • An inability to monitor events at rates greater than 10,000 events per second
  • High cost of maintaining and building new adapters
  • Complexity of modeling environment

However, the biggest problem lies in the fixed event schema. SIMs were built for network-based attacks. They are good at dealing with firewall, IDS, and maybe vulnerability data. Their database schema is built for that. So are the correlation rules. Moving outside of that realm into application layer data and other types of logs can get hard. Fields don’t match up anymore and the pre-built correlation rules don’t fit either.
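
Here is a small Python sketch of the fixed-schema problem. The schema columns and the two sample events are hypothetical and much simpler than what a real SIM uses, but they show the effect: a firewall event fits, while an application event loses almost everything that makes it interesting.

```python
# Hypothetical fixed schema, modeled on network-centric events.
FIXED_SCHEMA = ("timestamp", "src_ip", "dst_ip", "dst_port", "action")


def normalize(event):
    """Force an event into the fixed schema.

    Returns the normalized row and the list of fields that did not fit.
    """
    row = {field: event.get(field) for field in FIXED_SCHEMA}
    dropped = sorted(set(event) - set(FIXED_SCHEMA))
    return row, dropped


firewall_event = {
    "timestamp": "2008-06-10T12:00:00",
    "src_ip": "10.0.0.5",
    "dst_ip": "192.0.2.10",
    "dst_port": 443,
    "action": "deny",
}
app_event = {
    "timestamp": "2008-06-10T12:00:03",
    "user": "jdoe",
    "transaction_id": "T-1001",
    "amount": "250.00",
    "status": "declined",
}

for event in (firewall_event, app_event):
    row, dropped = normalize(event)
    print(row, "dropped:", dropped)
```

The application event ends up as a row of mostly empty columns, and the user, transaction, amount, and status fields have nowhere to go unless existing columns are overloaded.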

We need a new approach. We need an approach that can deal with all kinds of data. An approach that deals with multi-line messages, with any type of fields, even with entire files as entities. There is a need for a system that can collect data at rates of 100,000 events a second and still perform data analysis. It needs to support large quantities of analytical rules, not just a limited set. The system needs to be easy to use and absorb knowledge from the users.

The solution is called IT search.

Applied Security Visualization - First Proofs

Yesterday marked yet another milestone in my life as an author. I got the first 5 chapters of my book back from production. The Applied Security Visualization book is slowly coming together. After working on the book for one and a half years, it is great to finally see how the book is going to look. The graphs are placed on the pages and the layout is done. It finally feels like a real book. The book will be out by BlackHat at the beginning of August.

You can pre-order the book on Amazon. It is about 400 pages and contains the following chapters:

  1. Visualization
  2. Data Sources
  3. Visually Representing Data
  4. From Data to Graphs
  5. Visual Security Analysis
  6. Perimeter Threat
  7. Compliance
  8. Insider Threat
  9. Data Visualization Tools

The book ships with a live visualization CD. DAVIX, the data analysis and visualization UNIX, contains all the visualization tools discussed in chapter 9. They are all readily installed so you can use them to visualize your own data. No need to go through any crazy installation processes. The Web site for DAVIX is going to be ready by BlackHat, where we will officially launch DAVIX. If you are interested in a pre-version, drop me an email.

IT Search - A New Approach to Payment Card Industry (PCI) Compliance

The payment card industry data security standard, PCI DSS for short, was developed by the credit card industry to address data theft. The standard consists of twelve security requirements. Anything from traffic policies to requirements around antivirus software is covered by the standard.

If you are a company that does more than 20,000 transactions per year, you will have to implement the twelve requirements. If you are doing fewer, you will get away with a quarterly vulnerability scan.

IT search, Splunk, can directly address some of the areas and indirectly address most of the others. Specifically the areas where IT search assists are the following:

  • Log management (PCI requirement 10)
  • Secure & Central Log Collection (PCI requirement 10.5)
  • Audit Trail Retention (PCI requirement 10.7)
  • Daily Log Review (PCI requirement 10.6)
  • Secure Remote Access (PCI requirement 7.1)
  • File Integrity Monitoring (PCI requirements 10.2.2, 11.5 and 10.5.5)
  • PCI Control Reporting*

The Splunk for PCI application can be downloaded from SplunkBase. It provides a set of 91 searches and 57 reports, a dashboard, and a set of alerts that can be used to monitor the control objectives. The application makes use of Splunk’s IT search capabilities to address PCI. IT search has some very unique capabilities and is uniquely positioned to address PCI compliance:

  • satisfy ad-hoc requests from auditors
  • do large-scale reporting and investigations
  • automate control objective monitoring
  • add new control objectives and policies that require flexible monitoring and correlation capabilities
  • support ever changing data sources
  • re-use already collected data
  • incorporate file monitoring (not just traditional one-line log messages)

The Splunk for PCI application also gives you a capability to implement compensating controls for some of the PCI requirements. Also make sure to check out the daily log review process that helps you very easily tackle requirement 10.6.

Splunk is serious about PCI compliance: We are now part of the PCI Council. This is going to ensure that we know about upcoming changes to the PCI standard ahead of time and we can help influence future direction of it.

Splunk Fights Phishing

This morning, there was yet another case of phishing that was reported by the New York Times. This phishing incident, Larger Prey Are Targets of Phishing, is interesting because of the victim demographics: executives of large companies. As I just learned, this is also referred to as whaling. We have all seen phishing emails that tried to lure us into logging into our PayPal account. But an email from the United States District Court in San Diego that has a very authentic look is a different story. Would you fall for it?

The best way to address phishing is to educate users to make sure they don’t give out personal information. Have a look at the AntiPhishing Working Group’s phishing checklist that contains a lot of specific tips to prevent successful phishing attacks.

Splunk can address a couple of use-cases surrounding phishing attacks:

  • Detecting, after the fact, whether someone in your company fell victim to the scam (phishing).
  • Protecting your company from being phished. (In today’s story, the United States District Court in San Diego)

Detecting Phishing Victims

Once you know about a phishing attack, you can use Splunk to figure out whether anyone in your company has fallen victim. There are a few ways to do so, depending on the attack vector:

  1. The phish infects the victim and installs a trojan that starts leaking information.
  2. The phish uses a Web site to collect victims’ personal information (such as credit cards)

Both of these infections will start communicating with the outside. In the case of the phish reported today, the computers started communicating with machines in Singapore. By analyzing the traffic patterns and figuring out where in the world connections are being made to, this infection can be detected very easily. The Splunk reporting is a great way to quickly generate traffic reports and isolate traffic patterns based on geographic locations of the communicating machines. If, for example, your normal access pattern looks like the first graph and then after some time, you get the result of the second picture, where China suddenly shows up at second position, there might be something wrong.

Normal traffic patterns hitting Web site:

normal_web.png

Suspicious traffic pattern hitting Web site. Note China on second position:

picture-6.png
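
The comparison between the two graphs can also be expressed in a few lines of code. This is a minimal Python sketch, assuming that every connection record already carries a country code (the geo-IP lookup that produces it is not shown, and the field name is hypothetical):

```python
from collections import Counter


def rank_countries(connections):
    """Return country codes ordered by connection count, busiest first."""
    counts = Counter(conn["country"] for conn in connections)
    return [country for country, _ in counts.most_common()]


def new_in_top(baseline_rank, current_rank, n=3):
    """Countries in the current top `n` that were not in the baseline top `n`."""
    return [c for c in current_rank[:n] if c not in baseline_rank[:n]]


def make_connections(counts):
    """Build hypothetical connection records from a {country: count} mapping."""
    return [{"country": c} for c, k in counts.items() for _ in range(k)]


baseline = make_connections({"US": 5, "DE": 3, "GB": 2, "CN": 1})
current = make_connections({"US": 5, "CN": 4, "DE": 3, "GB": 2})

print(new_in_top(rank_countries(baseline), rank_countries(current)))
```

With this sample data, CN is reported because it moved into the top three. A country showing up in the list is a reason to look closer, not proof of an infection.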

Protecting Your Company From Being Phished

If you are operating a Web site, you should try to make sure that there is nobody trying to phish it. There are a couple of ways that IT Search can help you with this:

  • Monitor your Web server logs for incomplete sessions. A lot of phishing pages request images from your site, but never the page that normally embeds them (the HTML page).
  • Monitor Web server logs for sessions that directly send a login, without ever requesting the login page itself. This happens when the victim has logged into the phishing site and the credentials are passed on to the real site, making everything look normal for the victim.
  • Check DNS lookups and see whether you get a lot of lookups from one single machine. This is tricky and you need to know the baseline of lookups, but spikes might turn out interesting to investigate.
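
The first check in the list above can be sketched in Python as follows. This is a rough illustration, not a finished detector: it assumes the Web server log has already been parsed into (client IP, request path) pairs, and the function name, the list of image suffixes, and the sample requests are hypothetical.

```python
from collections import defaultdict

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")

def image_only_clients(requests):
    """Return the sorted client IPs that fetched images but no other resource."""
    paths_by_client = defaultdict(list)
    for client_ip, path in requests:
        # Drop any query string before looking at the file extension.
        paths_by_client[client_ip].append(path.split("?", 1)[0].lower())
    return sorted(
        ip for ip, paths in paths_by_client.items()
        if all(p.endswith(IMAGE_SUFFIXES) for p in paths)
    )

# Hypothetical parsed log entries.
requests = [
    ("10.0.0.1", "/index.html"),
    ("10.0.0.1", "/img/logo.gif"),
    ("10.0.0.2", "/img/logo.gif"),
    ("10.0.0.2", "/img/login_button.png?v=2"),
]

hotlinkers = image_only_clients(requests)
print(hotlinkers)
```

Browser caches and legitimate hotlinking will produce some hits as well, so the result is a list of candidates to investigate rather than a verdict.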

Here is a search in Splunk that you can use to determine whether someone posted credentials without ever requesting the login page:

sourcetype=access_combined (login_form.php OR sales.php) | stats count by clientip | search count=1

This assumes you have a page, sales.php, which you can only access once you have logged in via login_form.php. For more complicated Web site architectures, you will have to build a more sophisticated search that uses transactions, but more on that another time.
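
For readers who want to see the underlying idea outside of Splunk, here is a small Python sketch. It is not a translation of the search above: instead of counting matching events per client, it explicitly looks for clients that reached the protected page without ever requesting the login page. The function name and sample requests are hypothetical, and the log is assumed to be already parsed into (client IP, request path) pairs.

```python
def skipped_login(requests, login_page="login_form.php", protected_page="sales.php"):
    """Return sorted client IPs that hit protected_page but never login_page."""
    saw_login = set()
    saw_protected = set()
    for client_ip, path in requests:
        if login_page in path:
            saw_login.add(client_ip)
        if protected_page in path:
            saw_protected.add(client_ip)
    # Clients seen on the protected page minus those seen on the login page.
    return sorted(saw_protected - saw_login)

# Hypothetical parsed log entries.
requests = [
    ("10.0.0.1", "/login_form.php"),
    ("10.0.0.1", "/sales.php"),
    ("10.0.0.9", "/sales.php"),
]

suspects = skipped_login(requests)
print(suspects)
```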

All the Data That’s Fit to Visualize - SOURCE Boston 2008

I gave a talk at SOURCE Boston 2008. The topic this time was visualization in general and what has gone wrong in security visualization in the past. I showed how we can learn and steal from other disciplines, in this case, the New York Times. The NYT has done some pretty fantastic work in the area of data visualization. Their interactive market map, for example, is a great way of exploring stock data. During the talk, I outlined some of the design principles that the NYT graphics department uses when designing their graphs: Show - Don’t Tell.


To start my presentation, I showed a little video about security visualization (see below).

At conferences lately, I find that I am no longer the only one talking about security visualization. More and more presentations are showing visualizations. A lot of projects are using visualization to help them analyze all the data at hand. At SOURCE, Dave Dittrich from the University of Washington talked about BotNet analysis and visualizing network traffic captured from BotNets. He definitely has the challenge of displaying large amounts of data. We discussed some approaches, and parallel coordinates could possibly work for his data. Parallel coordinates are what I used in my book for some BotNet traffic analysis.

Common Event Syntax

As part of the Common Event Expression (CEE) effort, a list of field names has been published.

If log records from different log sources have to be correlated, or reports have to be generated across different log sources, a common set of field names is needed. Take a firewall log example. Assume that you have two types of firewalls in your environment: NetScreen and PIX. The two devices write different types of log entries. Assume you have a parser for each that extracts fields from its log. Each parser might name the fields differently, making it really hard, if not impossible, to correlate the two log files. Just think about reporting. How do you find the top source addresses across both logs? These are logs from each of the firewalls:

NetScreen:

May  5 17:01:40 45.2.0.1 NOC-FWa: NetScreen device_id=NOC-FWa [Root]
system-notification-00257(traffic): start_time="2006-05-05 17:01:40"
duration=0 policy_id=52 service=tcp/port:26212 proto=6 src zone=backbone
dst zone=noc-mgt action=Deny sent=0 rcvd=0 src=222.81.119.59 dst=45.2.121.102
src_port=7000 dst_port=26212

PIX:

Jan 18 12:43:50 192.168.1.1 %PIX-6-106015: Deny TCP (no connection)
from 208.58.193.69/1062 to a.b.c.d/443 flags ACK

If you report on “src”, you won’t get the “from” from the PIX log. We need unified names.
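
To make the idea concrete, here is a minimal Python sketch that maps both log formats onto one set of names so that a single report works across them. The unified names used here (src, src_port, dst, dst_port) are chosen for illustration only and are not taken from the CEE list; the regular expressions are simplified and only cover the two sample entries above (the NetScreen entry is shortened).

```python
import re
from collections import Counter

# Hypothetical unified names (src, src_port, dst, dst_port); the published
# CEE list may name these fields differently.
NETSCREEN_FIELD = re.compile(r"(src_port|dst_port|src|dst)=([\d.]+)")
PIX_FLOW = re.compile(
    r"from (?P<src>[^/\s]+)/(?P<src_port>\d+) to (?P<dst>[^/\s]+)/(?P<dst_port>\d+)"
)

def normalize(line):
    """Map a NetScreen or PIX log line onto the unified field names."""
    if "NetScreen" in line:
        return dict(NETSCREEN_FIELD.findall(line))
    match = PIX_FLOW.search(line)
    return match.groupdict() if match else {}

netscreen = ("May  5 17:01:40 45.2.0.1 NOC-FWa: NetScreen device_id=NOC-FWa "
             "action=Deny src=222.81.119.59 dst=45.2.121.102 "
             "src_port=7000 dst_port=26212")
pix = ("Jan 18 12:43:50 192.168.1.1 %PIX-6-106015: Deny TCP (no connection) "
       "from 208.58.193.69/1062 to a.b.c.d/443 flags ACK")

events = [normalize(netscreen), normalize(pix)]

# One report across both firewalls: count events per source address.
top_sources = Counter(event["src"] for event in events if "src" in event)
print(top_sources.most_common())
```

Once both parsers emit the same names, the "top source addresses" question becomes a single counting step instead of one query per device type.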

It is important to have not just a common set of names, but also a common understanding of what individual fields mean. What are the semantics of a field? For example, how do you measure a duration? In seconds? Hours? Days? What is a destination host? Is it fully qualified or just the host name itself? The field list, which can be found in this post: CEE Fields List, is a first step towards standardizing this.

Note that ArcSight’s CEF, for example, publishes a dictionary along with its log syntax. The CEE field list can be used to standardize the names across various log formats and can hopefully substitute and expand ArcSight’s dictionary.