January 8th, 2008
IT Search vs. SIEM - Data Collection - Feedback
Steve posted a comment on my blog post about IT Search vs. SIEM - Data Collection. I want to address some of his points here and show that IT search is more than a lot of people think!
- Steve writes: “Raffy mentions a small change in the syslog format causing the connector to break. Well syslog is a standard so if it would not break any standard syslog receiver, what it actually meant is that the syslog message has not changed but the content had.” - If I say that the “syslog format” changed, I mean the syslog message, the text. And yes, a changed message will break the specific syslog parser/connector. If you write a parser for, let’s say, sendmail, you can extract all the fields that sendmail logs. If I change the sendmail message, your parser won’t work anymore. Hence, your connector will break and fail to parse the message. In the worst case, it will not even collect that message at all.
- “Log Management vendors provide “knowledge” about the logs beyond simple collection.” - Agreed. The parsers or field extractions are definitely knowledge. Same with IT search. There are field extractions (see for example splunkbase.com) that you can use to extract individual fields to report on them. It’s about the way you approach data collection. If you need a parser to start with, you won’t be able to collect the data that you don’t have a connector for. That was my whole point. Nothing else. There are other differences in search vs. SIEM, but that’s a topic for a future blog entry (which is overdue, I know).
- “What Log Management vendors do is to help you ( as the user) out with the knowledge – rules that categorize important event logs from unimportant ones, alerts, reports that are configured to look for key words in the different log streams.” - Yes. I was not talking about reports, searches, dashboards, etc. in my blog post at all. However, IT search is not different. It has reports, searches, alerts, tags, classifications, etc.
- “In IT Search, there is no possibility for anything to get out of date mainly because there is no knowledge, only the ability to search the log in its native format.” - Not true at all. The question is where you impose the log format. If it’s at collection time, you run into all the problems that I talked about in my previous post. If you impose the schema at search time, as IT search does, you get pretty much the same benefits, plus a few more (dynamic schemas, multiple namespaces, etc.). And yes, this information is prone to getting out of date, which is exactly why the approach is dynamic!
- “Finally, if a Log Management vendor is storing the original log and you can search on it, your Log Management application gives you all the capability of IT Search.” - Well, sure. But would you say that searching your documents with grep is better than using a search engine like Google? I guess not. Same with IT search, which is built for quickly and efficiently searching logs, versus storing log files and grepping through them. The search language that you can use is another factor. You are not limited to simple searches; you can run all kinds of operations on the data - statistics, conversions, comparisons, etc. You are not comparing apples with apples.
Note also that Steve only addressed a subset of my issues. I hope you realize that IT search is more than just searching your log files!
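
To make the collection-time versus search-time point a bit more concrete, here is a minimal Python sketch. It is not how any particular product works; the sendmail-style log lines, the field names, and the helper functions (`collect_with_parser`, `collect_raw`, `search`) are all made up for illustration. The first collector imposes a fixed schema when the event arrives and silently drops anything that does not match. The second one stores the raw text and only extracts fields when you search.

```python
import re

# Hypothetical sendmail-style messages. The second line simulates a changed
# message text (renamed fields); the syslog framing itself is unchanged.
OLD_LINE = ("Jan  8 10:15:02 mail sendmail[1234]: m08AF2: "
            "from=<alice@example.com>, size=2048, nrcpts=1")
NEW_LINE = ("Jan  8 10:15:09 mail sendmail[1234]: m08AF3: "
            "sender=<bob@example.com>, bytes=4096, nrcpts=2")

# Collection-time schema: a fixed, message-specific parser (the "connector").
STRICT = re.compile(
    r"sendmail\[(?P<pid>\d+)\]: (?P<qid>\w+): "
    r"from=<(?P<sender>[^>]*)>, size=(?P<size>\d+), nrcpts=(?P<nrcpts>\d+)"
)

def collect_with_parser(lines):
    """Keep only events the fixed parser understands; drop the rest."""
    store = []
    for line in lines:
        match = STRICT.search(line)
        if match:
            store.append(match.groupdict())
    return store

def collect_raw(lines):
    """Keep every event as raw text; no schema is imposed here."""
    return list(lines)

# Search-time schema: a generic key=value extraction applied on demand.
KEY_VALUE = re.compile(r"(\w+)=<?([^,>\s]+)>?")

def search(raw_store, keyword):
    """Return matching events with fields extracted at search time."""
    results = []
    for line in raw_store:
        if keyword in line:
            fields = dict(KEY_VALUE.findall(line))
            fields["_raw"] = line  # the original text is always kept
            results.append(fields)
    return results

if __name__ == "__main__":
    lines = [OLD_LINE, NEW_LINE]
    print(len(collect_with_parser(lines)), "event(s) survived the parser")
    for event in search(collect_raw(lines), "sendmail"):
        print(event)
```

With the fixed parser, the changed message never makes it into the store. With raw collection, both events are searchable, and the renamed fields simply show up under their new names at search time.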


January 24th, 2008 at 2:17 am
Hey Mr Security Visualization… I just posted this comment on Steve’s blog, maybe he’ll come around. Heck, he should just download Splunk. I love it.
“To Raffy’s point about Syslog formats changing, I believe he may have meant that if information is being written to a syslog, a connector may be expecting a certain number of fields to parse out so the user can do remedial searching or reporting. If that field format changes by the application writing to syslog, the connector may have to be reconfigured or modified in some way. Tools like Splunk (wait, is there anything out there like Splunk?), just index everything and provide true search, and some cool technology that is way more malleable than brittle connector and fixed format based approaches.
I think the most important point that you are missing is most IT people supporting applications DON’T look in syslog for the real meat of what’s going on. Standard places to look, such as Event Logs and a Syslog, only contain a small amount of the data to help resolve application outages and other issues. The golden nuggets are in the complex multi-line formats that .NET, J2EE, Weblogic, Websphere, Apache, Oracle and other application developers write out to the file system as a record of everything that happens in an application. The poor first and second level support guys in a company that has a decent size infrastructure are left with a lot of log data, in wide and varying formats all over their production systems.
Finally, the incredible value of IT search in general is not only in its ability to index, but search. Search allows us to link events together because not only has it already indexed everything regardless of format, but search languages allow the expression of the search query in a more flexible manner to the user.
In my experience as a sysadmin, log management is more about lassoing and storage (hence the word “management”). IT search is about letting IT people get away from being “tool operators”, giving them technology to get work done, and show their peers and organizations just how damned good they are.”