Identifying Phishing Sites in Your Events

Recently, I thought I had been caught in a phishing scheme. I created an account on an e-commerce site in order to check out, and as soon as I clicked the checkout button, the site asked me to log on to a well-known site. It turned out that the original site was badly implemented: it should have told users that it is affiliated with the other site. Nevertheless, I went to Phishtank to make sure that no one had complained about the original e-commerce site.

This got me thinking: since phishing occurs all too often, there must be a way for corporations to verify that their users are not going to phishing sites and, if they are, to know…

Asking Vendors to Make Log Events Accessible

In my last blog entry, I wrote about asking vendors to make their log event formats follow industry best practices. If the log events reside in files or can be broadcast on network ports, they are quite easy to access with technologies such as Splunk Universal Forwarders. However, if the log events are buried deep within the application, device, or system that created them, then there is one more issue to address before you can get to the events: the vendor needs to provide an accessible transport mechanism, with examples of its usage.

By transport, I am obviously not referring to some futuristic vehicle.

What I am talking about is a way for one computer…

My Data takes me back to HD Videos

Last month I wrote about indexing video feeds, and Vimeo was the site I featured for HD videos. The idea was to use the Vimeo REST API to gather all the metadata about your favorite Vimeo HD video channels and then index it into Splunk, either for historical lookup or simply to have it available in a one-stop dashboard where you can not only view the indexed information, but also use a workflow action to watch the video itself.

Then the output of the REST API, which I call from Python, changed: I was getting one huge line per channel instead of nicely formatted XML. My code had logic to skip all lines…
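To illustrate why line-oriented logic is fragile here, the following is a minimal sketch, not the code from the post, that reads the XML with Python's standard ElementTree parser so that the result no longer depends on where the line breaks fall. The element names (videos, video, id, title) and the sample payloads are hypothetical and are not taken from the Vimeo API.

```python
import xml.etree.ElementTree as ET


def video_events(xml_text):
    """Yield one dict per <video> element, however the XML is broken into lines."""
    root = ET.fromstring(xml_text)
    for video in root.iter("video"):
        # Child tag -> text; whitespace from pretty-printing is stripped.
        yield {child.tag: (child.text or "").strip() for child in video}


# Hypothetical payloads with the same content: one pretty-printed, one on a single line.
PRETTY = """<videos>
  <video>
    <id>1</id>
    <title>Sample clip</title>
  </video>
</videos>"""
ONE_LINE = "<videos><video><id>1</id><title>Sample clip</title></video></videos>"

if __name__ == "__main__":
    # Print one key="value" line per video, ready to be indexed as an event.
    for event in video_events(ONE_LINE):
        print(" ".join('{}="{}"'.format(k, v) for k, v in event.items()))
```

Both sample payloads produce the same events, which is the point of parsing the document structure instead of counting or skipping lines.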

RSS Inputs and Also the Splunk Java SDK

By now, some of you may have downloaded my reference implementation on Splunkbase for using a scripted input to index RSS feeds, or have read about the topic. The idea is that this input is very low in daily volume (possibly KBs/day as opposed to MBs/day), yet presents many different correlation opportunities from the same Splunk console. The implementation was originally written in Python and used the publicly available feedparser.py to download and parse the RSS feed. The issues I have heard about over time are that some people are not allowed to install Python on a forwarder machine, have a version of Python that may not work with feedparser.py, or simply have issues…

Indexing Feeds

We often talk about indexing the output of a program or script in Splunk as a universal way to index any type of text data, beyond monitoring log files. For those of you who may be new to Splunk, the idea behind a scripted input is that every N seconds, where N is configurable, Splunk or a forwarder calls a user-provided script or program, written in any language, that gathers data; the standard output of that script or program is then indexed by Splunk. This is the basis for many of my contributions on Splunkbase, such as indexing RSS feeds. Logically, this is a “pull” approach in that data is accessed by Splunk on…
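As a rough illustration of the scripted input idea described above, here is a minimal sketch of a script that writes one timestamped line per event to standard output. The collect() function and its field names are hypothetical placeholders for whatever a real script would gather, and the polling interval is configured on the Splunk side, so it does not appear in the script.

```python
import sys
import time


def format_event(fields, now=None):
    """Render one event as a local timestamp followed by key="value" pairs."""
    ts = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    pairs = " ".join('{}="{}"'.format(k, v) for k, v in fields.items())
    return "{} {}".format(ts, pairs)


def collect():
    """Hypothetical data source; a real script would call an API or query a device here."""
    return [{"source_name": "example", "status": "ok", "value": 42}]


def main(out=sys.stdout):
    # Every line written to standard output is text that Splunk can index.
    for fields in collect():
        out.write(format_event(fields) + "\n")


if __name__ == "__main__":
    main()
```

Leading each line with a timestamp and using key="value" pairs is one common choice for making events easy to search; other formats work as long as the script writes them to standard output.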

Indexing Video “Playlists” in Splunk

In my last blog entry, I talked about indexing radio stations’ playlists and described my reference implementation. This raises the question of whether the same approach can be used to index playlists of videos, not just songs. The answer is yes. One thing to keep in mind is that most people don’t spend time wondering which video was last played on a certain web site or cable channel so that they can purchase it. In other words, discovering new videos on TV channels is not as popular an activity as discovering new songs on the radio. Nevertheless, it is a popular activity on the web. To try this out, I created two reference implementations that you…

Monitor Radio Stations’ Playlists

At a recent Strata conference on Big Data, someone asked me if Splunk can be used to monitor what songs radio stations are currently playing and do analysis on the playlists. The answer is not only yes, but I happen to have created a reference implementation of this on Splunkbase. You can download the Monitor Radio Stations app for free and use it with your Splunk installation.

I use the YES API in a scripted input to gather events for each monitored radio station at a user-configurable 5-minute interval. The output of the scripted input is a readable XML format that lists the station name, artist, song and timestamp of the…
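A minimal sketch of what emitting such an XML event could look like is shown below. The element names and sample values are illustrative assumptions, not the actual schema of the Monitor Radio Stations app, and the call to the YES API is deliberately left out.

```python
import time
import xml.etree.ElementTree as ET


def playlist_event(station, artist, song, played_at):
    """Build one XML event; element names are illustrative, not the app's real schema."""
    event = ET.Element("event")
    # played_at is seconds since the epoch, rendered here as a UTC timestamp.
    ET.SubElement(event, "timestamp").text = time.strftime(
        "%Y-%m-%dT%H:%M:%SZ", time.gmtime(played_at))
    ET.SubElement(event, "station").text = station
    ET.SubElement(event, "artist").text = artist
    ET.SubElement(event, "song").text = song
    return ET.tostring(event, encoding="unicode")


if __name__ == "__main__":
    # Made-up values standing in for what an API call would return.
    print(playlist_event("EXAMPLE-FM", "Example Artist", "Example Song", 0))
```

Building the event with ElementTree rather than string concatenation means characters such as & in artist or song names are escaped automatically.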

Correlating with Splunk IP Watchlist

Last year, fellow Splunker Dave Croteau created a prototype that indexes daily the world’s top 100 suspicious, or in some cases malicious, IP addresses, using a list published by the dshield.org web site. One thought is that these addresses may be compromised by trojans or botnets, so you would not want them to appear as sources connecting to your network. Dave also used the Splunk Maxmind add-on to build a simple dashboard that maps these addresses to country and city with the Splunk top command.

Next, I took this app and changed the scripted input to use curl to gather the data, so that the same approach could be ported to Windows as well as *nix-based machines.…

Weather Alerts in Splunk

It’s been a couple of years since I first created the current weather conditions app that is on Splunkbase, so I decided to do something similar that is a little more timely. Current weather conditions are nice events to index, as they give a timeline of how things are going at a particular location and provide a basis for trend analysis. However, they do not provide insight into upcoming severe weather, which is a more important event to track.

Fortunately, Weather Underground provides a REST API for gathering severe weather alerts by ZIP code. I built a Python scripted input to gather these alerts, and the standard output of each call is indexed…

Indexing data into Splunk Remotely

Data can reside anywhere, and Splunk recognizes that fact by providing the concept of forwarders. The Splunk Forwarder collects data locally and sends it to a central Splunk indexer, which may reside in a remote location. One of the great advantages of this approach is that forwarders maintain an internal index of where they left off when sending data. If for some reason the Splunk indexer has to be taken offline, the forwarder can resume its task after the indexer is brought back up. Another advantage of forwarders is that they can load balance delivery to multiple indexers. Even a Splunk Light Forwarder (a forwarder that consumes minimal CPU resources and network bandwidth) can participate in an
