Steps for implementing Fraud Detection
A couple of years ago, I wrote about how easy it is to detect fraud, mostly in the financial services industry, using Splunk Enterprise in a blog article. What I provided were the last steps on using the Splunk Search Processing Language to accomplish the task. However, for most people, who are new to Splunk, that doesn’t really help as it only gives you a prescription after you’ve uncovered the symptoms and, should I say, possible disease.
Today, I’d like to step back a little bit and give you the full high level steps on implementing fraud detection for your needs. This may make the previous article a little more clear.
Understand Your Use Cases
Before you do anything, …
Please Bypass the Database
It has been a while since I posted to these pages and I am sure there may be one or two of you who misses my erudite musings or as some may say ramblings of a longtime Splunker. Either way, here’s my first post for 2015.
I have noticed that there are a quite a few deployments in the world that write time series data to a log rotated file and have another process translate those events into a rows and columns to be ingested into a relational database. After this extract, translate, and load process (ETL), they then use SQL to gather their database records either add-hoc search or for aggregate reporting. This practice has been going on for …
Updated Traffic App
A few years ago, I created a publicly available traffic app for monitoring traffic incidents in major US cities configured by user. Since then, the provider of the feed has cut down on the number of cities they monitor and no longer provide incident counts per intersection. Nevertheless, they still provide a Jam Factor. A Jam Factor is a subjective number provided for a roadway that indicates how busy (or jammed) the roadway is.
For my reference implementation, I used this Jam factor field to visually allow you to to see your city’s (assuming the provider covers it) current Jam Factor for major highways. This updated traffic app that you can download has new dashboards that you can use to …
Updated Keyword App
Last year I created a simple app called Keyword that consists of a series of form search dashboards that perform Splunk searches in the background without having to know the Splunk search language. You can read about the original app here and see how it easy it is to use. This year, I added some dashboards for the Rare Command, but I didn’t think it was newsworthy to blog about it.
Then, Joe Welsh wrote a blog entry about using the cluster command in Splunk, which allows you to find anomalies using a log reduction approach. Joe’s example using Nagios is easy to follow and gives the novice a useful approach to get rare events. So, using this approach, I …
Splunk as a Recipient on the JMS Grid
A number of years ago, I was fascinated by the idea of SETI@home. The idea was that home computers, while idling, would be sent calculations to perform in the search for extraterrestrial life. If you wanted to participate, you would register your computer with the project and your unused cycles would be utilized for calculations sent back to the main servers. You could call it a poor man’s grid, but I thought it of it as a massive extension for overworked servers. I thought the whole idea could be applied to the Java Messaging Service (JMS) used in J2EE application servers.
Almost a decade ago, I would walk around corporations at “closing” time and see a mass array …
Another NY Metro Splunk Users Group Meeting
We had our first NY Metro Splunk Users Group meeting of the year this week and it was hosted at Blackrock in NYC with Reed Kelly, one of the leaders of the users group playing host. Thanks Reed.
Our first order of business was to watch a presentation from Splunk Product Manager Jack Coates on the new 3.0 Splunk Common Information Model. Unlike the past CIM that focused heavily on security, the new CIM is general purpose for all of IT and flexible to add more knowledge to it, when needed. As a bonus, the app in the app store has data models to quickly get started and test your data sources.
Next, we had a discussion (or some …
Using Splunk as a data store for developers
A number of years ago, I wrote a blog entry called Everybody Splunk with the Splunk SDK, which succinctly encouraged developers to put data into Splunk for their applications and then search on the indexed data to avoid doing sequential search on unstructured text. Since it’s been a while and I don’t expect people to memorize the dissertations of ancient history (to paraphrase Bob Dylan), I’ve decided to write about the topic again, but this time in more detail with explanations on how to proceed.
Why Splunk as a Data Store?
Some may proclaim that there are many no-sql like data stores out there already, so why use Splunk for an application data store? The answers point to simplicity, …
Over the day in the life of a Splunk user, he or she probably utilizes less than 50% of the available Splunk commands. It may be that the most popular commands such as stats, transaction, eval, top, timechart, chart, etc are already sufficient enough to do the types of manipulation and reporting that is required for the use case. Another way to look at it is that the other commands are not being utilized because of their lack of high cardinally and hence popularity in the abundant Splunk blogs, documentation, wiki’s, and answers.
In order to provide more awareness for many of these commands that are not as prevalent in use for the Splunk community, the field engineers at Splunk …
I sometimes get asked if Splunk can detect fraud. The answer is yes, but the question is broad and needs an understanding of the situation that needs to be detected before making a generalization. Fraud here means using deceptive techniques for gains, which for the most part may be illegal. The two textbook ways to detect fraud usually involve pattern matching or statistical anomalies (or a combination of each).
Let me describe a real-life fraud detector. A few years ago, I used to work for an enterprise software company that used Mantas (which has since been acquired by Oracle) as a partner to detect money laundering activity. The software would load financial systems data into a database and run algorithms …
2nd NY Metro Splunk Users Group Meeting
The 2nd annual (hope to become quarterly) NY Splunk Users Group meeting was held yesterday at Tumblr headquarters with our host Mackenzie Kosut graciously allowing the group to use Tumblr’s meeting room for the meet-up. Splunkers Matty Settipane and Sean Blake were the guest speakers talking about large scale deployment.
In today’s Splunk architecture, the discussions are not just about indexers and search heads, but they also include search head pooling, deployment servers, job servers, external syslog collectors, cluster masters, roles, and inspecting searches for performance bottlenecks. If these topics are interesting to you, please contact fellow Splunk customers and Splunkers to form a discussion.
From the feedback we got, it looks like we’ll have another meeting in a few …