Your most important IT data: funny quotes
bash.org is a natural dataset for splunking. It’s a huge blob of loosely structured text data, and it’s made of win.
To play with a live instance, go to bash.splunklabs.com and log in as guest with the password guest.
Of course, Splunk can reproduce what the site itself already does. For example, we can find the top 100 IRC quotes:
Splunk lets us do considerably more, though. What are the top one-liners?
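For readers without Splunk handy, the one-liner question can be roughed out in plain Python. This is only a sketch under assumptions: the `Quote` record, its `quote_id`, `score`, and `text` fields, and the sample entries are all invented here for illustration and are not taken from the bash.org data or the Splunk instance.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    # Hypothetical record layout; the real index may use other field names.
    quote_id: int
    score: int
    text: str


def is_one_liner(quote: Quote) -> bool:
    """A quote counts as a one-liner if it has exactly one non-blank line."""
    lines = [line for line in quote.text.splitlines() if line.strip()]
    return len(lines) == 1


def top_one_liners(quotes, limit=10):
    """Return the highest-scoring one-liners, best first."""
    one_liners = [q for q in quotes if is_one_liner(q)]
    return sorted(one_liners, key=lambda q: q.score, reverse=True)[:limit]


# Made-up sample records, for illustration only.
SAMPLE = [
    Quote(1, 120, "<alice> it works on my machine"),
    Quote(2, 300, "<bob> hi\n<carol> bye"),
    Quote(3, 45, "<dave> brb, production is on fire"),
]

if __name__ == "__main__":
    for q in top_one_liners(SAMPLE):
        print(q.score, q.text)
```

The two-line quote is dropped despite its higher score, which is the whole point of filtering before sorting.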
How many more quotes mention “girlfriend” than “boyfriend”, i.e. exactly how bad is this sausage party?
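The same comparison can be sketched outside Splunk by counting the quotes that contain each word. This is an illustration only: `count_mentions` is a hypothetical helper, and the sample quotes are made up rather than pulled from bash.org.

```python
import re


def count_mentions(quotes, word):
    """Count quotes that contain `word` as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return sum(1 for text in quotes if pattern.search(text))


# Made-up sample quotes, for illustration only.
SAMPLE_QUOTES = [
    "<alice> my girlfriend says I spend too much time on IRC",
    "<bob> what's a girlfriend?",
    "<carol> my boyfriend just rebooted the router again",
    "<dave> anyone know regex?",
]

if __name__ == "__main__":
    girlfriend = count_mentions(SAMPLE_QUOTES, "girlfriend")
    boyfriend = count_mentions(SAMPLE_QUOTES, "boyfriend")
    print("girlfriend:", girlfriend, "boyfriend:", boyfriend)
```

Each quote is counted at most once per word, and the whole-word match means plurals such as "girlfriends" are not counted; loosen the pattern if those should be included.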
Are there any commonly quoted individuals?
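One rough way to answer this without Splunk is to pull the nick out of each `<nick> message` line and count how many quotes each nick appears in. The sketch below assumes that line format, assumes an optional leading channel-mode character such as `@` or `+`, and uses made-up sample quotes; `count_quoted_nicks` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed line format: "<nick> message", with an optional mode prefix like @ or +.
NICK_RE = re.compile(r"^[ \t]*<[ \t]*[@+%~&]?([^>\s]+)[ \t]*>", re.MULTILINE)


def count_quoted_nicks(quotes):
    """Count, per nick, the number of quotes in which that nick speaks."""
    counts = Counter()
    for text in quotes:
        # A set ensures a nick is counted once per quote, however chatty.
        nicks = {nick.lower() for nick in NICK_RE.findall(text)}
        counts.update(nicks)
    return counts


# Made-up sample quotes, for illustration only.
SAMPLE_QUOTES = [
    "<alice> hello\n<@bob> hi",
    "<Bob> brb",
    "<carol> lol\n<carol> wait, what",
]

if __name__ == "__main__":
    for nick, n in count_quoted_nicks(SAMPLE_QUOTES).most_common(5):
        print(nick, n)
```

Nicks are lowercased before counting so that `Bob` and `@bob` land in the same bucket; whether that merging is desirable depends on the data.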
Are there any interesting trends in quote scores over time? Take a look at high quote scores vs. quote ID:
It seems likely that older quotes, especially good ones, benefit from …