Total Data: Size Doesn’t Matter
Matt Aslett, 451 analyst extraordinaire, has unleashed a new term on the world, “Total Data.” The inspiration comes from two sources: 1. Aslett’s dissatisfaction with the nebulous term “Big Data” and 2. the Dutch “Total Football” (totaalvoetbal) system, pioneered in the 1970s with the help of footballing wizard Johan Cruyff. Being a sucker for technology articles that make judicious use of football analogies, I naturally aimed to digest his post in its totality (ha!).
My initial thought, I must admit, was “Really, Matt? Total data? Really?” Football analogy or not, it sounded hokey and a tad contrived (“My total data is bigger than your big data”). But then a funny thing happened – the term grew on me, especially once I understood where he was coming from. His point, if I may take the liberty of boiling it down thusly, is that size doesn’t matter and that big data is actually a misnomer, because of one key element: as the term is used today, it seems to be a subset of all data, not a superset as the name implies. Specifically, 451 defined big data as “…a term applied to data sets that are large, complex and dynamic (or a combination thereof) and for which there is a requirement to capture, manage and process the data set in its entirety, such that it is not possible to process the data using traditional software tools and analytic techniques within tolerable time frames.” Much of the confusion stems from the fact that big data doesn’t appear to refer to traditional data storage and management, although the solutions in that category certainly do manage rather large data sets. So, Aslett’s reasoning goes, if you exclude traditional databases, storage, MDM, BI and the like, you’re left with a market subset that’s actually quite small.
Hence “total data” – defined by 451 as “a broader approach to data management, managing the storage and processing of big data to deliver the necessary BI.” As terms of art go, this one might have staying power. However, “total data” does not refer to the totality of data, which I suspect might be one’s interpretation at first glance (it certainly was mine). This is where the Total Football analogy comes in. Total Football was about guys on the field playing multiple positions and popping up in multiple locations, depending on where the ball was at any given time. It was more about directional flow than remaining in a static position on the field. Total data, as defined by Aslett and 451, seems to be about positional data or data vectors. The exact content of the data doesn’t seem to be as important as where the data has been, where it’s going, and for what purpose. If I may, and I apologize for this ahead of time, it’s not the size of the data, it’s what you do with it that matters.
At Splunk, we have never defined ourselves as “Big Data” because, although our customers’ data sets are indeed “big”, the label never seemed to fully encompass what Splunk is about, which is making connections between disparate data streams, visualizing them, and analyzing those connections to draw conclusions. To us, a term like “total data” seems to be a much better description of the world in which we live, because no matter the source or size of the data, chances are it can be splunked.