Today we will cover the last class of analytics for finding that needle of information in an ocean of big data—prescriptive analytics. Remember information << data—the information anyone can extract from big data will always be much less than the sheer volume of the big data itself. The difference is even more dramatic if we are talking about relevant and useful information.
Prescriptive analytics not only predicts a possible future, it predicts multiple futures based on the decision maker’s actions. Therefore a prescriptive model is, by definition, also predictive.
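To make the "multiple futures" idea concrete, here is a minimal sketch in Python. It assumes a toy predictive model: the function names, the linear demand curve, and the candidate prices are all hypothetical and chosen only for illustration. The point is the structure: the same predictive model is run once per candidate action, and the prescription is the action whose predicted future looks best.

```python
# A toy prescriptive loop: predict one future per candidate action,
# then recommend the action with the best predicted outcome.
# The demand curve and the numbers below are illustrative assumptions.

def predict_sales(price: float) -> float:
    """Hypothetical predictive model: units sold at a given price."""
    return max(0.0, 1000.0 - 40.0 * price)


def predict_revenue(price: float) -> float:
    """Predicted outcome (revenue) if the decision maker picks this price."""
    return price * predict_sales(price)


def prescribe(candidate_prices):
    """Return the best action and the predicted future for every action."""
    futures = {price: predict_revenue(price) for price in candidate_prices}
    best_price = max(futures, key=futures.get)
    return best_price, futures


best_price, futures = prescribe([5.0, 10.0, 12.5, 20.0])
for price, revenue in futures.items():
    print(f"price={price:>5}: predicted revenue={revenue:,.0f}")
print("recommended price:", best_price)
```

Note that `prescribe` cannot work without `predict_revenue`, which is the sense in which a prescriptive model is also predictive.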
Last time we described the simplest class of analytics (i.e. descriptive analytics), which you can use to reduce your big data into much smaller but consumable bites of information. Remember, most raw data, especially big data, are not suitable for human consumption, but the information we derive from the data is.
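As a rough illustration of that reduction, the sketch below condenses a large list of raw values into a handful of summary statistics using only Python's standard library. The data are synthetic (hypothetical response times drawn from a seeded random generator), and the choice of statistics is an assumption, not a prescribed set.

```python
import random
import statistics

# Synthetic stand-in for raw data: 100,000 hypothetical response times (ms).
random.seed(0)
raw = [random.gauss(200, 30) for _ in range(100_000)]


def describe(values):
    """Reduce a long list of numbers to a few consumable summary figures."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


summary = describe(raw)
for name, value in summary.items():
    print(f"{name:>6}: {value:,.2f}")
```

Six numbers are something a person can read at a glance, while the hundred thousand raw values behind them are not.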
Today we will talk about the second class of analytics for data reduction: predictive analytics. First, let me clarify two subtle points about predictive analytics that are often confusing.
The purpose of predictive analytics is NOT to tell you what will happen in the future. No analytics can do that.
Predictive analytics are not limited to the time domain. Some of the most interesting predictive analytics in social media are non-temporal in nature.
Now that SxSW interactive is over, it’s time to get back and do some serious business. For me, that means I’ll return to the world of big data. But let me tell you a little secret: although I work with big data all the time, I never actually look at any big data, because big data isn’t made for human consumption.
No one can make any sense out of direct examination of petabytes of data, not even analysts or data scientists. You can’t even plot them on a monitor, because even the highest resolution monitors are nowhere near a petapixel. We may look at several small samples of the big data during exploratory data analysis (EDA), but that’s not big data per se, since each sample is just a tiny fraction of the whole. Frankly, I don’t know anyone who actually looks through an entire set of big data with the naked eye. Instead, we apply many sophisticated analytics to big data and let our computers crunch it down to consumable digests. And that’s where we spend most of our time: looking at the results of analyses.
By definition, an insight must provide something we don’t already know. However, we typically don’t know what we don’t know, so we can’t really look for insights: we can’t search for something if we don’t know a priori what it is. What we need to do is temporarily forget about the value proposition of the data analysis and look beyond what’s relevant to the immediate problem we are trying to solve. There is no guarantee that we will find anything in the land of irrelevance, but ironically that is usually where insights are discovered.
How exactly do we do this? That is the topic we will discuss today.