Big Data Reduction 3: From Descriptive to Prescriptive
Welcome back! Let me just make a quick announcement before I dive into big data. As some of you might know, in 2 weeks, LiNC (Lithium Network Conference) will take place at The Westin St. Francis in SF. I’ve been to 4 LiNCs while working at Lithium, but I’m most excited about this one. Aside from the fact that this is the first time I can relax and really enjoy LiNC (since I won’t be speaking), one of the main themes this year will be data/analytics. I just can’t wait to hear the keynote from Nate Silver, the renowned statistician who applies predictive analytics to everything from baseball to presidential elections. Moreover, we plan to share a preview of my next book, The Science of Social 2!
Are you excited? Want to hear what Nate Silver has to say about predictive social analytics? You can still register for LiNC 2013, and I hope to see you there.
Alright, back to big data. So far we’ve discussed 2 classes of analytics for big data reduction, the process of extracting the critical few bits of useful information from petabytes of raw big data. Since this article builds on the previous 2, you should familiarize yourself with the concepts in the following posts:
- Big Data Reduction 1: Descriptive Analytics
- Big Data Reduction 2: Understanding Predictive Analytics
Today we will cover the last class of analytics for finding that needle of information in an ocean of big data—prescriptive analytics. Remember information << data—the information anyone can extract from big data will always be much less than the sheer volume of the big data itself. The difference is even more dramatic if we are talking about relevant and useful information.
Prescriptive Analytics: Guide
Prescriptive analytics not only predicts a possible future, it predicts multiple futures based on the decision maker’s actions. Therefore a prescriptive model is, by definition, also predictive. As such, it must be validated too. A prescriptive model can be viewed as a combination of multiple predictive models running in parallel, one for each possible input action. Since a prescriptive model is able to predict the possible consequences of different choices of action, it can also recommend the best course of action for any pre-specified outcome. The goal of most prescriptive analytics is to guide decision makers so that the decisions they make will ultimately lead to the target outcome.
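To make the "multiple predictive models running in parallel" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the action names, the toy predictors, and the `recommend_action` helper are hypothetical and not taken from any real platform. It simply asks each per-action model for its predicted outcome and recommends the action whose prediction lands closest to a pre-specified target.

```python
from typing import Callable, Dict

# Hypothetical per-action predictive models. Each one maps a context value
# (say, the current number of weekly posts) to a predicted outcome.
Predictor = Callable[[float], float]


def recommend_action(models: Dict[str, Predictor], context: float, target: float) -> str:
    """Return the action whose predicted outcome is closest to the target."""
    # Run every per-action predictive model on the same context.
    predictions = {action: model(context) for action, model in models.items()}
    # Pick the action that minimizes the gap to the desired outcome.
    return min(predictions, key=lambda action: abs(predictions[action] - target))


# Toy models, made up purely for illustration.
models = {
    "do_nothing": lambda x: x,
    "run_contest": lambda x: 1.5 * x,
    "add_badges": lambda x: x + 20.0,
}

best = recommend_action(models, context=100.0, target=150.0)
print(best)
```

In a real system each entry would be a properly validated predictive model rather than a one-line function, but the shape of the decision is the same: one prediction per candidate action, then a choice among them.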
In prescriptive analytics, we also build a predictive model of the data. This predictive model must have two more added components in order to be prescriptive:
- Actionable: The data consumers must be able to take actions based on the predicted outcome of the model
- Feedback System: The model must have a feedback system that tracks the adjusted outcome based on the action taken. This means the predictive model must be smart enough to learn the complex relationship between the user’s action and the adjusted outcome through the feedback data
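The two components above can be sketched together as a small feedback loop. This is only an illustration under stated assumptions: the `FeedbackModel` class, its method names, and the action names are hypothetical, and the learning rule is a plain running average per action, which is far simpler than the "complex relationship" a production model would have to learn.

```python
from typing import Dict


class FeedbackModel:
    """Tracks the average observed outcome for each action taken."""

    def __init__(self) -> None:
        self.count: Dict[str, int] = {}
        self.mean: Dict[str, float] = {}

    def record(self, action: str, outcome: float) -> None:
        """Feedback step: fold one observed outcome into that action's average."""
        n = self.count.get(action, 0) + 1
        previous = self.mean.get(action, 0.0)
        self.count[action] = n
        # Incremental mean update, so no history needs to be stored.
        self.mean[action] = previous + (outcome - previous) / n

    def recommend(self) -> str:
        """Actionable step: suggest the action with the best average so far."""
        return max(self.mean, key=lambda action: self.mean[action])


model = FeedbackModel()
# Made-up feedback data: (action taken, observed response rate).
model.record("send_email", 0.10)
model.record("send_email", 0.20)
model.record("reply_in_forum", 0.30)
print(model.recommend())
```

The point of the sketch is the data flow, not the model: every recommendation produces an action, every action produces an observed outcome, and that outcome flows back in to adjust the next recommendation.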
Very few social analytics are prescriptive, because so few of them are predictive, and even fewer are actionable. Of the handful that are actionable, almost none track feedback data. There are platforms that use feedback data for tuning and improving the platform, but very few use such data for prescriptive purposes.
Going from Descriptive to Prescriptive
After the initial exploratory data analysis, sophisticated analytics is needed to extract information and derive actionable insights from data. Because most business decisions involve a choice among only a few options, useful analytics must reduce big data down to a few bits that we can decide and act on. There are three general classes of analytics for data reduction and decision support:
- Descriptive Analytics: Compute descriptive statistics to summarize the data. The majority of social analytics fall in this category
- Predictive Analytics: Build a statistical model that uses existing data to predict data that we don’t have. Examples of predictive analytics include trend lines, influence scoring, sentiment analysis, etc.
- Prescriptive Analytics: Build a prescriptive model that uses not only the existing data, but also the action and feedback data to guide the decision maker to a desired outcome. Because prescriptive models must be actionable and have a feedback data stream, social analytics are rarely prescriptive
Descriptive analytics is simple: all we need is data, and there is no shortage of data in social media. For predictive analytics, you also need a properly validated model in addition to the data. Although most people focus on the model, the most important aspect of a predictive model is actually the validation. Because anyone can build a model, proper validation is the only way to know with confidence whether the model works.
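As a rough illustration of what validation means in practice, here is a small Python sketch of holdout validation for a trend line, one of the predictive examples mentioned earlier. The weekly numbers are made up, the helper names are hypothetical, and a simple train/test split with a naive baseline is just one of many validation schemes, not necessarily the one any given platform uses.

```python
from typing import Callable, List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(predict: Callable[[float], float], xs: List[float], ys: List[float]) -> float:
    """Average absolute gap between predictions and observed values."""
    return sum(abs(predict(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up weekly post counts, for illustration only.
weeks = [float(w) for w in range(10)]
posts = [12.0, 15.0, 19.0, 20.0, 26.0, 27.0, 31.0, 36.0, 37.0, 41.0]

# Fit on the first 7 weeks, hold out the last 3 for validation.
train_x, test_x = weeks[:7], weeks[7:]
train_y, test_y = posts[:7], posts[7:]

slope, intercept = fit_line(train_x, train_y)
model_error = mean_abs_error(lambda x: slope * x + intercept, test_x, test_y)

# Naive baseline: always predict the training average.
baseline = sum(train_y) / len(train_y)
baseline_error = mean_abs_error(lambda x: baseline, test_x, test_y)

print(model_error, baseline_error)
```

The model only earns our trust if it beats the naive baseline on data it never saw during fitting. Checking it against the same data it was fitted on tells us very little.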
In order to do prescriptive analytics, not only will you need a rigorously validated predictive model, but the model must also be actionable and have a feedback system that collects feedback data for each type of action. This typically increases your already-pretty-big data volume by several orders of magnitude. Therefore, prescriptive analytics is generally very challenging unless you already have a scalable data infrastructure and the talent/expertise to make sense of the feedback data (e.g. sensitivity analysis, canonical correlation analysis, causal inference).
Conclusion
Social analytics is still in its infancy: most of it is descriptive, and almost no models are prescriptive. With modern advances in machine learning, many interesting predictive models are now possible. We have no shortage of data. As big data technologies are commoditized, access to data will only increase. What we need is smart data analytics to distill petabytes of big data into actionable bits.
I have no doubt the distribution of social analytics will shift as this nascent field matures in the near future: from descriptive to predictive to prescriptive. Let’s imagine the future of analytics and have an open conversation about the possibilities.
In the meantime, I hope to see you at LiNC 2013.
Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.