Big Data Reduction 3: From Descriptive to Prescriptive

Lithium Alumni (Retired)


Welcome back! Let me just make a quick announcement before I dive into big data. As some of you might know, in 2 weeks, LiNC (Lithium Network Conference) will take place at The Westin St. Francis in SF. I’ve been to 4 LiNCs while working at Lithium, but I’m most excited about this one. Aside from the fact that this is the first time I can relax and really enjoy LiNC (since I won’t be speaking), one of the main themes this year will be data/analytics. I just can’t wait to hear the keynote from Nate Silver, the renowned statistician who applies predictive analytics to everything from baseball to presidential elections. Moreover, we have planned to share a preview of my next book, The Science of Social 2!


Are you excited? Want to hear what Nate Silver has to say about predictive social analytics? You can still register for LiNC 2013, and I hope to see you there.


Alright, back to big data. So far we’ve discussed 2 classes of analytics for big data reduction: the process of extracting the critical few bits of useful information from petabytes of raw big data. Since this article builds on the previous 2, you should familiarize yourself with the concepts in the following posts:

  1. Big Data Reduction 1: Descriptive Analytics
  2. Big Data Reduction 2: Understanding Predictive Analytics


Today we will cover the last class of analytics for finding that needle of information in an ocean of big data—prescriptive analytics. Remember information << data—the information anyone can extract from big data will always be much less than the sheer volume of the big data itself. The difference is even more dramatic if we are talking about relevant and useful information.


Prescriptive Analytics: Guide

Prescriptive analytics not only predicts a possible future, it predicts multiple futures based on the decision maker’s actions. Therefore a prescriptive model is, by definition, also predictive. As such, it must be validated too. A prescriptive model can be viewed as a combination of multiple predictive models running in parallel, one for each possible input action. Since a prescriptive model is able to predict the possible consequences of different choices of action, it can also recommend the best course of action for any pre-specified outcome. The goal of most prescriptive analytics is to guide the decision maker so the decisions he makes will ultimately lead to the target outcome.
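To make the "one predictive model per action" view concrete, here is a minimal sketch. The three actions, their toy models, and the `recommend` helper are all hypothetical and exist only for illustration; a real system would use validated models in their place.

```python
from typing import Callable, Dict

def recommend(models: Dict[str, Callable[[float], float]],
              context: float, target: float) -> str:
    """Run every per-action model and return the action whose
    predicted outcome lands closest to the target outcome."""
    predictions = {action: model(context) for action, model in models.items()}
    return min(predictions, key=lambda action: abs(predictions[action] - target))

# Hypothetical predictive models, one per possible action.
# Each maps the current value of some metric to its predicted future value.
models = {
    "discount": lambda x: 1.2 * x,
    "no_change": lambda x: 1.0 * x,
    "premium_support": lambda x: 1.5 * x,
}

# Predictions are 120, 100, and 150; the one closest to 140 wins.
best = recommend(models, context=100.0, target=140.0)
print(best)
```

The recommendation is only as good as the per-action predictions behind it, which is why each of those models still has to be validated.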


In prescriptive analytics, we also build a predictive model of the data. This predictive model must have two more added components in order to be prescriptive:

  1. Actionable: The data consumers must be able to take actions based on the predicted outcome of the model
  2. Feedback System: The model must have a feedback system that tracks the adjusted outcome based on the action taken. This means the predictive model must be smart enough to learn the complex relationship between the user’s action and the adjusted outcome through the feedback data
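As a rough illustration of the feedback component, the sketch below keeps a running average of the observed outcome for each action and updates it every time new feedback arrives. The `FeedbackTracker` class and the sample numbers are assumptions made for this example; real feedback models are usually far richer than a per-action average.

```python
from collections import defaultdict

class FeedbackTracker:
    """Tracks the average observed outcome of each action (hypothetical helper)."""

    def __init__(self) -> None:
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def record(self, action: str, outcome: float) -> None:
        # Incremental mean: no need to store every past observation.
        self.count[action] += 1
        self.mean[action] += (outcome - self.mean[action]) / self.count[action]

    def best_action(self) -> str:
        # The action with the highest average outcome observed so far.
        return max(self.mean, key=self.mean.get)

tracker = FeedbackTracker()
tracker.record("email_offer", 1.0)
tracker.record("email_offer", 3.0)
tracker.record("phone_call", 1.5)
print(tracker.best_action(), dict(tracker.mean))
```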


Very few social analytics are prescriptive, because so few of them are predictive, and even fewer are actionable. Of the handful that are actionable, almost none track feedback data. There are platforms that use feedback data for tuning and improving the platform, but very few use such data for prescriptive purposes.


Going from Descriptive to Prescriptive

After the initial exploratory data analysis, sophisticated analytics is needed to extract information and derive actionable insights from data. Because most business decisions involve a choice among only a few options, useful analytics must reduce big data down to a few bits that we can decide and act on. There are three general classes of analytics for data reduction and decision support:


  1. Descriptive Analytics: Compute descriptive statistics to summarize the data. The majority of social analytics fall in this category
  2. Predictive Analytics: Build a statistical model that uses existing data to predict data that we don’t have. Examples of predictive analytics include trend lines, influence scoring, sentiment analysis, etc.
  3. Prescriptive Analytics: Build a prescriptive model that uses not only the existing data, but also the action and feedback data to guide the decision maker to a desired outcome. Because prescriptive models must be actionable and have a feedback data stream, social analytics are rarely prescriptive


Descriptive analytics is simple: all we need is data, and there is no shortage of data in social media. For predictive analytics, you also need a properly validated model in addition to the data. Although most people focus on the model, the most important aspect of a predictive model is actually the validation. Because anyone can build a model, proper validation is the only way we know, with certainty, whether the model works.


In order to do prescriptive analytics, not only will you need a rigorously validated predictive model, but the model must also be actionable and have a feedback system that collects feedback data for each type of action. This typically increases your already-pretty-big data volume by several orders of magnitude. Therefore, prescriptive analytics is generally very challenging unless you already have a scalable data infrastructure and the talent/expertise to make sense of the feedback data (e.g. sensitivity analysis, canonical correlation analysis, causal inference).



Social analytics is still in its infancy: most of it is descriptive, and almost no models are prescriptive. With modern advances in machine learning, many interesting predictive models are now possible. We have no shortage of data. As big data technologies are commoditized, the accessibility of data will only increase. What we need is smart data analytics to distill petabytes of big data into actionable bits.


I have no doubt the distribution of social analytics will shift as this nascent field matures in the near future: from descriptive to predictive to prescriptive. Let’s imagine the future of analytics and have an open conversation about the possibilities.


In the meantime, I hope to see you at LiNC 2013.



Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.


Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.

Not applicable

I just barely think that predictive analytics can go beyond fairly simple inferences based on geo-location. The concept of prescriptive analytics seems to imply that you think analytics can become an automated system, albeit with feedback loops, to guide action by a client (organization). I find this whole concept difficult to swallow as a person with a PhD in social science. People are predictable until the point that something significant changes in their daily lives and, at that point, they are no longer.

Lithium Alumni (Retired)

Hello Larry,


Thank you for commenting.


First of all, predictive analytics can go way beyond geo-location. Have you read the previous installment in this data reduction mini-series? There are a lot of very interesting emergent predictive analytics. In my previous post, the 2 non-trivial examples I gave were influence scoring and sentiment analysis. But there are many more. If it is of interest, maybe I will write a post summarizing some of the more interesting predictive analytics I've seen. Let me know...


Second, prescriptive analytics is not the same as automation. Remember, the function of prescriptive analytics is to guide decision making, not to make the decision for us. Ultimately, human judgment is required. Moreover, what recommendation you get will depend a lot on how well you can specify the desired outcome.


But you are right in that individual behaviors are very difficult to predict. Collective behaviors, however, are still very predictable. And in business, we rarely rely on single user behavior to monetize, especially for consumer brands. After all, prescriptive analytics is not new to business intelligence systems. It has been around for quite a while.


Back to the individual vs. collective behavior. Even if we predict barely above chance (say we get only 50.1% of the population correct and 49.9% wrong), if the population is large enough, that small differential predictability can still result in a huge gain for the business. That is in fact what people do on Wall Street, even though some of them are not doing such a great job now.
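A quick back-of-the-envelope sketch of that arithmetic follows. It assumes, purely for illustration, that every correct prediction gains one unit and every wrong prediction loses one unit; the `net_gain` helper and the population sizes are hypothetical.

```python
def net_gain(population: int, correct_per_thousand: int, payoff: float = 1.0) -> float:
    """Net payoff when each correct prediction earns `payoff` and each
    wrong one costs `payoff` (an assumed, symmetric payoff)."""
    correct = population * correct_per_thousand // 1000  # integer math avoids float drift
    wrong = population - correct
    return (correct - wrong) * payoff

# 50.1% correct vs. 49.9% wrong is a net edge of 0.2% of the population.
small = net_gain(1_000, 501)        # 501 right, 499 wrong
large = net_gain(10_000_000, 501)   # 5,010,000 right, 4,990,000 wrong
print(small, large)
```

The edge per person is tiny, but it scales linearly with the size of the population, which is the point of the argument.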


Moreover, the feedback loop allows you to adapt with new data as people change. Again, if 1 person changes, that is not a concern for the prescriptive system. Only when a large enough population changes, and changes in the same way so that it affects the outcome, would the data inform the decision maker that something has changed. Then it is the decision maker's choice whether s/he would like to change his or her course of action.


Alright, I hope this addresses some of your concerns. If you still have questions, I'm happy to discuss and dig deeper. That is how we learn and improve.


See you next time.


Not applicable

There are some well tested industry definitions for the "analytics value chain" from "descriptive" to "predictive" to "prescriptive" which make more sense, I believe. Note also that the term "Prescriptive Analytics" is a (tm) registered by someone already doing years of deep research on this - and already delivering technological solutions that work. Some historical context would probably be good.


For me, the factor missing here is "time" as in "real-time". As Rafiki in The Lion King said: "What does it matter, it is in the past". As we proliferate Big Data, the analytics of greatest interest are on the most recent data. Relevance is a function of "time," and especially so for Social Analytics.


In the end, as data from social interactions ages (gets older), it's no longer relevant. Hence, "prescriptive analytics" for "social" is going to be processed in real time on real-time streams of data. These are still HUGE in size and scope, but usurp any meaningful interest in large volumes of historical data. I suggest we lessen our use of "extracted from" as that implies collecting, storing, managing data..., which is of less interest than knowing what is going on "right now."


The big analytics companies are wrestling with this now. As the business community changes instantly, pivot strategies are no longer something that can be done in October's annual budget process. Hence prescriptive analytics for strategy are rapidly shifting to sensitivity analysis on real-time streams of data from social content. Businesses that rely on the last five years of historical data to make strategy decisions for the immediate or mid-term future are likely to follow the US auto industry circa 2007/8.


Industries that will really benefit from this include Healthcare (I want the best medical algorithmic solutions applied to my diagnosis, rather than what was done last year, with my current and dynamically changing symptoms processed in real time by advanced sensor technology).


Another will be Resources and Energy - from Upstream to Downstream, to Retail. Think "distribution," "the grid," and "smart meters." The potential for efficiency here is immense.


Another will be government and defense. One of the last bastions of innovation adoption is in government. It will take this kind of shift for government and legislation to know what the public is thinking, and deliver on representation in a way we can only imagine. Defense is already going this way.


Last point to drive the real-time concept here is that everything, and I mean EVERYTHING, will have a sensor on it or in it. From your pet, to your car, to your kid's lunch box. Let your mind wander as you ponder this one. Check out #IndustrialInternet from GE - this is BIG.

Not applicable

Hey Michael,


Yes, I see your point about prescription not requiring automation. However, I think you need to be clear to your readers WHY it is only a guide (do you mean heuristic?) and not a "rule to follow". The distinction between the two has been a critical one for the last 30 years in AI and all its derivatives. Either you build the logic into a system, and it becomes automated, or you leave it as advice about the most likely decisions leading to an explicit outcome, and it remains contingent on human judgment. If the latter is your point then predictive analytics is sufficient for the decision-making involved.


Someday perhaps I can take the time to engage you on the discussion of influence and sentiment analysis that you offered here in the past. Like many in the "hard" sciences who turn their attention to social things, and here I put you in the same basket with Duncan Watts whose work I truly respect, you do seem to forget that human beings take action socially, and that they in fact take meaningful action when they engage in activity on social networks. Social meaning works at the group and individual level. Otherwise, we wouldn't have need of subjective probability.






Lithium Alumni (Retired)

Hello Andrew,


Thx for the comment. 


I apologize that I am not aware of the trademark for prescriptive analytics. I was referring to the general class of analytics that prescribes possible courses of action, rather than any proprietary system.


Concerning the real time issue: that should be factored into all analytics (regardless of whether they are descriptive, predictive, or prescriptive) depending on what kind of problem you are trying to solve. Some problems do require real time data, and others do not. To be realistic, nothing is truly real-time. There is always some delay. Is 1 minute real time enough? Is a 10 second delay real time enough? How about 1 second?


A problem that requires real time data is something like location based recommendation of restaurants when you are traveling to a new place you have never been to before. But if you are trying to predict what books I would buy next, then you don't really need real time data. Amazon's recommendation system is a good example of that. You can use historical data to get a pretty good (or at least good enough) recommendation. You probably don't want to use data that is more than 5 years old, b/c people's reading interests do change, but they don't change from day to day. It's more like on the order of many months to years.


In my opinion, real time is somewhat overrated. It does, however, represent a class of inference that we could not perform previously. So it is rather new, and as a result, there is much hype with this new tool, as with anything else. But as big data technologies are married to sophisticated analytics and machine learning, near real-time might become possible and maybe even commoditized one day. Then real-time vs. long historical analysis will be on an even plane. Then people will start to realize that there is value in, and a place for, both. Wikipedia is a great example of knowledge that doesn't change much.


You are right that there are definitely plenty of great use cases in health care, government, and defense. And the internet of things will definitely add a whole new dimension to data, b/c interaction can happen not only between humans, but also between humans and things, and among things.


Thx for the interesting discussion. 

These are great points you raised. And I hope to see you again next time.

Lithium Alumni (Retired)

Hello Larry,


Thank you for continuing the conversation.


And I appreciate the criticality. I should have given some examples to clarify why prescriptive analytics should only guide decision making. The answer is really that the outcome typically consists of many variables, so a human still needs to make the tradeoff. So let me try to illustrate what I meant with an example.


Say you have a prescriptive analytics system that takes social data and predicts some outcome based on certain actions you take to address your customers. You may be provided with 2 courses of action, and each one may lead to a series of outcome variables. For example:


  1. Action 1 has a predicted outcome of: increases share of wallet by 20%, but lowers customer loyalty (say, increases churn by 5%)
  2. Action 2 has a predicted outcome of: reduces share of wallet by 20%, but increases the frequency of purchase by 10%

So you, as a decision maker, would have to decide whether the 5% increase in churn is worth the 20% increase in share of wallet. In some industries the answer is definitely yes, but in industries where loyalty is very important and CLV is very high, it may not be a good tradeoff. Blindly setting up rules for these tradeoffs is usually not advisable. So prescriptive analytics gives you a view of the possible outcomes based on your choice of action. But you still have to choose what actions to take.
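Here is a small sketch of why the final choice stays with the decision maker. It takes the predicted changes from the two actions above and scores them under two hypothetical sets of weights; the weights, the `score` function, and the `preferred` helper are assumptions for illustration, not a recommended rule.

```python
# Predicted percentage changes per action, taken from the example above.
predicted = {
    "Action 1": {"share_of_wallet": 20.0, "churn": 5.0},
    "Action 2": {"share_of_wallet": -20.0, "purchase_frequency": 10.0},
}

def score(outcome: dict, weights: dict) -> float:
    """Weighted sum of predicted changes; metrics without a weight count as 0."""
    return sum(weights.get(metric, 0.0) * change for metric, change in outcome.items())

def preferred(weights: dict) -> str:
    """The action with the highest weighted score under the given weights."""
    return max(predicted, key=lambda action: score(predicted[action], weights))

# Hypothetical weights: churn is a cost, so its weight is negative.
growth_focused = {"share_of_wallet": 1.0, "churn": -1.0, "purchase_frequency": 1.0}
loyalty_focused = {"share_of_wallet": 1.0, "churn": -10.0, "purchase_frequency": 1.0}

print(preferred(growth_focused))   # scores: Action 1 = 15, Action 2 = -10
print(preferred(loyalty_focused))  # scores: Action 1 = -30, Action 2 = -10
```

The same predictions lead to different choices depending on how much the business values loyalty, and setting those weights is exactly the judgment call that the human has to make.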


I hope this addresses your concern.


Thank you again for your nice comment and for thinking so highly of me. I appreciate your interest in my work. I do recognize that people take action socially. Sometimes people may do things for no particular reason at all. That is why we are not trying to predict individual actions. Rather, we predict collective actions. The same can be said about any sociological principle as well. Take homophily: how true is it? Is it true at the individual level? Is every one of my friends really more similar to me socioeconomically than someone I don't know? Probably not. But at the population level, there is a higher probability that it is true. The same goes for social phenomena like triadic closure, etc. They are predictions about collective behavior, not individual behaviors.


Anyway, thx for the discussion. 

I do hope that I will have the opportunity to engage with you sometime.

See you again on Lithosphere.


Not applicable

Thanks for the reply Michael.


I've always been one to see the magic of thinking big, VERY BIG. 


I would say that Amazon (and I know that, to some degree, they are already working on this) is focused on real time for precisely the reasons you indicate are NOT important. Today, following the Netflix approach, Amazon uses "Descriptive" analytics to process what you have purchased in the past, to predict what books, videos, and things you might like in the future following patterns of other like individuals (this is your vision of Amazon).


An example of my vision of Amazon's future will be to "sense" when one goes to the movies and identify the particular movie you are watching (there are multiple ways to do this already, but suffice it to say that your smart phone can be located to within 5 meters, or less, depending on the technology used). Hence, if Amazon knows "real time data about YOU", they can offer you something relevant to your immediate experience. In the example of the movie, they can offer you the opportunity to purchase the book of that movie, pre-order the DVD when it comes out, send you to a restaurant of the genre that matches the movie you "just experienced"..., and more. This "prescriptive analytics" approach is far more powerful than the Netflix "like others" descriptive approach. Amazon plans to "prescribe" things that you might like based on your real-time experience.


FWIW, Groupon will do this too, they have the right people there (some ex Amazon executives...)


Four years ago I published an article about "Position and Movement Analytics", a better description of "location analytics" or LBS, as it takes directly into account the dynamics of movement, which is relative to "time".


Real time is, of course, relative to the problem you are looking at. It could be the last 30 seconds, as in healthcare when a tool is telling a physician about changes in your heart's arrhythmia at that moment. Or in the Amazon example, it could be 2 hours, the length of the movie you are watching. The faster we can get the "experts" off of their legacy cost-focus of Big Data, storage, management, access, and so on, and on to building solutions that create value... Well, that's the future we all want!


In any event, I don't see "time" as at all related to "descriptive analytics". That is simple processing and dashboarding of history. History is static; it doesn't change. Predictive has a factor of time, referring to scenario possibilities in the future. But Prescriptive is ALL based on time.


My son, studying actuarial science at the university level, has already learned to apply simple calculus and limit theory to the cases I describe. The next generation is going to do some very interesting things, pushing the limits of Prescriptive much farther.


Gotta' think much BIGGER..., I believe. 

Lithium Alumni (Retired)

Hello Andrew,


Thanks for continuing the conversation.


I don’t mind thinking big. And I do think big. I just have to be careful who I talk to when I’m thinking big. To me, there is thinking big realistically, and thinking big day dreaming.


If you like, I invite you to take a look at a piece I wrote for Wharton Future of Advertising. That is one example of my way of thinking big realistically.


Thinking big day dreaming would be anything and everything is possible in the future.


Not only can analytics systems prescribe in real time, they can actually do it faster than real time by predicting what will happen in the future. Moreover, with enough data from bio-sensors on the mobile device + computing power, machines will be able to anticipate my actions before I actually take any action. So they would know even the choices I make before I make them. That would basically allow everyone to map out his entire life, prescribing every action he needs to take in his life as soon as he is born. You might start to think that I’m just day dreaming…


Alright, maybe that is too big, so let me tell you how this can happen more realistically. How about we give someone, at birth, a mobile device that can communicate with implanted neural and biochemical sensors transmitting signals to it. Then the machine can learn all his neural and biochemical patterns as he grows up, and learn all the choices he makes, his tastes, and his decision making patterns. Then when he’s 18, he can say, “I want to retire in the French countryside, have my own vineyard, and make my own wine when I’m 68 years old.” Then the machine will map out all the actions he needs to take in order to achieve his goal. After all, the machine has had 18 years to learn how this person works at the neural and biochemical level. Moreover, as he takes these actions, the machine continues to update the possible futures, mapping out the next action he needs to take to best achieve the life he wants at 68. It’s definitely not impossible. With the internet of things and ubiquitous sensor networks, this is not too far-fetched.


That said, I totally agree that real time offers a lot of opportunities, b/c it’s something that we couldn’t do before. But give it 10 years or so after real time prescriptive analytics is made popular. Then it will just be another tool. There are problems that need real time, and there are other problems that need long history. Which analytics people choose will fall back to what they are trying to achieve, rather than going for the new shiny toys.


Same for this life mapper prescriptive system too. Even if it exists 10 years from now, after 20 years it will just be a tool, and there will still be problems that use plain old descriptive analytics. Right now prescriptive problems are probably less than 1%, so we see a huge opportunity. But once it is commoditized, the types of problems that use descriptive, predictive, and prescriptive analytics will probably be pretty even.


Yup, the next generation will always outdo us, b/c they have all the knowledge and technologies we created. No doubt about that.


Thanks for the interesting conversation.

Most of my physicist and mathematician friends are more like realists. So thank you for prompting me to think bigger.


Hope to see you again on Lithosphere.


Not applicable

Hi Michael,


Now that is the kind of big thinking I like.  And, I fully understand your point about audience.


Next, we must discuss unexpected and unpredictable events. They will always exist, and have an effect on outcome. In your example, planning out living in the south of France, if the subject were to be blessed with a child, or experience an accident, for example, things could change the outcome. Or, one could serendipitously change their mind.


Predicting nature's course is something we can approximate, but I do believe we're a good bit away from connecting the human genome, neuroscience, and ultimate final behavior.


Two individuals could plan out that they will win the Olympic 400m relay. But this depends on other team members, individual human performance, weather, and other environmental factors - and thus can't fully be prescribed analytically. And, of course, there is the case where there is one outcome (winning a foot race at the Olympics) that is sought by two individuals. The computations will happen, but only one will win.


I like the big examples of prescriptive, with respect to decision making, social, marketing as strategy (a la Kumar), and business. The distribution curve continues to live and thrive there, and prove  we can create value through prescriptive analytics.


All the best - 

Andrew Stein

Lithium Alumni (Retired)

Hello Andrew,


Thx for the discussion.


If the feedback system records these unexpected events, then there is no need to model and predict them. There are several reasons for that.


First is simple… Unexpected events are by definition unpredictable. If they are predictable, then they are not unexpected anymore.  ;)


Second, if the feedback system records these events continuously with the rest of the data stream from the person’s life, then it pretty much works like a GPS. Think of these unexpected events as missing a turn or missing an exit. Your GPS can’t predict that. But once that happens, your GPS can detect that something unexpected has taken you significantly off course, and then it re-routes you to the destination from where you are.


So as long as my goal doesn’t change, the predictive system will keep on working in the background, prescribing the best course of action to take me closer to my goal.


Of course, if I change my objective or goal, that is a different story. I will have to tell the system that I no longer want to retire in the French countryside when I am 68, and that I would rather just retire in California with my family when I am 65. Then the prescriptive algorithms can automatically update the courses of action to best direct me to the new goal.
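The GPS analogy can be sketched in a few lines. The toy road map and the `shortest_route` function below are hypothetical; the point is only that the system does not predict the detour, it simply re-plans from wherever the feedback data says you are now, and it re-plans again if you change the goal.

```python
from collections import deque

def shortest_route(graph, start, goal):
    """Breadth-first search: returns the shortest list of stops from
    start to goal, or None if the goal cannot be reached."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical one-way road map.
roads = {
    "home": ["A", "B"],
    "A": ["goal"],
    "B": ["C"],
    "C": ["goal"],
    "detour": ["C"],
}

planned = shortest_route(roads, "home", "goal")     # the original plan
rerouted = shortest_route(roads, "detour", "goal")  # after an unexpected event
new_goal = shortest_route(roads, "detour", "C")     # after the goal itself changes
print(planned, rerouted, new_goal)
```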


These ambient sensing systems that continuously collect data about our environment are very powerful.


Great discussion.

See you again next time.