
Big Data is Great, but What You Need is $mart Data

Lithium Alumni (Retired)

Michael Wu, Ph.D. is Lithium's Principal Scientist of Analytics, digging into the complex dynamics of social interaction and group behavior in online communities and social networks.


Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics and its application to Social CRM. He's a regular blogger on the Lithosphere's Building Community blog and previously wrote in the Analytic Science blog. You can follow him on Twitter or Google+.



A little more than a week ago, I co-presented a session with Mike Fauscette from IDC on “Big Data” at our annual Lithium Network Conference (LiNC). Mike kicked off the session by laying out a very nice framework for understanding data systems. These systems are evolving from a system of record for transactions (system of transactions) to a system of analytics for decision support (system of decisions). But we also need a system of relationships to provide the context in order to make the decisions relevant.


Then I took a swipe at big data—because big data is hyped, and it’s overrated. Although data is the enabler for many interesting analytics and it’s a fertile ground for valuable information and insights, it takes quite a bit of work to extract the information from the data and discover the insights.


As an illustration, I thought it would be nice to show you an example of what big data might look like and contrast that to what I call $mart Data. So I’ve created an infographic that illustrates the difference between the two.


On page 1, you will see the raw, unprocessed big data that I retrieved from Twitter’s search API. That is what big data looks like (or a tiny snippet of it).



(Click to see Page 1 of the hi-res Infographic)


Now, ready for the surprise? In terms of information, page 2 contains exactly the same information as page 1, only the data on page 2 has been analyzed. Everything you see on page 2 (the $mart data) is derived from the big data on page 1. I simply presented the data in a different way: more relevant and actionable. Moreover, it’s presented in an intuitive way. Once you read the headline, you can pretty quickly understand what the data shows.



(Click to see Page 2 of the hi-res Infographic)
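To make the raw-versus-analyzed contrast concrete, here is a minimal Python sketch of reducing raw tweet records to a few headline numbers. It is only an illustration, not the process used to build the infographic: the sample records are made up, and the field names `user` and `text` and the helper `summarize` are assumptions chosen for this example.

```python
from collections import Counter
import re

# Hypothetical raw records (made-up sample data, not real search results).
raw_tweets = [
    {"user": "alice", "text": "Big data panel at #LiNC was great #bigdata"},
    {"user": "bob", "text": "Loving the keynote #LiNC"},
    {"user": "alice", "text": "Gamification session next #LiNC #gamification"},
]

# Matches a hashtag and captures the word after the '#'.
HASHTAG = re.compile(r"#(\w+)")


def summarize(tweets, top_n=3):
    """Reduce raw tweet records to a small, readable summary."""
    # Count hashtags case-insensitively across all tweets.
    hashtags = Counter(
        tag.lower() for t in tweets for tag in HASHTAG.findall(t["text"])
    )
    # Count how many tweets each user posted.
    authors = Counter(t["user"] for t in tweets)
    return {
        "tweet_count": len(tweets),
        "top_hashtags": hashtags.most_common(top_n),
        "most_active_users": authors.most_common(top_n),
    }


summary = summarize(raw_tweets)
print(summary)
```

The summary contains no information that is not already in the raw records; it is simply organized around questions a reader might ask, such as which topics dominate and who is most active.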


As I described at LiNC, $mart data is useful and digestible:

  1. Useful = relevant and actionable
  2. Digestible = intuitive and interactive

Too bad, this infographic is an image, and it’s static! Ideally, you should be able to interact with $mart data and get a deeper understanding and more personal appreciation of what the data is trying to tell you.


In a later blog post, I will describe in greater detail precisely what each of those terms means. But for now, I just want to show you and hope you can get a sense of the difference between Big Data and $mart Data. Can you see the difference? Although big data is a key enabler, $mart data is really what you want at the end of the day. Because it is the $mart data (not the big data) that is going to help you make smarter decisions.


A Confession from a Scientist

Finally, I have a confession to make. I know it’s been a while since I blogged. And several really good reasons contributed to this.

1. Earlier this year, I was looped into the product and engineering organization and became much more involved with product design as well as the backend architecture for our data and analytics system. Rather than sitting quietly in the corner to think, play with data, write, etc., I often get dragged into an endless number of mind-numbing meetings, which eat up most of my week. Although the actual number of meeting hours isn’t much compared to most managers, the interruption is definitely a hindrance to the more creative part of my work.


2. I’ve just been on the road a lot. If you got a recent out-of-office auto-reply from me, you have a sense of how bad my schedule has been recently:

2012-03-07                     CSI -Search & Content Findability, Daly City, CA

2012-03-11 – 03-13        Deloitte -On Social Insight+Data, Westlake, TX

2012-03-14 – 03-16        Partner -Science of Gamification, WWW

2012-03-20 – 03-22        CSI -Swarming Reputation Model, Reston, VA

2012-03-29 – 03-30        SocialTech -Gamification for B2B, Seattle, WA


2012-04-14                     TEDxSJCA -Pay It Forward, San Jose, CA

2012-04-17 – 04-20        Rotman -Exec sCRM Lecture Series, Toronto, Canada

2012-04-23 – 04-25        SugarCon -Science of Relationship, SF, CA


2012-05-01                     VatorTV -Gamification for Entrepreneur, Berkeley, CA

2012-05-02 – 05-04        LiNC -Big Data Big Deal, SF, CA

2012-05-16                     Stanford -Mobile Health, Stanford, CA

2012-05-17                     MMSS -SoLoMoCo Context Inference, SF, CA

2012-05-20 – 05-21        Partner -Community CoCreation, Boston, MA

2012-05-30 – 06-03        IRF -Gamify Travel & Hospitality, San Antonio, TX


2012-06-04 – 06-09        Lithium -French SOS Launch, Paris, France

2012-06-17 – 06-21        e2.0 -Collaboration Experiment, Boston, MA

2012-06-20 – 06-22        MAE -Scaling Social CX, Shanghai, China

2012-06-27 – 06-30        Yumemi -Mobile Gamification, Tokyo, Japan

2012-06-30 – 07-08        PTO -Vacation

Again, this may not look so bad if you are in sales or biz dev, but I’m a scientist who works with the product and engineering organizations. And I do have responsibilities there too.

3. Lastly, due to LiNC (which took place a little more than a week ago), the Lithosphere had a face lift. Consequently, my blog disappeared and its content has been shuffled around. And we were all busy preparing for LiNC.

So, I hope you can understand why my digital footprints seem to have disappeared from social media all of a sudden. I barely have time to eat and sleep, so I hope you will forgive me for not tweeting and writing as much as I used to. I hope I can fix this problem soon. And I am going to need all the help and support I can get from all of you.


Alright, thank you and I will see you again soon with more $mart data blogs. Stay tuned!

About the Author
Dr. Michael Wu was the Chief Scientist at Lithium Technologies from 2008 until 2018, where he applied data-driven methodologies to investigate and understand the social web. Michael developed many predictive social analytics with actionable insights. His R&D work won him recognition as a 2010 Influential Leader by CRM Magazine. His insights are made accessible through “The Science of Social” and “The Science of Social 2,” two easy-reading e-books for a business audience. Prior to industry, Michael received his Ph.D. from UC Berkeley’s Biophysics program, where he also received his triple-major undergraduate degree in Applied Math, Physics, and Molecular & Cell Biology.
Chris Hutchins

Very interesting post, as is almost every one you do. Perhaps you will be showing how you produced fig. 2. I have a program that can produce the word cloud, but many of the other graphics seem to require manual filtering and processing to produce. Do you have this process automated at all?

Lithium Alumni (Retired)

Hello Chris,


Thank you for the nice comment. 


Yes, in subsequent posts I will describe more about how I analyze the data and produce the graphics. I use different tools. Depending on what the data is and what interesting things I find, I use different methods of visualization.


Stay tuned for more $mart data analytics!


David Glow

Thank you for such a smart, insightful, and approachable post. I am trying to venture into big data.  I work in corporate training and our team tends to attempt to collect volumes of data as if that is what delivers the value, but we aren't spending enough time deeply analyzing the data to provide insights to help our employees perform better. I think this post will help me clearly communicate the difference more effectively to my peers than my prior efforts.


Do you have any suggested resources for folks venturing into this area to start down the path of understanding how to go about analyzing larger volumes of data and extracting the valuable insights embedded?


Thanks for any assistance. Will be following your posts with high interest!

Not applicable
I think I'm missing the point of this article. You claim that big data is hyped and overrated, but isn't it necessary to produce the results that you are showing? Also, aren't those just infographics and fairly basic statistics? Do we really need a new term for something that is fairly ubiquitous and has been around forever? If you're trying to avoid the world of the hyped, overrated, and misunderstood, then isn't introducing some new term, $ included, contrary to that goal? You seem to be arguing against people who would say that "the great thing about big data is that it requires no analysis, one can just look at terabytes of raw data and completely understand it." But has anyone ever said that?
Lithium Alumni (Retired)

Hello David,


Thank you for the comment and glad you find it useful.


It is true that many companies are still in data collection mode and are not really investing in the analysis. But that will change as more and more data is collected. If you invested $10M in collecting the data and found that you didn't have anyone to analyze it, wouldn't you invest a little more to hire the right people to analyze the data? This is one of those behavioral economics phenomena, or predictably irrational behaviors, that people keep repeating.


I just hope more companies invest in the analysis sooner. Big data is great and it is a critical enabler, but it is not the ultimate solution.


In terms of resources, one of the reasons I've decided to write this mini-series is that I feel there aren't enough good resources out there on the analytics of big data. Most of the big data material you find is centered around the technology.


Thank you for your interest though. More posts are coming in this analytics science mini-series.

 OK, I hope to see you next time.

Lithium Alumni (Retired)

Hello Jordan,


Thank you for commenting and challenging my thesis.


I think something was missing and did not come across in such a short blog post. It is true that all the statistics and the infographic I produced on page 2 are derived from the tiny fraction of the big data I've collected. And I said that in my blog post too.


There is no doubt that big data is important. However, the hype is that many less technically savvy individuals in the business world believe that big data is the solution that will help them make better decisions. As you said, even though no one said "the great thing about big data is that it requires no analysis, one can just look at terabytes of raw data and completely understand it," lots of businesses believe it is the final solution. You and I, and data-savvy people, are the outliers rather than the norm.


The point that I want to make is very simple: Big data is a critical enabler and it is very important, but it is not the final solution with respect to the decision support needs of an organization.


The $mart data is a by-product of the LiNC conference. Since the infographic is about the conference, I kept it that way.


In later posts, I will describe what I mean by Smart Data. It isn't an infographic; that is not the point. That's why I call the infographic an infographic. Nothing more. But in short, Smart Data is a set of design principles that we use to design our analytics product. It involves how we apply the analytics, how the user accesses it, and how the result is presented. It is not data per se, but a design principle around data. Because traditional business intelligence (BI) is designed for business analysts, you can think of Smart Data as an extension of BI. The goal is that it should be intuitive enough that even a non-analyst can access, process, explore, and understand complex social data (like an infographic). But it is not an infographic.


Alright, I hope I've addressed your concerns.

Otherwise, I'm happy to discuss further here.


See you next time.

Not applicable
Very interesting post. Which tool did you use to create the infographic?
Lithium Alumni (Retired)

Hello tomer,


Thanks for the comment and glad you find this post interesting.


The calculations are all done in R and Matlab. And all the graphs are drawn in Microsoft Excel. Then I created this infographic with Adobe Illustrator CS5 pretty much from scratch. There is really no shortcut or magic to creating great infographics. You really need professional tools like Illustrator and some interesting information to show.


In a subsequent post, I will describe how I followed the data and came up with this infographic.

I hope to see you next time. Have a great weekend.