

What Does Lithium + Klout Mean to Me? (Part 1: Today)

Lithium Alumni (Retired)

First of all, to those who pinged me privately about this deal, now you know why I didn’t respond. Although I was eager to talk about it, I was under NDA. I am sure you understand. But now that it’s public knowledge, I’m excited to tell you how I feel about it. Today, I will focus on how I feel about this acquisition in general. In my follow-up post, I’ll give you my thoughts on the possibilities when you combine Lithium + Klout.


As you can imagine, I have strong feelings about this acquisition, because I am deeply passionate about the subject of consumer influence on social media. I’ve researched and written much about this subject in an attempt to advance our understanding and move the industry forward. So the Lithium + Klout deal is really exciting for me. Not only will I contribute more directly to advancing the science of influence, I also get to delve into a boatload of new consumer behavior data.


Klout has over 500 million user profiles across the most popular social platforms (e.g. Twitter, Facebook, and Google+). They’ve collected all kinds of user actions and interactions, from retweets, favorites, +1s, comments, and replies to Bing search queries. To a data scientist, this is not just your ordinary data playground; it’s a data Disneyland. With this rich data set, we can answer more fundamental questions and understand consumer behaviors at a much deeper and more meaningful level, beyond just how to monetize them.


So how do I feel? I’m very excited.


A Data Scientist’s Perspective of the Deal

Beyond my personal enthusiasm, I feel that this is a smart move for Lithium as well. I can give you 3 good reasons, seen through the narrow spectacles of a data scientist.

1. Klout turbocharges our big data infrastructure


Lithium has invested heavily in building our own big data infrastructure, but we are traditionally an enterprise social platform. That means our technology infrastructure needs to handle many things beyond data processing: real-time interactivity, security compliance, performance, reliability, etc. Despite our sizable big data investment, it’s been an incremental effort (i.e. like a ramp) rather than a big bang approach (i.e. like a step function).


Klout, on the other hand, is purely a data company. Their entire business and product offering is built on the processing of consumer social data. Their infrastructure is designed and built for the sole purpose of capturing, ingesting, processing, storing, and retrieving consumer behavior data. At the scale of 500M users, even if each user interacts only 10 times a day on average across the social platforms Klout tracks (an overly conservative estimate), that is 500M × 10 = 5B interactions per day. In fact, they are processing about 15B interactions daily. So, in terms of raw data processing capability, I must say that they are ahead of us, and I’m very impressed with their big data infrastructure.


This augments our data processing capability and allows us to leapfrog our incremental approach to building our big data infrastructure.


2. Klout boosts our data science talent


As I’ve written in an AdMap article, the most expensive part of any big data initiative is the talent (the human resources) required to perform the analysis and machine learning on your big data. This is partly due to the current shortage of data science talent in the job market. If you’ve been trying to hire data scientists, you’ll understand how fierce the competition is. This is exacerbated by the fact that Lithium is a B2B white-label brand that many new graduates won’t know about. Every so often I still get chemistry or biology students coming to our college recruiting events thinking we do something entirely different!


With the acquisition, we not only get Klout’s state-of-the-art data platform, we also get the talented people who designed and built it. This multiplies our data science talent severalfold. Now I have more peers with whom I can discuss how to use boosting, bagging, or random forests to increase prediction accuracy, how to use Markov chain Monte Carlo methods to estimate the integral of people’s behavior distribution, etc. So with the addition of Klout, we kill two birds with one stone: we get the infrastructure and the talent.


3. Klout is a consumer brand that people care about


Despite the controversies about Klout as a standard for influence, it is an algorithm that tries to score consumers’ social interaction, i.e. their social capital. Although the algorithm has a lot of room for improvement in my opinion, it is a decent first attempt at its scale. However, an accurate influence scoring algorithm must be adaptive anyway. So even if Klout had the perfect algorithm now, it would need to change in the future in order to adapt to consumers’ behaviors. So having a less-than-perfect algorithm is not really a concern for me. What’s important to me is that they are able to change their algorithms and re-process their data quickly, and their big data infrastructure certainly provides this capability. It would’ve been more alarming to me if they had the perfect algorithm today, but it was completely hard-coded with little flexibility to change, adapt, and re-process historical data.


Still, the most important thing is that consumers are using Klout. Regardless of the controversy over the influence score, it’s a good-enough first attempt for consumers. In addition to the 500M profiles Klout has, they are adding ~450,000 new profiles every day. Why would consumers adopt this service? First, it’s simple, and simplicity drives adoption. There is virtually no additional work other than signing up and opting in the networks you want Klout to track. Second, it’s personally relevant, because it’s a score about you, whether it truly represents your influence or not. Consequently, people care about it, for curiosity, vanity, or other reasons. Even those who recognize the score is incomplete will look at it occasionally. Yet most of them won’t bother to opt out because it doesn’t cost anything.


This is critically important, because it means that Klout will continue to get more and more data about consumers. After all, what good is having the big data infrastructure and talent if there is no new data to store, process, and analyze? Without the consumers who are generating the data, we will never be able to extract any reliable information or uncover any valuable insights that help brands better connect with their customers.


The magic is the combination of all 3: the data, the infrastructure, and the talent. When these 3 ingredients are combined, you can create an engine that turns consumer data into consumer insight, which is what brands want. Add those to our team at Lithium, and the opportunities are endless.



I couldn’t be more pumped about Klout joining Lithium, but not because of its score. Many people think of Klout as only a score (or an algorithm), but that’s just the final product. And when the final product isn’t good enough, they naturally conclude that nothing is, because the final product is all they see.


But if you are a data scientist, like me, you’ll understand that behind that number in an orange box, there are many ingredients that enabled the production of that final number. And to me, 3 of the most crucial enablers are:

  1. The consumers who are willing to opt in their data
  2. An agile, flexible, and scalable big data infrastructure (both hardware and software)
  3. The data science talent who can use the infrastructure to derive insights from consumer data


Although the final product (the score) isn’t perfect, Klout has all 3 of these critical enablers. This means we can still improve the final product, and we will. But what’s more exciting to me is that we can actually leverage these enablers creatively to build completely new data products altogether.


That’s my two cents for today. Stay tuned for the next post, where I will elaborate on some of the new opportunities with the combined expertise of Lithium + Klout. Thanks for reading.



Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.


Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.

About the Author
Dr. Michael Wu was the Chief Scientist at Lithium Technologies from 2008 until 2018, where he applied data-driven methodologies to investigate and understand the social web. Michael developed many predictive social analytics with actionable insights. His R&D work won him recognition as a 2010 Influential Leader by CRM Magazine. His insights are made accessible through “The Science of Social” and “The Science of Social 2,” two easy-reading e-books for a business audience. Prior to industry, Michael received his Ph.D. from UC Berkeley’s Biophysics program, where he also received his triple major undergraduate degree in Applied Math, Physics, and Molecular & Cell Biology.
Khoros Staff

A very thoughtful post here, Michael.  Hope we can connect soon in person to talk more.  Let's get you back out to the Austin office!

Frequent Advisor

I am very excited about this and I look forward to seeing how Lithium evolves with the acquisition of another great company!

Lithium Alumni (Retired)

Hello Bryan,


Thank you for the nice comment. 

I'm sure I will have some opportunities to visit Austin again soon.


Looking forward to the discussion.


Lithium Alumni (Retired)

Hello cs1991,


Thx for commenting.


Yeah, I'm sure we will all evolve towards something great. We may not get there tomorrow, but we will head in the right direction. And we will need all your help and support to get there.


Lithium Alumni (Retired)

The data we will be able to provide is the holy grail of modern day consumer oriented businesses. Hell, even b2b. This deal just locked us into being one of the largest, most influential companies in the world. I am surprised competitors didn't try harder to wrangle it from our hands. Maybe they just didn't see it. Maybe we caught them by surprise. Maybe both. But very very savvy decision on the part of everyone involved.

I can't believe what just happened... we just gained the best position of anyone in our industry. 

Lithium Alumni (Retired)

Hello Justin,


Thx for the comment.


Yeah, it's exciting, isn't it? That is the ingenuity of our senior leadership team, and many months of hard work by the due diligence team. I actually didn't do much. They deserve all the credit.


New Commentator

Hummm interesting how one's perspective can change to make things convenient for themselves.


Klout is officially the Kardashian sister of social influence startups. Without showing any demonstrable talent, the company has been acquired by Lithium Technologies for a cash and stock deal valued at almost $200 million, reports Fortune.

Even Lithium's own chief scientist, Dr. Michael Wu, expressed doubts about Klout and its ilk, as Buzzfeed reported last September:


But for companies like Klout, the window to become the trusted industry standard might be narrow. "I don't think these companies have enough computing infrastructure in place," Wu said. "But as these companies get bigger and the demand for these metrics grows, they'll need to add it quickly. If they don't, people will start to believe that influence measurement is meaningless and there will be less money coming their way."

Lithium, which provides "social customer experience solutions" for businesses, expects to IPO soon because it's startups all the way down, folks. The industry that redefined failure as a metric of honor is soft landing itself into a handful of corporations.

Lithium Alumni (Retired)

Interesting perspective @thehackjob - I don't think @MikeW has changed his opinion to suit anyone.  This blog recognizes the value but also that things can still be improved.  Stay tuned for Part 2 if you want to know more on what the future holds.

Occasional Commentator

Hi Michael. 


I have to admit, I have been quite vocal in my dislike and mistrust of Klout. The trouble for me is that it is taken far too seriously in the industries I occupy (gamification, social and technology). I started to lose all faith when they had the big algorithm change and people's scores all dramatically changed. At that point I began to wonder how you can put any weight on a number that can be changed in a way you have no control of. When you hear of things like customer service and even CVs being screened based on Klout score, and that score is not stable or in your control - it is hard to trust.


So, my question is - how does joining up with Lithium help the consumer? You now have access to more data than most, how will you now use that in a way that benefits me?


I am happy that Lithium are involved, I am a big fan of your work and insights into influence, but the concern is still - what's in it for me?

Lithium Alumni (Retired)

Hello @thehackjob,


First, thank you for taking the time to comment here. And thank you @DayleH for the defense as well.


I remember those quotes very clearly. I was in New York, and I had a great phone interview with Charlie Warzel from BuzzFeed. Please look at the original report here and don’t take words out of context.


When I talked to Charlie, I was raising a problem that most influence algorithms face today: the fact that most of them can be gamed easily. I talk about this in my own blog post, too (The Influence Irony – Influence Engine Optimization). However, I also suggested a fix to this problem in the following post (Adaptive Influence Model: Fixing the Influence Irony), hoping that the influence vendors (Klout included) would adopt this method. It was in this context that I surfaced my concerns about most existing influence vendors, because building a system that is bulletproof against gaming is nearly impossible. It is very computationally intensive, and building something close to such an adaptive influence algorithm would require computing infrastructure comparable to that of Google. But that is not what we are trying to do with Klout’s big data infrastructure.


The Klout score certainly isn’t perfect, even as of today, and I’m well aware of that. I haven’t changed my perspective one bit on Klout’s score being the standard of influence. IMHO it isn’t, and it (as well as other influence vendors’ scores) is still suffering from many of the problems I pointed out. But that doesn’t mean it can’t improve. More importantly, that doesn’t mean the machinery behind the scenes (i.e. the big data infrastructure), the talent, and the data they are continuing to collect are useless for what we are trying to do. Besides, we haven’t even talked about how we plan to use Klout’s big data infrastructure yet. That will come next week in Part 2 of this post, so please be patient.  😉


Every company has its strengths and weaknesses, just like people do. Saying Klout’s score isn’t perfect and therefore the company is worth nothing is as narrow-minded as concluding a person is worthless because he didn’t get straight A’s in school. So let’s not be prejudiced, and let’s embrace the future with an open mind.


Lithium Alumni (Retired)

Hello @daverage 


Thank you for commenting on my blog. I appreciate your honesty. And it is OK to be vocal about your dislike and distrust of anything. On social, and especially here on my blog, everyone is encouraged to voice their opinion. That is the whole point of being social. So here is my attempt to address your questions and concerns.


I agree that the Klout score has probably been taken too seriously in the industry. However, I don’t think that is a problem unique to Klout. Any startup in any industry will try very hard to get everyone to take them seriously. Lithium did, too, and so did many other startups in gamification, big data, mobile, etc. That is a necessary step to establish a business. The fact that people take them seriously is a sign that their marketing succeeded, and I have to give them credit for that.


However, the fact that the industry is taking a score (as well as other new technologies such as gamification and big data) too seriously is a failure of the industry to learn, comprehend, and understand what these new technologies really mean to their business. Instead of looking under the hood and really understanding how influence (as well as gamification, big data, etc.) actually works, too many decisions in the industry are made based on marketing materials, competitive pressure, simply following the trend, and a lot of hype. This is precisely why I want to write my blogs on these topics: influencers, gamification, big data, etc. Someone has to provide a more objective view on what these new technologies really mean to business and how they really work under the hood.


Now, back to the topic of influence. First, I would recommend that you take a look at a post on some of the fundamental concepts of influence: What is Influence, Really? – No Carrot, No Stick, No Annoyance, No Trick. In that post, we cited one definition of "influence" from the Webster’s dictionary: "the power or capacity of causing an effect in indirect or intangible ways : sway." But what do "indirect" and "intangible" mean? This is actually very important, because it means that influence is something that people can affect, but not directly or easily.


If people can affect it directly and easily, then people will game it, leading to what I call the influence irony. If people cannot affect it at all, then there is no point in showing people a score they cannot change. So a good influence algorithm must find that narrow balance that reflects people’s true capacity to influence others.


This is easier said than done. So occasional algorithmic changes are inevitable. If you actually write code, you’ll know that no one gets a complex algorithm perfect in 1 shot. It usually takes an iterative process over many years to perfect and fine-tune such an algorithm. And these are complex algorithms that probably have millions of lines of code.


The problem is, of course, that it impacts the final score, which consumers see, sometimes quite dramatically. But there are ways to overcome the sense of having no control over one’s score. One simple way is to re-compute one’s historical data with the new algorithm and show the score under both the old algorithm and the new algorithm, so people can get a sense of which changes are due to the algorithm and which changes are really a reflection of their behavior. I must say that this is not a new problem. Financial and business intelligence software encountered it long ago, and they have developed best practices for dealing with it. What I’ve described is just one of the simplest ways to address this problem.
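
As a minimal sketch of the side-by-side idea, assuming two made-up scoring functions: the formulas, weights, and field names below are hypothetical stand-ins and not Klout's actual algorithm. The only point is that re-scoring the same history under both versions separates changes caused by the algorithm from changes caused by behavior.

```python
# Hypothetical weekly activity history for one user (made-up numbers).
history = [
    {"week": 1, "retweets": 10, "replies": 2},
    {"week": 2, "retweets": 10, "replies": 5},
]


def score_old(activity):
    """Hypothetical old algorithm: every interaction counts equally."""
    return activity["retweets"] + activity["replies"]


def score_new(activity):
    """Hypothetical new algorithm: replies are weighted 3x."""
    return activity["retweets"] + 3 * activity["replies"]


def side_by_side(history):
    """Re-score the full history under both algorithms."""
    return [
        {"week": a["week"], "old": score_old(a), "new": score_new(a)}
        for a in history
    ]


rows = side_by_side(history)

# Change caused by the algorithm: same week, same data, different formula.
algorithm_effect = rows[0]["new"] - rows[0]["old"]
# Change caused by behavior: same formula, different weeks.
behavior_effect = rows[1]["new"] - rows[0]["new"]

for row in rows:
    print(row)
print("algorithm effect in week 1:", algorithm_effect)
print("behavior effect under the new algorithm:", behavior_effect)
```

Showing both columns for every past week lets a user see that a jump on the day of the switch comes from the formula, while movement within either column comes from what they actually did.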


To answer your final question: what’s in it for the consumers? The answer is that Lithium provides the context.


Simply looking at a score (e.g. I see that your K=61, and my K=57) offers no context for how these numbers should be interpreted or compared. In fact, it gives the false sense that these numbers are directly comparable. Do those scores mean that you are more influential than me? Maybe, maybe not. The reality is that you are probably more influential than me in some contexts (some domain of expertise, certain groups of people, certain geographical locations, etc.), but not in others. These scores are not really comparable at all.


What does Lithium’s community provide? The context you need to interpret the score. You know well that gamification has a long history of usage in communities. When you give people a badge or when someone achieves a rank, that is not so different from getting a score. But community members have the context to interpret these ranks and badges, because they are part of it. In fact, this gamification feedback creates a sense of trust in the community, guiding new members to trusted content and other reputable members of the community. With more social signals that are contextually specific, we can certainly improve the score. Moreover, we can even create a contextual influence score, which is probably better described as a reputation score. That is the simple answer.


OK, this reply is getting long. So I’ll stop here. In Part 2 of this post, I will talk more about our vision, which is to create a 2-sided platform that evens the playing field between consumers and brands, much like those in the sharing economy (e.g. Airbnb, Uber, Sidecar). So more concrete answers to your question about what’s in it for the consumer are coming in Part 2. Stay tuned…


Occasional Commentator
Thank you @MikeW that may be the most complete answer to anything ever! I also appreciate the irony of a gamification person such as myself (not to mention a developer / coder) being so distrustful of this! What you have said gives me enough to now sit and watch patiently as things unfold. Good luck, it should be interesting seeing what you can all produce together.
Lithium Alumni (Retired)

Hello @daverage 


Thank you for the vote of confidence.


I will definitely do my best to make something better for both the consumers and the brands out of what we've acquired. It's not going to happen tomorrow, as we have much work to do ahead of us. But I'm looking forward to the challenge. So thank you for being patient with us.


Thank you for the conversation here.

Look forward to seeing you again on Lithosphere.


Lithium Alumni (Retired)

@thehackjob  Being a flavor of the month is not a factor here. It's about data and it's about the algos painfully engineered to derive insights from that data. The more algos you have/the better algos you have the more you master the data. Consumer/customer data... these days it's really big and broad and mostly online. How do we capture that data and present it in a way that makes the biggest positive impact for a business or organization? Now we can do that even better. That's the bottom line. Some may not agree. But that's ok.

Occasional Commentator

Thanks a lot @MikeW  for the article, it is very interesting. I just have a request: can you give me more detail about your definition of "integral of people’s behavior distribution"?

Lithium Alumni (Retired)

Hello @medhi


Thank you for asking. I'm sure a lot of people will have similar questions. But that was intended to be technical and mathematical, because machine learning folks virtually speak a language of their own, just as mathematicians, statisticians, etc. do. So be warned that this is rather complex, and there is no simple explanation. But I’m going to try and give it my best attempt to see if it clarifies things.


First I will use a simple example where people can only take 2 actions. These may be viewing a message and posting a message. So the level at which a person views and posts can be denoted as a point in a 2-dim space (x-y axes). That is for 1 person. Different people will have different levels of viewing and posting behavior, which means their viewing and posting behavior will be represented by different dots on this plot (viewing level vs. posting level).


So when you have a lot of people (a population) you will have many dots representing all the different viewing and posting behavior of the population under observation. If you actually have data on how much people view and how much they post and plot them as described above, you will find that these dots are not distributed evenly. There are regions that are more dense, which means more people exhibit certain types of viewing/posting behavior. And there are also regions that are less dense, meaning only a few people exhibit those particular levels of viewing/posting behavior. This creates a 2-dim distribution (or bivariate distribution, or joint distribution) of people’s viewing and posting behavior.


So the integral of this distribution, depending on the boundary conditions (i.e. the limits of integration), is just the volume under this 2-dim distribution. If you integrate over the entire 2-dim plane, the integral will always be 1, because it is a probability distribution. However, usually we will need to integrate over some region or over only 1 of the variables, and these are specified by the boundary conditions. The integral over people’s behavior distribution is useful in estimating the probability of people taking a certain action or exhibiting a certain behavior (e.g. likelihood to buy, probability of referral, etc.).
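
To make this concrete, here is a minimal sketch using synthetic data. Everything in it is an assumption for illustration: the Gaussian viewing levels, the posting level that rises with viewing, and the region thresholds are all made up. It uses plain Monte Carlo counting (the fraction of sampled dots that fall inside a region), which is a simpler relative of the Markov chain Monte Carlo methods mentioned in the post.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

INF = float("inf")


def sample_person():
    """Draw one hypothetical (viewing level, posting level) pair.

    Posting is made to rise with viewing, so the two are correlated.
    """
    views = random.gauss(50, 15)
    posts = 0.1 * views + random.gauss(0, 2)
    return views, posts


people = [sample_person() for _ in range(100_000)]


def prob_in_region(points, view_range, post_range):
    """Fraction of dots inside a rectangle of the viewing/posting plane.

    This is a Monte Carlo estimate of the integral of the joint
    distribution over that rectangle.
    """
    (v_lo, v_hi), (p_lo, p_hi) = view_range, post_range
    inside = sum(
        1 for v, p in points if v_lo <= v <= v_hi and p_lo <= p <= p_hi
    )
    return inside / len(points)


# Integrating over the entire plane gives 1.
whole_plane = prob_in_region(people, (-INF, INF), (-INF, INF))

# Integrating over a bounded region: heavy viewers who are also heavy posters.
heavy_users = prob_in_region(people, (60, INF), (6, INF))

# Leaving posting unbounded integrates it out: probability of heavy viewing alone.
views_only = prob_in_region(people, (60, INF), (-INF, INF))

print(whole_plane, heavy_users, views_only)
```

The rectangle plays the role of the boundary conditions: changing its limits changes which probability you are estimating, while the dots themselves stay the same.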


If you are getting this so far, that is great. I hope I didn’t lose too many people.


The above case is overly simplistic because we assumed people can only exhibit 2 different behaviors. In the more general case, people can take many different actions: hundreds or thousands of them. And in the physical world, the range of people’s behavior is virtually infinite. So instead of having a distribution on a 2-dim plane, you have a probability distribution in a high-dimensional (possibly infinite-dimensional) space. We cannot visualize spaces with more than 3 dimensions, but they are just vector spaces, and they can have any number of dimensions: 4, 5, 100, 5000, 458693, or even infinitely many.


So the definition you wanted, the integral of people’s behavior distribution, is just the hyper-volume under the distribution in these spaces, bounded by the boundary conditions of the integral. FYI, integral here is the integral in calculus.


So there you go. I hope this is not making it worse. But if you are mathematically/statistically savvy, you will understand what I’m talking about. And if your work is in machine learning, you can be talking about these all day, every day.


OK, thanks again for asking for the detail. I'm glad someone asked, b/c I love talking about this. I just hope I didn't lose too many readers with this answer.


Hope to see you again next time.


Occasional Commentator

Thanks a lot, that was a very clear explanation of "integral of people behavior distribution" and I just love when things are explained by the math.


To go one step further, I have one remark. In your example, if we consider that P(viewing) is the probability of viewing and P(posting) is the probability of posting, you assumed that P(viewing) and P(posting) are independent variables. But what if they are not?

Lithium Alumni (Retired)

Hello @medhi


Thank you for the comment and for asking another great question. I'm glad to hear that my explanation is clear to you. I also love it when things are explained in math and stats. I just hope that more people would appreciate that.


Ok now, to answer your question about the independence of P(viewing) and P(posting). They are actually NOT independent. In fact, in most cases they are highly correlated. That is why we have to model the joint distribution in the 2-dim plane (or the joint distribution in a high-dim hyper-volume). If they were independent, life would've been much simpler, b/c the joint probability factors: you can just model each variable independently and model the joint distribution as the product. But that is NOT the case with most social data. Everything is highly correlated. This makes modeling people's behavior distribution very difficult and requires a lot more data, but it's also what makes it fun.
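
A small numeric sketch of what "the joint does not factor" looks like. The table of counts is made up for illustration, and the labels `high_view`, `low_view`, `high_post`, and `low_post` are hypothetical bins, not anything from real Lithium or Klout data.

```python
from fractions import Fraction

# Hypothetical counts of 1,000 users binned by viewing and posting level.
counts = {
    ("high_view", "high_post"): 300,
    ("high_view", "low_post"): 200,
    ("low_view", "high_post"): 50,
    ("low_view", "low_post"): 450,
}
total = sum(counts.values())

# Empirical joint distribution over the four cells.
joint = {cell: Fraction(n, total) for cell, n in counts.items()}


def marginal(joint, axis, value):
    """Sum the joint over the other variable to get a marginal probability."""
    return sum(p for cell, p in joint.items() if cell[axis] == value)


p_high_view = marginal(joint, 0, "high_view")
p_high_post = marginal(joint, 1, "high_post")

actual = joint[("high_view", "high_post")]
if_independent = p_high_view * p_high_post

print("actual joint probability:   ", actual)
print("product of the marginals:   ", if_independent)
print("variables look independent: ", actual == if_independent)
```

In this made-up table the actual joint probability is larger than the product of the marginals, which is what correlation between viewing and posting looks like: modeling each variable on its own would underestimate how often heavy viewing and heavy posting occur together.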


Alright, I hope this addresses your question. 


You obviously have some stats background. I'm curious what you do, as I always like to engage with like-minded people.


Again, thx for asking the tough questions. I hope to see you in future discussions.


Occasional Commentator

Thanks a lot for your quick reply, that was fast!

I have #Math background and I studied statistics in my engineering school. Now I am in charge of Digital e-Influence and buzz monitoring at Orange. So hypervolume, infinite dimensions and eigenvectors are familiar to me.


So if P(viewing) and P(posting) are not independent, as far as I can remember my statistics lessons, we have to address this problem from a Bayesian perspective, right?

Lithium Alumni (Retired)

Hello @medhi,


Thx for continuing the discussion.


I always respond to comments on my blog, although not this fast. I'm actually traveling now, so I guess my time zone actually helped in this case... I must have just been checking my blog when you posted. Usually, I like to respond within the week, b/c I will have the weekend to catch up.  😉


Glad to hear that you are making great use of your math and stats background.


Now to your question...


When the variables are not independent, you just need to model the full joint distribution all at once. The Bayesian perspective has to do with the conditional distribution, which is different from the base distribution. If you recall, Bayes' Theorem basically follows from the fact that the joint distribution is symmetric under variable exchange:


   P(post, view) = P(view, post)


But we have the factoring rules for non-independent distributions. That is the joint distribution = the conditional distribution x the marginal distribution:


   P(post, view) = P(post|view) x P(view)

   = P(view, post) = P(view|post) x P(post)


So if you rearrange the terms, you get Bayes Theorem, which lets you estimate one of the conditional from the other.


   P(post|view) = P(view|post) x P(post) / P(view)


Note that the factoring rule for non-independent distributions is true regardless of whether post and view are independent or not. If they are indeed independent, then P(post|view) = P(post), and likewise P(view|post) = P(view), so we get back the factoring rule for when the distributions ARE independent, where the joint is just the product of the marginals:


   P(post, view) = P(post) x P(view)
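
The identities above can be checked numerically. This is a minimal sketch with made-up counts (the four numbers are assumptions for illustration only); exact fractions are used so the two sides can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical counts of users by whether they viewed and/or posted.
n_view_and_post = 120
n_view_only = 480
n_post_only = 30
n_neither = 370
n_total = n_view_and_post + n_view_only + n_post_only + n_neither

# Joint and marginal probabilities.
p_post_and_view = Fraction(n_view_and_post, n_total)
p_view = Fraction(n_view_and_post + n_view_only, n_total)
p_post = Fraction(n_view_and_post + n_post_only, n_total)

# Conditionals from the factoring rule: joint = conditional x marginal.
p_post_given_view = p_post_and_view / p_view
p_view_given_post = p_post_and_view / p_post

# Bayes' Theorem recovers one conditional from the other.
bayes_estimate = p_view_given_post * p_post / p_view

print("P(post|view) directly: ", p_post_given_view)
print("P(post|view) via Bayes:", bayes_estimate)
print("joint equals product of marginals:", p_post_and_view == p_post * p_view)
```

With these counts the two routes to P(post|view) agree, while the joint does not equal the product of the marginals. That is the case where the factoring rule with a conditional still holds but the simpler independent form does not.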


So, to answer your question: you DON'T actually have to address the problem from a Bayesian perspective. There are many frequentist methods that allow you to do density estimation of joint distributions where the variables are not independent. But Bayesian methods are definitely very popular nowadays, b/c they are actually simpler than frequentist methods when you have a lot of data.


Alright, hope this helps...


See you again next time.


Khoros Guru

I really enjoy seeing and reading those strong USA-France mathematical ties, @MikeW and @medhi!!!

Looking forward to continuing this discussion with you two at LiNC'14! And by the way, I wouldn't mind if it's in a less purely mathematical way then!

Lithium Alumni (Retired)

Hello Arnaud,


Any discussion is welcome, no matter how purely mathematical it may be.


See you soon at LiNC!