I’m back from a restful Thanksgiving, recharged, caught up on sleep and ready for the next installment on CHI. For those of you in the US, I hope you also had a restful Thanksgiving!
In the past month, I’ve been writing about the new Community Health Index (CHI). So let’s wrap up by looking at some data today. To start, let me give you a concise bullet-point summary to recap what we’ve already discussed:
(A). Infrastructure changes:
- A1. Built on modern big data technologies
- A2. Based on event-log data with rich contextual metadata
- A3. Bot traffic is filtered out
(B). Algorithmic changes:
- B1. More sensitive and more responsive
- B2. Includes all parts of the community
- B3. The raw health factors are normalized to quantile scores (a.k.a. health scores)
- B4. The health scores are combined via a generalized mean
(C). Upcoming features enabled by the new infrastructure and algorithm:
- C1. Daily CHI
- C2. Adaptive CHI
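The quantile-normalization and generalized-mean steps above can be sketched in a few lines of Python. This is an illustrative sketch with hypothetical function names, not the production implementation, and the exponent p of the generalized mean is a free parameter whose actual value the posts don't disclose:

```python
def quantile_score(raw_value, population):
    """Fraction of communities whose raw factor value is <= raw_value.
    Maps any raw health factor onto a comparable [0, 1] health score.
    (Illustrative sketch; the production definition may differ.)"""
    return sum(1 for v in population if v <= raw_value) / len(population)

def generalized_mean(scores, p):
    """Combine health scores via a generalized (power) mean.
    p = 1 gives the arithmetic mean; smaller p penalizes weak factors
    more heavily. p here is a hypothetical free parameter."""
    return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)
```

For example, `generalized_mean([0.2, 0.8], 1)` is simply the arithmetic mean 0.5, while choosing a smaller `p` pulls the combined score toward the weaker factor.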
If you’ve missed the earlier posts or you just want the details, don’t worry. They are all accessible right here:
A Mathematical Challenge
With the new infrastructure (A1-A3) and the improved algorithms (B1-B4), there is no doubt that the final CHI score will change too. There are hundreds of intermediate calculations (possibly more) between a single user's action within the community and the final CHI score, and a change in any one of those intermediate steps could change the final CHI score.
However, we knew that a significant change in the CHI score could disrupt our customers' community operations. Many of our clients use their community in a mission-critical way. Some even use CHI as a KPI to determine how their community teams are rewarded. So a drastic change in CHI could go as far as ruining someone's hard-earned bonus or raising serious questions from their management.
Consequently, part of the design criteria for the new CHI algorithm is to minimize the statistical deviation from the old CHI. This is accomplished in the last step, where we apply a linear function to scale and shift the combined health score (i.e. the generalized mean) to a number between 0 and 1000. Since the old CHI score is already scaled between 0 and 1000, minimizing the deviation of the new CHI score from the old automatically scales the new algorithm to the desired range.
Although this insight allowed us to kill two birds with one stone, it's not as trivial as it sounds. We still needed to incorporate all the new features, involving hundreds of computational steps across 500+ communities and thousands of variables, and any one of those steps or variables could cause a significant deviation from the old CHI score. Minimizing the deviation from the old CHI score while simultaneously satisfying all the design requirements is equivalent to a massive constrained optimization problem involving thousands of variables. We solved it by minimizing the mean squared error (MSE) between the old and the new CHI scores. The result is manifested in the scale and shift parameters we chose for the linear function that converts the combined health score into the final CHI score.
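For the final step alone, minimizing the MSE over the scale and shift parameters has a closed-form ordinary-least-squares solution, since the step is linear in both parameters. A minimal sketch under that assumption (the full optimization across all the intermediate steps is, of course, much harder; variable names are mine, not the production code's):

```python
def fit_scale_shift(combined, old_chi):
    """Find a, b minimizing sum((a * x + b - y)^2) over paired scores,
    i.e. ordinary least squares with one predictor. `combined` holds the
    combined health scores, `old_chi` the old CHI scores."""
    n = len(combined)
    mean_x = sum(combined) / n
    mean_y = sum(old_chi) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(combined, old_chi))
    var = sum((x - mean_x) ** 2 for x in combined)
    a = cov / var          # scale
    b = mean_y - a * mean_x  # shift
    return a, b
```

Fitting `a` and `b` this way pins the new scores to the old 0-1000 scale while minimizing the squared deviation from the old CHI.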
Old CHI Score vs. New CHI Score
Although mathematical optimization gives us the solution that minimizes our loss function (in this case the MSE between the old and the new CHI), we still need to examine that solution to see whether it's acceptable. It is often the case that the minimum-MSE solution is still not good enough. In those cases, we would have to make tradeoffs: relax some of the constraints, modify the design criteria, change the loss function, and iterate. We were lucky, because we didn't have to iterate many times before arriving at a satisfactory solution. Finally, we hand-tuned the parameters slightly to give the new CHI score just a little positive boost.
So how did the new CHI score change, and what was the deviation?
The figure to the right shows a plot of the average new CHI score against the average old CHI score for a subset of communities where the old CHI is available. We compare the average CHI scores because the new CHI is much more volatile than the old (see the discussion below). Averaging removes the noisy variation due to the increased sensitivity, which is part of the design criteria rather than a model failure.
As you can see, the new CHI score matches the old quite closely, scattering around the line of unity (i.e. the blue dashed line, where the new CHI score would match the old exactly). Each red dot represents a community whose old CHI score is available for comparison. Recall from part 1 that our new data infrastructure (A1) enabled us to compute the new CHI score for many more communities than was previously possible, so this chart only shows a subset of our communities. The mean absolute deviation (MAD) is 45.61 and the root mean squared error (RMSE) is 59.82 (represented by the two sets of green dashed lines, respectively). That means most communities can expect to see a baseline change of ~50 points from their old CHI score (note: this is only the average baseline change, not taking the volatility into account). If you squint, you can probably notice that there are more red dots above the blue dashed line than below it. That means more communities are getting a higher CHI score under the new algorithm than under the old. This is a result of the hand-tuning I mentioned earlier.
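Given the per-community average old and new scores, the two deviation statistics are straightforward to compute. A sketch with made-up inputs (I report the squared-error deviation as its square root so both numbers are in CHI points):

```python
def mad(old, new):
    """Mean absolute deviation between paired old and new average scores."""
    return sum(abs(a - b) for a, b in zip(old, new)) / len(old)

def rms_error(old, new):
    """Root of the mean squared deviation, in the same units (CHI points)."""
    return (sum((a - b) ** 2 for a, b in zip(old, new)) / len(old)) ** 0.5
```

The RMS deviation is always at least as large as the MAD, because squaring weights the larger deviations more heavily.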
Increased Sensitivity = Increased Volatility
As part of the design criteria, we've made the new CHI more sensitive (B1) by removing the smoothing step and the history dependence. This means the new CHI will also be more volatile. Although sensitivity is often viewed as a feature while volatility is often viewed as a bug, they are really two sides of the same coin. Just as you can't pick up a stick without picking up both of its ends, you can't increase the sensitivity of an algorithm without also increasing its volatility. So how volatile is the new CHI algorithm?
The figure to the left shows the histogram of week-to-week standard deviations of the new CHI scores. Clearly, the bulk of the distribution is below 150, with an average of 62.72 and a median of 51.56. However, a weekly change of 300 points is not impossible, albeit improbable. This means the expected weekly variation of the new CHI score is roughly ~60 points on average. On top of the ~50-point baseline change we discussed in the previous section, the increased volatility means that most communities could expect a total change of ~110 points on average.
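One plausible reading of the week-to-week volatility statistic is the standard deviation of the week-over-week changes in a community's score; the post doesn't give the exact definition, so take this sketch as illustrative only:

```python
def week_to_week_std(weekly_scores):
    """Population standard deviation of week-over-week changes in one
    community's chronological weekly CHI series.
    (Hypothetical definition; the post doesn't specify the formula.)"""
    diffs = [b - a for a, b in zip(weekly_scores, weekly_scores[1:])]
    n = len(diffs)
    mean = sum(diffs) / n
    return (sum((d - mean) ** 2 for d in diffs) / n) ** 0.5
```

A perfectly steady weekly drift yields zero under this definition, since every week-over-week change is identical.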
It is precisely because of this increased volatility that we back-filled the CHI score for ~20 months, so you can see and understand how the new CHI scores compare to the old. The CHI algorithm has changed significantly, so the change in the final score is nonlinear and non-trivial. You shouldn't expect the new CHI score to be a simple transformation of the old.
The Distribution of CHI Scores
The original CHI score was not only designed to scale between 0 and 1000, it was also engineered so that the average across all the communities is ~500. Although we tried to minimize the deviation of the new CHI score from the old, the deviation is not zero. In fact, the minimum-MSE solution still has a MAD of 45.61 points. The question is: does this change the average CHI score?
Many people have asked: what is the average CHI? What is a good score, and what is a bad score? To address these questions, we need to examine the distribution of CHI scores across all of our communities. The figure to the right shows this distribution at a particular point in time (i.e. the first week of November). Clearly, the mean (491.31) and the median (522.46) are both very close to 500, and the interquartile range runs from 312.57 to 647.02.
However, we've already learned that the new CHI score is very volatile. The score can change by as much as 300+ points from one week to the next, with an average weekly variation of ~60 points. So the distribution for a single point in time may not be representative of the entire population. But since we have back-filled the CHI scores for ~20 months, we can examine the cross-community distribution of the temporally averaged CHI scores.
The temporal average eliminates most of the weekly variation due to volatility and gives us a much better estimate of the baseline level of health for our communities. As you can see, this distribution changed only slightly from the previous one. There is a clear concentration toward the mean, since the interquartile range has contracted to 346.33 to 612.80. However, the mean (483.70) and median (495.01) remain fairly close to 500.
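The temporal averaging and quartile summary can be sketched as follows. The nearest-rank quantile rule here is a simplification of whatever interpolation the real analysis uses, and the data shapes are hypothetical:

```python
def temporal_average(history):
    """Average each community's back-filled weekly CHI scores.
    `history` maps a community id to its list of weekly scores."""
    return {c: sum(scores) / len(scores) for c, scores in history.items()}

def quartiles(values):
    """Return (Q1, median, Q3) using a simple nearest-rank rule."""
    v = sorted(values)
    n = len(v)
    def q(p):
        return v[min(n - 1, int(p * n))]
    return q(0.25), q(0.5), q(0.75)
```

Feeding the temporally averaged scores into `quartiles` gives the contracted interquartile range described above: averaging shrinks each community's spread, so the cross-community distribution tightens.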
More importantly, our previous segmentation of CHI scores is still valid:
However, keep in mind that this segmentation is somewhat artificial. The scores fall into a continuous range, not discrete buckets. Moving from 399 to 400 doesn't mean your community has suddenly switched from an unhealthy state to good health. The segmentation is merely a simple guideline, and you should refer to this distribution to see where you really stand.
I hope this gives you a sense of the work that has gone into the development of the new CHI algorithm. It is a non-trivial problem, particularly when all the changes must be made under the constraint of minimizing the deviation from the old CHI. I also hope the data presented in this post gives you a better understanding of the new CHI algorithm. The new CHI score has been live for a month now, and we have back-filled the historical CHI scores under the new algorithm for about 20 months, so you can see (within the LSI app) how it compares to your old CHI scores over time.
I'd love to know what you are seeing out there. Are your observations consistent with the data presented here? How did your new CHI score change? How much volatility are you seeing? Where do you stand compared to the rest of the communities out there?
That’s all for now. Best wishes for the festive season.
Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.