The 90-9-1 Rule in Reality

MikeW
Lithium Alumni (Retired)
15 years ago

Dr. Michael Wu, Ph.D. is Lithium's Principal Scientist of Analytics, digging into the complex dynamics of social interaction and online communities.

He's a regular blogger on the Lithosphere and previously wrote in the Analytic Science blog.

You can follow him on Twitter at mich8elwu.


 

If you've ever managed a community, you've probably heard of the "90-9-1 rule". If you have observed a community closely, you have probably seen it in action.

Soon after a community launches, users begin to participate, but each user participates at a different rate. The minute difference in participation levels is accentuated over time, leading to a small number of hyper-contributors in the community who produce most of the community content.

The 90-9-1 rule simply states that:

  • 90% of all users are lurkers. They read, search, navigate, and observe, but don't contribute
  • 9% of all users contribute occasionally
  • 1% of all users participate a lot and account for most of the content in the community

But how real is this rule? Do all communities follow this rule consistently? If not, how far off is the deviation? Is the proportion really 90:9:1, or is it more like 70:25:5, or 80:19.99:0.01? Let's find out...

Lithium has accumulated over 10 years of user participation data across 200+ communities, so we can address this question empirically with rigorous statistics. Rather than complicating the issue with the lurkers, I chose to analyze only the contributors (i.e. the 9% occasional-contributors and the 1% hyper-contributors). According to the 90-9-1 rule, the proportion between these two groups of participants should be 9:1, or equivalently 90:10.

The 9-1 Part of the 90-9-1 Rule

So the 90-9-1 rule excluding the lurkers says that:

  • 90% of the contributors (which is 9% of all users) are occasional-contributors.
  • 10% of the contributors (which is 1% of all users) are hyper-contributors, who generate most of the community content.

What does the data tell us? On average, the top 10% of contributors (the hyper-contributors) generate 55.95% of the community content, and the rest of the 90% (the occasional-contributors) produces the remaining 44.05% of the content.
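As a rough illustration of how a figure like this can be computed for a single community, here is a minimal sketch. The function name `top_share`, the rounding of the cutoff to the nearest whole contributor, and the sample post counts are all assumptions made for illustration; they are not Lithium's actual analysis code or data.

```python
def top_share(post_counts, top_fraction=0.10):
    """Share of all content produced by the most active `top_fraction` of contributors.

    `post_counts` holds one number per user: how many posts that user made.
    Users with zero posts are lurkers and are ignored here.
    """
    counts = sorted((c for c in post_counts if c > 0), reverse=True)
    if not counts:
        raise ValueError("no contributors with at least one post")
    # Size of the top group, rounded to the nearest whole contributor (assumption).
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / sum(counts)


# Made-up community: 10 contributors, 100 posts in total, one very active user.
example = [50, 10, 8, 7, 6, 5, 5, 4, 3, 2]
print(f"{top_share(example):.2%}")  # -> 50.00%
```

Averaging this per-community value over all communities would give a number comparable to the 55.95% above.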

With my statistician hat on, you know I can't possibly be satisfied with just the average! So I plotted the distribution of content contributed by occasional-contributors versus the hyper-contributors across all communities. The standard deviation is 13.02%.

Please note: The reason you only see 143 communities here is that I've excluded communities that are less than 3 months old (these communities are too young for their participation dynamics to be stable enough for the analysis).

As you can see from the data, the hyper-contributors can contribute anywhere from about 30% to nearly 90% of the community content, with an average of 55.95%. This is certainly a substantial percentage (considering the fact that it is generated by only 10% of the contributors), so the 90-9-1 rule "sort of" holds. But, to be rigorous, it depends on what you mean by "most" of the community content.

If "most" meant at least 30% of the community content, then the 9-1 part of the 90-9-1 rule holds for 99.30% of our communities. If you meant at least 40% of the community content, then 89.51% of our communities satisfy this rule. But if "most" meant at least 50% of the community content, then only 65.73% of our communities are described by this rule.
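To make the threshold check concrete, here is a small sketch. The function name `fraction_meeting` and the list of shares are made up for illustration. The community counts in the last loop (142, 128, and 94 out of 143) are back-calculated from the percentages above, not figures reported separately.

```python
def fraction_meeting(shares, threshold):
    """Fraction of communities whose top-10% content share is at least `threshold`."""
    if not shares:
        raise ValueError("need at least one community")
    return sum(1 for s in shares if s >= threshold) / len(shares)


# Hypothetical top-10% content shares for six communities (not real data).
shares = [0.28, 0.35, 0.45, 0.56, 0.62, 0.88]
for threshold in (0.30, 0.40, 0.50):
    print(f"at least {threshold:.0%}: {fraction_meeting(shares, threshold):.2%}")

# Counts out of 143 communities that reproduce the percentages quoted above.
for n in (142, 128, 94):
    print(f"{n}/143 = {n / 143:.2%}")
```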

Turning the Problem Around

This gives us a convenient spot to turn the problem around and look at the 90-9-1 rule from another perspective. We can define rigorously what "most" means (e.g. at least 30% of the community content), then calculate the fraction of contributors who generated this content and treat them as the hyper-contributors. We can then compare and see how far off we are from the expected ratio of 9:1.
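One way to sketch this inverted calculation for a single community: rank contributors from most to least active and take the smallest top group whose posts add up to the chosen share of content. The function name `hyper_fraction` and the sample post counts are assumptions for illustration, not the actual analysis code.

```python
def hyper_fraction(post_counts, content_threshold):
    """Smallest fraction of contributors, taken from the most active downward,
    whose combined posts reach `content_threshold` of all content."""
    counts = sorted((c for c in post_counts if c > 0), reverse=True)
    if not counts:
        raise ValueError("no contributors with at least one post")
    target = content_threshold * sum(counts)
    running = 0
    for k, c in enumerate(counts, start=1):
        running += c
        if running >= target:
            return k / len(counts)
    return 1.0  # only reached if content_threshold is greater than 1


# Made-up community: 10 contributors, 100 posts in total.
example = [50, 10, 8, 7, 6, 5, 5, 4, 3, 2]
print(hyper_fraction(example, 0.30))  # one user covers 30% -> 0.1
print(hyper_fraction(example, 0.70))  # four users are needed -> 0.4
```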

Averaging across 143 communities, we see that if we define "most of the community content" to be "at least 30% of the total content," then the fraction of participants who contributed this amount ranges from 0.32% to 5.14% with an average of 2.73%. That means, on average, hyper-contributors make up roughly 2.73% of the contributing population, so the remaining 97.27% of the participants are occasional-contributors. The ratio of occasional- to hyper-contributors is then about 97:3, far from the expected value of 9:1.

If instead we define "most" to be "at least 40%" of total content, then we get roughly 5.07% hyper-contributors on average across 143 communities. Now the ratio of occasional- to hyper-contributors is about 19:1, which is closer but still quite far off the expected ratio of 9:1.

If we define "most" to be "at least 50%" of the total content, then the group that contributed this amount (which qualifies them to be hyper-contributors) is about 9.35% of the participants. This gives us a ratio that is much closer to the expected value of 9:1 on average. However, the variability is also very large. Even under this simple criterion of contributing at least 50%, the fraction of participants who contributed this amount may vary from less than 1% to ~18% of the participants. That means the ratio of occasional- to hyper-contributors may be anywhere from 99:1 to about 5:1.
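The ratios quoted in this section follow directly from the hyper-contributor percentages. Writing the ratio as occasional-contributors to hyper-contributors (so that the rule predicts 9:1), a quick check might look like this; the helper name `occasional_to_hyper` is just an illustrative choice.

```python
def occasional_to_hyper(hyper_pct):
    """Occasional-contributors per hyper-contributor, given the percentage
    of contributors who count as hyper-contributors."""
    if not 0 < hyper_pct < 100:
        raise ValueError("hyper_pct must be strictly between 0 and 100")
    return (100 - hyper_pct) / hyper_pct


# Average hyper-contributor percentages reported for the 30%, 40%, and 50%
# definitions of "most", followed by the 10% that the 90-9-1 rule predicts.
for pct in (2.73, 5.07, 9.35, 10.0):
    print(f"{pct}% hyper-contributors -> {occasional_to_hyper(pct):.1f}:1")
```

This works out to roughly 35.6:1, 18.7:1, 9.7:1, and 9.0:1, so only the 50% definition lands near the rule's 9:1 on average.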

So is 90-9-1 a hard and fast rule? Definitely not! Not even the 9-1 part of it. But it is certainly a great rule of thumb when looking at or explaining community data. And it tells us that participation in communities is highly skewed and unequal, and that a small fraction of hyper-contributors produce a substantial amount of the community content.

Next time I am going to start to dive deeper into the contribution level of the hyper-contributors, your community's real superusers.

  • MikeW
    Lithium Alumni (Retired)

    Hello Heather,

     

    Thanks for the comment. This calls for another blog post analyzing participation data under different segmentations of communities. I am already thinking of the following:

     

    1. B2B vs. B2C communities.
    2. Support vs. Marketing/Sales vs. Innovation communities.
    3. Maybe even some broad, coarse segmentation of industries.
    4. First-year vs. older communities.

     

    Let me know what else you'd like to see.

  • Dr. Wu,

     

    Yes, I meant 65% - thanks for catching that. 

     

    Your points are well made. When I considered the 65%, it was the percentage of registered users who had posted at least once. I agree that if one were to consider all the unregistered user sessions in the whole of the community (which would greatly increase the denominator), then the percentage of contributors and hyper-contributors would shrink significantly. I elected not to look at it that way for two reasons:

     

    1) I wanted to restrict my consideration to those who had taken the time to register an ID and log in, taking the first step to joining the community and in some way formally demonstrating their intent to do so.

     

    2) As registered users (myself included) may visit and browse the community at times without logging in, I considered that I might be double counting these actions, and incorrectly attributing them to the pool of lurkers since there is no way to know without the person having logged in, or some complex IP address lookup, whether or not they were already a member in the community, and whether or not they had posted at least once.

     

    I did my "back of the envelope" analysis based on a long-term cumulative view of the community: all users over time. I played a bit with the date function on the social graph tool, went back to the beginning of the community, and browsed through various points in time over several years. I could see some of the super contributors on the graph, and a few have fallen off if they left the community, having not contributed enough to make it into the all-time top 20, as seen in the cumulative view.

     

    So, through the aperture of a point in time, the percentage contributions could change quite a bit depending on the overall size of the community and the number and characteristics of those hyper-contributors. Both views, cumulative and point-in-time, have their own merits. I'm going with cumulative overall when discussing characteristics of our community. However, the point-in-time view can help spot emerging talent and rising stars, so keeping a 30-60-90 day view is handy as well.

     

    Mark

  • MikeW
    Lithium Alumni (Retired)

    Hello Mark,

     

    Thanks again for the discussion.

     

    Yes, that is true: some registered users can still visit the community without logging in. Did you activate the auto-login option in your community? I have it activated, and every time I come back to the community it automatically logs me in, so the times when I visit from another computer and have to log in manually are a very small fraction of the total. Maybe I should look at how many registered users actually leave this option on or turn it off.

     

    I totally agree. Certainly both the cumulative and the point-in-time summaries have their own merits. But the subject of academic debate is whether the 90-9-1 rule pertains to the cumulative or the point-in-time view. And it seems that the point-in-time data fits the 90-9-1 rule better, at least in our data. The data I presented actually uses a 60-day running window computed every week, even though I also analyzed cumulative data as well as other window lengths. Despite that, the precise proportion is still quite far off from 9:1, and if I had presented the cumulative data, the proportion would be even further off.

     

    So the debate is not about whether the cumulative view has any more or less merit. It just means that if you use cumulative data, you shouldn't be surprised that the numbers are way off from the 90-9-1 rule. The delicate balance required to maintain the 90:9:1 proportion is very difficult to achieve, so even if it held at every point in time, it would not hold when looking cumulatively over a large window. And since this calculation is about the 90-9-1 rule, I just want to point that out as a potential pitfall that some may not be aware of.

     

     

  • I have to say I'm very unsure about this 90-9-1 rule. What about users who don't post much but rarely read things ever?

    It's fairly common in communities to have people sign up, ask one question, get their answer, and never be seen again. Or even for them just to create an account but never use it; Facebook is full of such orphaned accounts.

    Concentrating solely on the very active users seems to me to be the only way to make sense of it. Darn the 90; it's just 9-1.

  • MikeW
    Lithium Alumni (Retired)

    Hello Tyr,

     

    Thank you for commenting.

     

    Let me clarify that the 90 usually does not include those orphan accounts or those who asked one question and left. Those people are really transient visitors who are not really part of the community (even though they have an account). The 90 is really the lurkers, who belong to the community by virtue of repeatedly consuming the community content but do not participate.

     

    That being said, a lot of people who analyze this simply take the registered users with no posts as the 90 because it is simpler. But that is not correct! You must look at users who have logged in within a certain time period to assess whether they are still part of the community.

     

    Moreover, lurkers include those who repeatedly visit the community for content but never even register. Getting an accurate estimate of the lurker population involves several data sources that need to be distilled. We have all that data; we just need to analyze it and make it presentable. So it is more difficult. When I get around to crunching that data, I will post the results.

     

    This post, however, only focuses on the 9-1 part of the 90-9-1 rule as you've mentioned.

     

  • Our ratio is far from 90-9-1. By the rule, 100,000 unique visitors should yield about 9,000 members, but we actually get only a fraction of that.

     

    Our community is open: all of the content is visible to anyone, so visitors don't have to register. Does this affect the ratio somehow?