Welcome back. Last week I attended my 5th LiNC, Lithium’s annual customer conference. It was another overwhelming and overstimulating experience.
Since it is Friday and it’s late, I’m going to post something simple and fun today. As there are already many nice recaps of the conference (see below), I will just post a few data points in the form of pictures (or infographics) to give you a sense of the types of conversations and interactions that took place on Twitter during LiNC14.
I usually like to collect tweets during great conferences that I’ve participated in, so I can visualize the interactions among the attendees and the greater community. So I collected tweets mentioning #LiNC14 over the last weekend and did some very simple analysis on them.
Who’s Tweeting at #LiNC14?
Here is a histogram of tweet frequencies for individuals who have tweeted #LiNC14 during the conference.
Don’t see yourself? Click on the image above to download a hi-res version.
You may say, “Wait, that’s just a tag cloud.” Yes, it is indeed! A tag cloud is simply a creative way of showing a histogram. The exact same data could have been shown using a more traditional histogram, as shown here:
The tag clouds simply map the tweet frequency data to the font size, so you can get a more visual indication of the numerical tweet frequency data. Since numerical data typically requires more cognitive power to process than visual data, tag clouds are often more intuitive and easier to understand than traditional histograms. Thanks to Tagxedo for the cool layout.
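To make the mapping from tweet frequency to font size concrete, here is a minimal sketch in Python. The handles are made up, and the linear scaling between `min_pt` and `max_pt` is only an assumption for illustration; it is not necessarily how Tagxedo sizes its words.

```python
from collections import Counter

# Hypothetical sample: one entry per collected tweet, holding the author's handle.
tweet_authors = ["@alice", "@bob", "@alice", "@carol", "@alice", "@bob"]

def tweet_frequencies(authors):
    """Count how many tweets each handle posted."""
    return Counter(authors)

def font_sizes(freqs, min_pt=10, max_pt=72):
    """Linearly map each count onto a font size between min_pt and max_pt."""
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        handle: min_pt + (count - lo) * (max_pt - min_pt) / span
        for handle, count in freqs.items()
    }

freqs = tweet_frequencies(tweet_authors)
sizes = font_sizes(freqs)

# The same data as a traditional (text) histogram, most active handle first.
for handle, count in freqs.most_common():
    print(f"{handle:<8} {'#' * count}  ({count} tweets, {sizes[handle]:.0f} pt)")
```

Both views come from the same `freqs` counter; only the visual encoding (bar length versus font size) differs.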
Who Got the Most Digital Love on Twitter?
Here is another histogram (word cloud) showing how often people were retweeted or mentioned in tweets mentioning #LiNC14.
Again, you can click on the image to get a hi-res version.
As you can see, Jonah Peretti (@Peretti), our keynote speaker, was retweeted and mentioned a lot even though he wasn’t tweeting about #LiNC14 himself (he didn’t show up in the previous word cloud).
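Counting retweets and mentions boils down to pulling the @handles out of each tweet's text. The sketch below assumes tweets are available as (author, text) pairs and uses invented sample data; the regular expression is a simplification of Twitter's real handle rules.

```python
import re
from collections import Counter

HANDLE = re.compile(r"@(\w+)")  # simplified pattern for a Twitter handle

# Hypothetical sample tweets as (author, text) pairs.
tweets = [
    ("alice", "RT @keynote_speaker: community first #LiNC14"),
    ("bob", "Loved the talk by @keynote_speaker at #LiNC14"),
    ("alice", "Thanks @bob for the chat #LiNC14"),
]

def mention_counts(tweets):
    """Count how often each handle is retweeted or mentioned in the tweet text."""
    counts = Counter()
    for _author, text in tweets:
        counts.update(handle.lower() for handle in HANDLE.findall(text))
    return counts

mentions = mention_counts(tweets)
authors = {author.lower() for author, _text in tweets}

# Accounts that received attention without tweeting the hashtag themselves.
mentioned_but_silent = sorted(h for h in mentions if h not in authors)
print(mentions.most_common())
print(mentioned_but_silent)
```

An account in `mentioned_but_silent` would appear in this word cloud but not in the previous one, which is the pattern described above.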
What Are People Talking About at #LiNC14 on Twitter?
To answer this question, I had to do a little more work (actually not that much, it’s really easy). I simply stripped away all the #hashtags and Twitter @handles from the tweets I collected and processed the remaining text as I did earlier. This is because #hashtags and Twitter @handles often bias the result, since they are not actually part of our natural language (Recall Jimmy Fallon’s Late Night show with Justin Timberlake on #Hashtag? If you haven’t seen it, you are missing out on a good LOL).
Since #hashtags and Twitter @handles are sort of an artifact of the Twitter platform, when I want to understand the content of what people are talking about I usually cleanse the data by stripping them away. Otherwise, we won’t get as much insight. Most likely we will get #LiNC14 and @Peretti as the most frequently mentioned terms, which we already knew (not insightful). Remember, insights are information that you don’t already know.
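The cleansing step can be sketched in a few lines of Python. The sample tweets are invented, and the small stopword list is my own addition for illustration (the post only describes stripping #hashtags and @handles).

```python
import re
from collections import Counter

HASHTAG_OR_HANDLE = re.compile(r"[#@]\w+")
# Assumption: a tiny stopword list so filler words do not dominate the counts.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "at", "is", "in", "for", "with", "was", "rt"}

def strip_platform_tokens(text):
    """Remove #hashtags and @handles, which are artifacts of the platform."""
    return HASHTAG_OR_HANDLE.sub(" ", text)

def word_frequencies(texts):
    """Count the remaining natural-language words across all tweets."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", strip_platform_tokens(text).lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts

# Hypothetical sample tweets.
sample = [
    "RT @speaker: Community is the new marketing #LiNC14",
    "Great community session at #LiNC14 with @host",
    "#LiNC14 keynote was great",
]

counts = word_frequencies(sample)
print(counts.most_common(3))
```

Without the stripping step, the conference hashtag would be the top term in every run, which tells us nothing new.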
Click on the image to get a hi-res version.
How are People Interacting on Twitter?
The above should give you a high-level overview of what went on during LiNC14 (on Twitter). We must be careful not to overgeneralize beyond the scope of our data. But what if you want something more granular? I often like to create a Twitter interaction graph showing who interacted with whom, whether through retweets, replies, or mentions. This is the first step of social network analysis. I also calculated various graph metrics, from something as simple as degree centrality to more complex ones such as PageRank, clustering coefficient, etc.
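For readers who want to try this, here is a minimal, dependency-free sketch of an interaction graph with degree centrality and PageRank. The edge list is invented, and the damping factor of 0.85 and the fixed number of iterations are conventional choices I am assuming here, not details of the analysis behind the picture.

```python
# Hypothetical interactions: (source, target) means that source retweeted,
# replied to, or mentioned target.
interactions = [
    ("alice", "speaker"), ("bob", "speaker"), ("carol", "speaker"),
    ("alice", "bob"), ("bob", "alice"),
]

def build_graph(edges):
    """Return {node: set of nodes it points to}; every node gets an entry."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    neighbors = {node: set(out) for node, out in graph.items()}
    for src, out in graph.items():
        for dst in out:
            neighbors[dst].add(src)
    others = max(len(graph) - 1, 1)
    return {node: len(nb) / others for node, nb in neighbors.items()}

def pagerank(graph, damping=0.85, iterations=100):
    """Power iteration; rank held by nodes with no out-links is spread evenly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in graph[src]:
                new_rank[dst] += damping * rank[src] / len(graph[src])
        rank = new_rank
    return rank

graph = build_graph(interactions)
centrality = degree_centrality(graph)
rank = pagerank(graph)
for node in sorted(rank, key=rank.get, reverse=True):
    print(f"{node:<8} pagerank={rank[node]:.3f} degree={centrality[node]:.2f}")
```

Edges point from the person who tweeted to the person they engaged with, so rank flows toward accounts that attract attention.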
In this visualization, I’ve mapped the PageRank of users on this small interaction graph to the size of their Twitter avatars: bigger avatars correspond to higher PageRank. Again, I’ve mapped numerical data (i.e. PageRank) to something more visual (i.e. avatar size) to give a more intuitive understanding of the data. There are probably hundreds of visualizations I can create from this fun little data set. If you are interested, the data is here. Create some visualizations and have some fun. You may find something interesting and surprising.
I also clustered the graph, so people who interacted a lot are grouped together, shown through the same font color for their Twitter handles. This is a rough indication of the degree of relatedness between the various Twitter accounts. For example, @KatyKeim is right next to @LithiumTech, and @Peretti is next to @BuzzFeed, because these accounts are highly related.
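One very rudimentary way to group accounts that interacted a lot is to keep only the strong ties and then take connected components. The sketch below does exactly that; it is a stand-in I chose for illustration, not the clustering algorithm used for the picture, and the accounts, counts, threshold, and palette are all hypothetical.

```python
# Hypothetical number of interactions between pairs of accounts (undirected).
interaction_counts = {
    ("brand", "employee"): 5,
    ("speaker", "media"): 4,
    ("employee", "fan"): 3,
    ("fan", "speaker"): 1,  # weak tie, dropped by the threshold below
}

def cluster(counts, min_interactions=2):
    """Group accounts linked by pairs with at least min_interactions."""
    adjacency = {}
    for (a, b), n in counts.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if n >= min_interactions:
            adjacency[a].add(b)
            adjacency[b].add(a)
    clusters, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        group, stack = [], [start]
        while stack:  # depth-first walk over the strong ties
            node = stack.pop()
            group.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(group))
    return clusters

PALETTE = ["red", "blue", "green", "orange"]  # one font color per cluster

clusters = cluster(interaction_counts)
colors = {
    handle: PALETTE[i % len(PALETTE)]
    for i, group in enumerate(clusters)
    for handle in group
}
print(clusters)
print(colors)
```

Raising `min_interactions` splits the graph into smaller, tighter groups; lowering it merges them, which is one reason results from such a simple method should be read as a rough indication only.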
Click on the image to get a searchable pdf version.
Graph clustering is not a trivial problem for large graphs (fortunately this graph is not large at all). This problem is similar to the math challenge behind the design of the 9/11 memorial: trying to arrange the 2,983 victims not alphabetically or by company, nation, or hometown, but by social and personal connections. At first glance, the position of each Twitter avatar seems random, but it is not. Related avatars should be positioned close together and should have the same color for their Twitter handles. Moreover, you need to do that in a way that is aesthetically pleasing.
I didn’t try to optimize the layout as Jer Thorp (the data artist at the New York Times) did with his adjacency algorithm, which was used in the design of the 9/11 memorial, because that is actually a very computationally intensive process, and I’m only doing this for fun. I’ve provided a pdf of the infographic, where you can search for your Twitter handle and see if you are indeed placed close to other people related to you in some way. Then you can tell me how poorly I’ve done with the rudimentary clustering algorithm I used.
Finally, if you want the more typical recaps, whether it’s in text or video, we have them, too. Just visit the links below. There are probably more coming next week.
A picture is worth a thousand words. This is one of those fun posts sharing some fun stuff I do over the weekend out of curiosity. Along the way, I hope you learned something, whether it’s about analyzing Twitter data, data visualization, the challenge of data clustering, or simply about another amazing year of LiNC.
Next time, I’d like to revisit the topic of gamification because some people have expressed interest in that topic and commented on one of my earlier blog posts on intrinsic motivation. So let me know what you’d like to see here and stay tuned.
Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.