Republicans vs. Democrats: Pareto charts of unduplicated Twitter reach

A couple of days ago I did a little more analysis on Republican and Democratic Congresspeople on Twitter.

Pareto chart showing unduplicated reach for US congress

Towards the end of the post, I realized that the unduplicated reach pareto chart that I’d built would only make sense if the US were a one-party state (or to be fair, if both parties had a single issue that they were united in wanting to promote.)

So — wanting to make this a little more representative — I went back and produced two charts; one showing Republican unduplicated reach (which follows a typical 80:20 distribution)…

Pareto chart showing unduplicated reach of Republican Twitterers in the US Congress

Republicans still outperforming Democrats on TweetCongress

Three weeks ago (and at the prompting of my colleague Eddie Garrett who heads up Porter Novelli DC’s digital team) I mapped out the interconnections between US Congress Tweeters. We’d been working on a Twitter crawler and it seemed like a good opportunity to test things out on a new data set.

This is a follow-up post. Once again it was prompted by a third party: Christie Findlay at Politics Magazine asked whether it would be OK to print a copy of one of the maps in their March edition. I’ve heard that three weeks are a long time in politics, so I thought I’d better run the crawl again just in case. Also I’ve got a new crawler that uses the proper Twitter API (I can see some of your eyes glazing over you know. Just skip ahead when that happens.) I’d tried it out on the Porter Novelli data set, but welcomed a chance to try it on something more meaty.

So yesterday morning before work I ran the crawl. I use the excellent Tweet Congress as my source of information about which congresspeople are on Twitter.

Porter Novelli Twitter folk – the 80/20 rule

Last weekend I posted a chart of Porter Novelli Twitter folk and their followers. If you read it, you’ll recall that I was dissatisfied by what it implied about the collective reach of Porter Novelli twitterers.

The pareto chart should look more like this
Well, thanks to a long-ish train journey to Bolton and back, I was able to fudge a little perl script together to look through the data to find and remove everything other than the first instance of a follower. Let’s make that a little clearer. Let’s say that we’re looking at three Twitter people, Alice, Bob, and Carol. The first thing to do is to see who follows them:

alice     bob       carol
bob       alice     alice
carol     carol     bob
dave      edward    frank
xerxes    william   william
yasmine   xerxes    xerxes
zeus      yasmine
          zeus

Now we need to rank them in order of “who has the most followers” (also known as “popularity” as it happens). Here I’ve done that from left to right. Bob has the most followers and Carol the fewest.

bob       alice     carol
alice     bob       alice
carol     carol     bob
edward    dave      frank
william   xerxes    william
xerxes    yasmine   xerxes
yasmine   zeus
zeus

And finally we go through from left to right removing all followers who have already shown up on someone else’s list.

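bob       alice       carol
alice     bob         [alice]
carol     [carol]     [bob]
edward    dave        frank
william   [xerxes]    [william]
xerxes    [yasmine]   [xerxes]
yasmine   [zeus]
zeus

(Followers in square brackets have already shown up on an earlier list, so they are removed.)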

Bob, being at the top of the list, gets to keep all his followers, which may seem unfair. But it’s not unfair if the question we’re trying to answer is “how do I reach as many people as possible by speaking to as few people as possible?” That is, I’m looking for reach. (Marketing people often express themselves in terms of “reach”, the number of people who are exposed to a message, and “frequency”, the number of times the average person is exposed to that message.)

Looking at the example above, we can see that Alice really delivers an incremental benefit of two new people, and Carol only reaches one new person. That gives us a much better idea of how valuable the most popular person (Bob) really is.
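
The rank-then-dedupe steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the perl script mentioned in the post; the function name `incremental_reach` is made up for the example, and the follower lists are the Alice, Bob and Carol lists from above.

```python
def incremental_reach(followers):
    """Rank accounts by follower count, then keep only first-seen followers.

    `followers` maps an account name to the list of people following it.
    Returns a list of (account, new_followers) pairs in ranked order.
    """
    # Most followers first; ties are broken alphabetically (an assumption,
    # the post does not say how ties are handled).
    ranked = sorted(followers, key=lambda name: (-len(followers[name]), name))

    seen = set()
    result = []
    for name in ranked:
        new = []
        for follower in followers[name]:
            # Keep a follower only the first time they appear anywhere.
            if follower not in seen:
                seen.add(follower)
                new.append(follower)
        result.append((name, new))
    return result


followers = {
    "alice": ["bob", "carol", "dave", "xerxes", "yasmine", "zeus"],
    "bob": ["alice", "carol", "edward", "william", "xerxes", "yasmine", "zeus"],
    "carol": ["alice", "bob", "frank", "william", "xerxes"],
}

for name, new in incremental_reach(followers):
    print(name, len(new), new)
```

Run on the three lists above, this keeps all seven of Bob's followers, two new people for Alice (bob and dave) and one for Carol (frank), which matches the worked example.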

Applying this to the Porter Novelli data set

Clearly it would be extraordinarily boring to perform the process described above by hand for the 205 people in the Porter Novelli data set that I want to analyse. But the analysis script that I wrote (with plenty of help from the perl monks) goes through exactly these steps. It’s a pretty straightforward job: ranking and deduping. Here’s what we get.

Pareto chart showing unduplicated reach among Porter Novelli Twitter Users

This makes much more sense than the last run. According to the Pareto principle, roughly 80% of the effects should come from 20% of the causes. Here we see that 20% of the Porter Novelli Twitter users (marked in black) account for slightly more than 80% of the reach (marked in red). It’s pretty much a textbook example. Things are as they should be, I suppose.
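
For anyone who wants to check the 80/20 claim on their own numbers, here is a minimal sketch. It assumes you already have the incremental (unduplicated) reach of each user, listed in ranked order; the function name `head_share` and the sample figures are invented for the illustration and are not the Porter Novelli data.

```python
def head_share(ranked_values, head_fraction=0.2):
    """Share of the total contributed by the first `head_fraction` of entries.

    `ranked_values` must already be in chart order (most popular user first).
    """
    total = sum(ranked_values)
    if total == 0:
        return 0.0
    # At least one entry always counts as the head.
    head_size = max(1, round(len(ranked_values) * head_fraction))
    return sum(ranked_values[:head_size]) / total


# Made-up incremental reach for ten users, most popular first.
example = [50, 30, 5, 4, 3, 3, 2, 1, 1, 1]
print(f"{head_share(example):.0%} of reach comes from the top 20% of users")
```

With these invented figures the top two of ten users account for 80 of 100 followers, so the sketch reports 80%.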

More to the point, we can now assign appropriate value to coverage at the head of the graph. This is of great value when thinking about our media planning and engagement.

By the way — if you’d like a copy of either the Twitter follower API query engine (it’s a well-behaved command-line thing that was developed by the excellent Joachim Larsen) or the slightly shonky perl script that I wrote on the train, you have only to ask: I’ll be pleased to share. Send me a tweet at @mediaczar and I’ll send you the scripts.

Porter Novelli Twitter folk ranked by number of followers

Yesterday I did a little work with the TwitterCounter API. Today I’ve gone a little further and (purely as an experiment) ranked a list of Twitter people in Porter Novelli by the number of their followers.

What happens if we chart this? Here’s a kind of Pareto chart showing users ranked in order of followers and the total reach that we get at each stage.

Porter Novelli Twitter people ranked by #followers

If you’ve seen this kind of thing before, it looks wrong, doesn’t it? That red curve should be steeper at the beginning and have a longer, flatter tail. If you’ve ever heard of the 80/20 rule, this is one of the graphs that describes it. Normally the head of the graph (the first 20% of the x-axis) controls around 80% of the value while the tail (the remaining 80% of the x-axis) controls around 20% of the value. If you’ve ever heard about the long tail, it’s this tail that Chris Anderson et al. are talking about.

What’s wrong with the data?

It’s not so much the data as what I’ve not done with it. There must be many, many duplicated connections here. So now I need to write something that will go through the followers of all the Porter Novelli Twitter usernames in ranked order, and only count unique (or unduplicated) followers.
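
To see why duplicated connections distort the chart, compare a naive sum of follower counts with a count of unique followers. The sketch below uses three made-up accounts with overlapping follower lists; the names are purely illustrative.

```python
# Hypothetical follower lists with plenty of overlap.
followers = {
    "alice": ["bob", "carol", "dave", "xerxes", "yasmine", "zeus"],
    "bob": ["alice", "carol", "edward", "william", "xerxes", "yasmine", "zeus"],
    "carol": ["alice", "bob", "frank", "william", "xerxes"],
}

# Naive reach: add up every follower count, duplicates and all.
naive_total = sum(len(names) for names in followers.values())

# Unduplicated reach: each person is counted once, however many
# of the accounts they follow.
unique_total = len(set().union(*followers.values()))

print("naive:", naive_total, "unduplicated:", unique_total)
```

Here the naive total is 18 but only 10 distinct people are reached, so adding up follower counts overstates reach by a wide margin.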

I’m hoping that when I re-do the chart, it will look something more like this:

The pareto chart should look more like this

Map of Porter Novelli Twitter folk on 20th Jan 2008

Three days after my last map, and after lots of internal nudging from our CMO Marian Salzman, her two helpers Tikva Morowati and Zeenat Duberia and local activists like Juriaan Vergouw, Burçu Kaptan, and Umut Ersoy, the map of Porter Novelli people on Twitter looks very different. (You can click on any of the maps in this post to go to their Flickr page where you can choose to see them at larger sizes.)

Map of Porter Novelli Twitter folk on 17th Jan 2008

Map of Porter Novelli people on Twitter 17 jan

Marian Salzman (our Global CMO here at Porter Novelli) has had the inspired idea of getting people in the agency to tweet about the most exciting story this week (probably): the inauguration of Barack Obama.

You can see the results of the experiment on her blog.

I’m all for this, of course, for several reasons:

  1. It gets new people onto Twitter
  2. It helps us create a stronger network among Porter Novelli twitterers
  3. It means I can track who at the agency is on Twitter


Map of US Congress twitter folk

This is a map of the US congressmen and women who are currently on Twitter (you can click it to see a bigger map where you can read the names). The direction of the arrows shows who follows whom, and the size of the blobs indicates how “popular” a given congressperson is among their twittering peers (where “popular” means something like “is followed by many of their peers”). Colours indicate party affiliation (for those of you who, like me, don’t live in the States and who, like me, need reminding from time to time, the Democrats are the blue dots).

Network of US Congress twitterers showing citation frequency. Click for bigger.
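
The “popularity” measure described above (how many peers within the group follow a given person) is what network people call in-degree. A minimal sketch, assuming the crawl gives you a list of (follower, followed) pairs; the function name `peer_popularity` and the account names are hypothetical.

```python
from collections import Counter


def peer_popularity(follows, members):
    """Count, for each member, how many other members follow them."""
    members = set(members)
    counts = Counter({member: 0 for member in members})
    # set() drops repeated edges so nobody is counted twice.
    for follower, followed in set(follows):
        # Only relationships inside the group count towards popularity.
        if follower in members and followed in members and follower != followed:
            counts[followed] += 1
    return counts


members = ["rep_a", "rep_b", "rep_c"]
follows = [
    ("rep_b", "rep_a"),
    ("rep_c", "rep_a"),
    ("rep_a", "rep_b"),
    ("outsider", "rep_c"),  # ignored: the follower is not in the group
    ("rep_b", "rep_a"),     # ignored: duplicate edge
]

popularity = peer_popularity(follows, members)
print(popularity)
```

In a map like this one, the size of each blob would then be scaled by these counts.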


A cursory glance at this map shows a few things:

Some Twitter Social Network Analysis

On November 10th, Stephen Davies collected together a list of “UK PR people on Twitter”. According to PostRank, this and his earlier post, “UK Journalists on Twitter”, are the most popular posts on his blog.

Then a couple of days later, Stephen Waddington pushed that list through TwitterGrader to come up with his list of “Top 50 UK PR people by Twitter influence”.

A couple of weeks ago, I was looking for a seed list with which I could test our “whitelist” and “canonify exception” rules on Rufus (the network analysis tool that Porter Novelli has been working on for the past six months). This isn’t the right place to go into it, but to put it simply, the whitelist restricts the search to domains that are on the list (like a guest list), and the canonify exception list stops Rufus from chopping the subdomains or directories off the entries on that list (without this, a site like sethgodin.typepad.com would just show up as typepad.com, and en.wikipedia.org/wiki/Social_network_analysis would show up as wikipedia.org). Rufus, by the way, is named after the George Carlin character in Bill & Ted’s Excellent Adventure.
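
To make the two rules a little more concrete, here is a rough sketch of how a whitelist and a canonify exception list might work together. This is not Rufus's code: the function names `canonify` and `is_whitelisted` are invented, and reducing a host to its last two labels is a simplifying assumption (it would get domains like example.co.uk wrong).

```python
from urllib.parse import urlsplit


def canonify(url, exceptions=()):
    """Reduce a URL to a bare domain unless it matches an exception entry."""
    # urlsplit only recognises the host if the URL has a "//" prefix.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = (parts.hostname or "").lower()
    full = host + parts.path.rstrip("/")

    # Exception entries keep their subdomain and/or directory.
    for entry in exceptions:
        if full == entry or full.startswith(entry + "/"):
            return entry

    # Assumption: the "domain" is simply the last two labels of the host.
    return ".".join(host.split(".")[-2:])


def is_whitelisted(url, whitelist, exceptions=()):
    """True if the canonified URL is on the guest list."""
    return canonify(url, exceptions) in whitelist


exceptions = ["sethgodin.typepad.com"]
whitelist = {"sethgodin.typepad.com"}

print(canonify("sethgodin.typepad.com"))              # chopped to the domain
print(canonify("sethgodin.typepad.com", exceptions))  # kept whole
print(is_whitelisted("http://sethgodin.typepad.com/some-post", whitelist, exceptions))
```

Without the exception entry, the blog collapses into typepad.com and would fail the whitelist check; with it, the blog keeps its own identity.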

My colleague Tim Hoang used to work with Stephen W., so he sent him the image. Wadds then posted “the map on his blog”. My Flickr page has never had so much activity.

Here’s the original graph:

High network density in twitter UK PR community

Lots of people started drawing conclusions about the nature of PR, or the nature of Twitter from the graphs. There was lots of interesting speculation. Some people thought that this demonstrated how introverted the twitter crowd is. Others thought that it showed how introverted the PR/Social media crowd is. Others seemed to think that it didn’t matter.

Mapping the social graph of weight loss groups

These are the graphs from some research on weight-loss groups on Facebook. I’ve processed the data so that:

  1. the size of each dot is related to "total number of friends" – this only works where a user’s friends are publicly visible – quite often they aren’t, and I haven’t checked how common this privacy setting is, either generally or within these groups
  2. all isolates (i.e. those users with no relationship to any other within the group) have been removed.

    personal weight loss support group

    This is the network graph of relationships on a personal weight loss support group. A college student set this up to support her own goals. She told me: "For my group, I just started it out by inviting all of my friends and then some people joined the group who found it in a search, I think. I am amazed by the amount of support I receive from random people who encourage me to keep on going. There are some spammers on the group who are just there trying to sell stuff and that gets annoying, but I know I can’t avoid them."

    unofficial weightwatchers support group

    This is the network graph of relationships on an unofficial weightwatchers group on Facebook. You can see that there are hardly any member-get-member relationships here. My friend Valery Yakubovich (who has a professorship in this sort of thing at Wharton) says:
    "It’s very common that organizations and interest groups become foci for personal networks. In fact, I believe that joint activities are the prevalent mechanism of tie formation."

    But it doesn’t look like it here. Looks to me that – while people may form relationships around special interests – they don’t mirror these on Facebook. Say I suffer from Meniere’s Disease (apparently true) and I participate in a Meniere’s support forum (not true at present), I don’t necessarily make those people my Facebook friends…

    blog-related support group

    Another example of the "not many personal relationships" graph for a weight loss support group on Facebook. This time, it’s the Facebook adjunct of a popular weight loss blog.

    How do people get information on weight loss? After a few interviews, I think the answer is like this:

    1. Influencers are "pull", rather than "push" resources (I’m thinking of going on a particular product, so I mention it casually to several friends to gauge consensus/temperature. One or more of them tell me "oh yes, I’ve heard of that", and one tells me "yes, my friend tried that, and lost 20lbs"). This is not an active market. Most people won’t be evangelising, and evangelising behaviour may even appear suspicious.
    2. That said, people trust strangers to an extraordinary degree. Friend-of-friend endorsement is readily accepted, as is the anonymous commentary on boards & groups. Bloggers are slightly less trustworthy, it seems – because most of them have an axe to grind.