Method to get the dates of first posts using Amazon Mechanical Turk

When London PR blogger Melanie Seasons started her blog two and a half years ago, the subject of her first post was her first post from her MySpace blog. In fact, she took most of her content from there as well. She calls her first post “a cop-out first post of another first post”, but I think that she might have spun it as a “metapost”.

In some ways, the post you’re reading now could be another metapost — a post about first posts. But it’s really about new ways of working.

I know about Melanie’s first post because I’ve been carrying out some quantitative research using first posts. I took a user-generated list of UK PR blogs that I helped curate last October, and attempted to identify the date of the first ever post for each blog.

This is a task that’s almost impossible to automate. Getting the newest post is a cinch for a computer – the oldest post not so much. And yet it’s relatively simple for a human to perform the task – generally it’s just boring and repetitive (although I challenge you to find the first post on Jed Hallam’s blog, Rock Star PR). I’m not one of those people who enjoy repetitive tasks, so I decided to take this opportunity to set up the Magic Bean Lab’s first experiment: to test the efficiency of various alternative labour sources.

Method 1: e-lancers

I used Freelancer.com, a well-established e-lance and outsourcing marketplace I’ve used several times in the past. As we get more used to buying things over the web that we can’t see first, the e-lance market has become a no-brainer.

I employed two e-lance researchers. I’ve found that running researchers in parallel on projects like these reduces the need for overmuch error-checking. Quality Assurance (QA) rapidly becomes the biggest overhead in any project like this.

My quick-and-dirty QA process runs as follows:

  • Compare results side by side
  • If the results agree, accept this as the correct answer
  • If the results disagree, do some checking myself.
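As a rough illustration, the comparison step could be sketched in Python like this. The blog names and dates are made up for the example, and `triage` is a hypothetical helper of my own, not something either freelancer or Freelancer.com supplies.

```python
from datetime import date


def triage(results_a, results_b):
    """Split blogs into agreed answers and blogs that need a manual check."""
    accepted = {}
    to_check = []
    for blog, first_post in results_a.items():
        if first_post == results_b.get(blog):
            # Both researchers gave the same date: accept it without checking.
            accepted[blog] = first_post
        else:
            # The dates differ (or one is missing): check this blog by hand.
            to_check.append(blog)
    return accepted, to_check


# Made-up example data: blog name -> reported date of first post.
researcher_a = {
    "blog-one": date(2007, 3, 14),
    "blog-two": date(2008, 6, 1),
    "blog-three": date(2009, 1, 20),
}
researcher_b = {
    "blog-one": date(2007, 3, 14),
    "blog-two": date(2008, 5, 30),
    "blog-three": date(2009, 1, 20),
}

accepted, to_check = triage(researcher_a, researcher_b)
print(f"Agreed on {len(accepted)} of {len(researcher_a)}; check by hand: {to_check}")
```

Note that nothing in this sketch can catch the case where both researchers agree on a wrong date.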

In the event, the two freelancers agreed in 88% of cases, and I only had to check the remaining 12%.

An obvious problem: if both freelancers agree on an incorrect answer, I won’t check it. This happened in approximately 4% of cases during this test (I’m working with a known set of data here, or — of course — I wouldn’t be able to tell that.)

A less obvious problem: it’s fairly time-intensive. I have to post the project, wait for the bids to roll in, assess the bids and so on. The whole process took about 48 hours from start to finish.

Still, it beats doing the work myself, and it’s fairly scalable.

Method 2: Amazon Mechanical Turk

I’ve been noodling around with Amazon Mechanical Turk for a few months now. Mechanical Turk, for those of you who don’t know, is named after a late eighteenth century hoax that — while it purported to be a machine that could play chess — was in fact powered by a person sitting inside a box.

Amazon describes its Mechanical Turk service as follows:

Amazon Mechanical Turk is a marketplace for work that requires human intelligence. The Mechanical Turk service gives businesses access to a diverse, on-demand, scalable workforce and gives workers a selection of thousands of tasks to complete whenever it’s convenient.

So instead of paying a freelancer a fee to perform all the research, I can split the job into its constituent parts, which Amazon calls HITs (Human Intelligence Tasks), and pay a fractional piecework fee (say $0.10) for each blog on the list. This has three advantages:

  1. It’s much faster. Instead of one person working through the list in sequence, the list is processed in parallel;
  2. It’s cheaper. There seems to be a lower limit to the bids on Freelancer.com of around $30 per job; and
  3. It’s extremely scalable. Using Amazon’s API, I can embed humans into automated processes. More on this in later posts, I hope.

The only problem? In the words of one Turker I’ve interviewed, “There are few spammers in mturk who spoil the mturk community.” In other words, I have more QA problems. It seems that people are more inclined to be dishonest when smaller sums and greater anonymity are involved. I’m sure that a behavioural economist like Dan Ariely could explain this in detail — but for the moment, please let’s just accept that most people are more inclined to cheat you out of $0.10 than $100.

So here’s the first thing I tried.

  • Run three Turkers in parallel
  • Accept the earliest date received
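In code, this rule is about as simple as it gets. The sketch below is only an illustration: the dates are invented, `earliest_date` is a hypothetical helper, and I'm assuming a blank answer is represented as `None`.

```python
from datetime import date


def earliest_date(answers):
    """Return the earliest date among the workers' answers, ignoring blanks."""
    valid = [d for d in answers if d is not None]
    return min(valid) if valid else None


# Invented answers from three Turkers for one blog.
answers = [date(2008, 2, 11), date(2007, 11, 3), date(2008, 2, 11)]
print(earliest_date(answers))
```

The weakness is easy to see here: a single answer that is too early wins automatically, even when the other two workers agree with each other.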

Clearly there are problems with this method. But not as many problems as you might think. Results were approximately 80% accurate. And it took less than 2 hours from start to finish: that’s 46 hours faster than the e-lancer method. And remember it’s costing me less, although I’m paying for answers that the method deems to be inaccurate.

But surely I could do better.

Method 3: Amazon Mechanical Turk and “elementary game theory”

To be honest, I probably know less about game theory than you do. I know about the Prisoner’s Dilemma and the Three White Hats (which probably isn’t even game-theoretic, which only goes to show how little I know).

But I’m using “game theory” both very loosely and very specifically to mean “I know a little about how people try to game systems, and I’m going to do my damnedest to use that knowledge to hack their behaviour.”

So here’s what I told the Turkers:

Each HIT is being performed three times, and the results will be checked against each other:

  1. When all three HITs agree, a $0.10 bonus will be paid to all workers.
  2. When only two HITs agree, those two will be accepted as the correct answer. The third result will be rejected.
  3. If all three HITs disagree the requester may consider rejecting all three.
  4. Occasionally, it may be the case that it is very hard for you to find the correct date. We want to make it worth your while to find this. Please note this in the Additional Comments box together with the process you used to discover the date and we will consider paying a discretionary bonus of $0.50 on top of any other bonuses. For particularly hard HITs, therefore, the total potential upside is $0.80

Please, take your time, and get the right answer. It will be worth it for you, and worth it for the other workers performing this task!
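For the curious, here is a minimal sketch of how those rules might be applied to the three answers for a single HIT. It is an illustration under my own assumptions rather than the exact process I ran: `adjudicate` and the worker ids are hypothetical, the discretionary $0.50 bonus is left out because it needs a human decision, and the "all disagree" case is simply flagged, since the rule only says the requester may consider rejecting.

```python
from collections import Counter
from datetime import date

AGREEMENT_BONUS = 0.10  # bonus per worker when all three answers agree


def adjudicate(answers):
    """Apply the agreement rules to one HIT.

    answers maps a worker id to the date that worker submitted.
    Returns (accepted_date, approved_workers, rejected_workers, bonus_each).
    """
    counts = Counter(answers.values())
    top_date, top_count = counts.most_common(1)[0]

    if top_count == len(answers):
        # Rule 1: everyone agrees, so everyone is approved and gets the bonus.
        return top_date, sorted(answers), [], AGREEMENT_BONUS

    if top_count >= 2:
        # Rule 2: the majority answer is accepted; the odd one out is rejected.
        approved = sorted(w for w, d in answers.items() if d == top_date)
        rejected = sorted(w for w, d in answers.items() if d != top_date)
        return top_date, approved, rejected, 0.0

    # Rule 3: no two answers agree. Nothing is accepted automatically;
    # all workers are listed as candidates for rejection, pending review.
    return None, [], sorted(answers), 0.0


print(adjudicate({
    "worker-1": date(2008, 1, 5),
    "worker-2": date(2008, 1, 5),
    "worker-3": date(2009, 7, 30),
}))
```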

My hope was that I would discourage the spammers. Here’s the logic:

Spammers know that they won’t get paid unless at least one other person agrees with them. For this to happen, either:

  1. They’d have to be lucky enough to stumble across the right date by chance (at odds of around 3650:1 for blogs created in the past decade); or
  2. They’d need to be matched by chance by another spammer (same odds).
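The arithmetic behind those odds can be checked in a few lines. This is a back-of-the-envelope sketch and it assumes a spammer picks a date uniformly at random from the past decade, which real spammers almost certainly don't; `chance_of_match` is a name I've made up for the example.

```python
DAYS_IN_DECADE = 10 * 365  # 3650, ignoring leap days


def chance_of_match(days=DAYS_IN_DECADE):
    """Probability that one uniformly random guess hits one particular date."""
    return 1 / days


p = chance_of_match()
print(f"Chance a random guess matches a given date: {p:.4%}")
print(f"Expected lucky matches per 1,000 guesses: {1000 * p:.2f}")
```

On that assumption, a spammer would be paid for fewer than one random guess in every thousand.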

Incidentally, it turns out this “majority rule” is unpopular with the Turkers, mostly because they fear the spammers don’t read the instructions, and will still spoil things for everyone else (which seems like a legitimate and logical strategy for spammers).

Nevertheless, the task was completed in much the same time as the previous method, and with startling results. The answers received were 97% accurate. That’s more accurate than the freelancers (just under 96%), at a lower cost, and in a fraction of the time.

Comparison of the three methods

In the table below, Cost per Response is adjusted for accuracy.

Method                       Cost      Correct responses (accuracy)   Cost per Response
1 (e-lancers)                $61.00    66 (95.7%)                     $0.92
2 (mturk simple)             $51.15    55 (79.7%)                     $0.93
3 (mturk + “game theory”)    $54.92    67 (97.1%)                     $0.82
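The adjusted figure appears to be the total cost divided by the number of correct responses. That reading is my assumption in this sketch, but it does reproduce the numbers in the table above.

```python
# Total cost and number of correct responses for each method, from the table.
methods = {
    "1 (e-lancers)": (61.00, 66),
    "2 (mturk simple)": (51.15, 55),
    "3 (mturk + game theory)": (54.92, 67),
}


def cost_per_correct(cost, correct):
    """Total spend divided by the number of correct responses it bought."""
    return cost / correct


for name, (cost, correct) in methods.items():
    print(f"{name}: ${cost_per_correct(cost, correct):.2f} per correct response")
```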
