Using Amazon Mechanical Turk for Blog Post Research

For a post on Women and Smartphones, I searched for answers myself, hired a researcher, and still was missing data.  A colleague recommended Amazon Mechanical Turk.

According to Amazon Web Services, "Amazon Mechanical Turk is a marketplace for work that requires human intelligence. The Mechanical Turk web service enables companies to programmatically access this marketplace and a diverse, on-demand workforce. Developers can leverage this service to build human intelligence directly into their applications."

Frankly, I wasn't really hoping to "make accessing human intelligence simple, scalable, and cost-effective." I sought the fundamentals of an excellent blog post – information with links to corroborate it.

The lofty, high-tech "human intelligence" terminology of MTurk did make me wonder whether I was registering with the Borg, but the Amazon brand engendered trust, as did the word-of-mouth referral from my colleague.  When I do a swim workout, I don't test the water with a toe.  I jump in.  Online, I jump in, too – after very carefully reading the Terms of Service.  I signed up.


As with iStockphoto, one pays up front for credits and then uses them as desired.

Wikipedia has a thorough description of Amazon Mechanical Turk.  I am so not an expert on Mechanical Turk.  I struggled to learn the user interface, and it continues to make me insane.  The terminology seems obscure to me (a HIT is not an act of violence, but a Human Intelligence Task, translated in my mind as "the thing I want done"), and parts of it I just don’t get.  Ah, well.  Regardless, I’ve figured out how to access the expertise of others to assist me with my work creating blog posts.  I offer my inexpert, laboriously acquired suggestions.

  • Watch the “Get Started” and “Best Practices” videos and read the Best Practices Guide (.pdf). I didn’t.  I got needlessly frustrated and, as warned against in Myst, I thrashed.
  • Create HITs with one question per HIT. I didn’t.  I spent a lot of time creating a multi-question query that never got answered because it wasn't worth a Worker's time.
  • Create the first HIT, publish it, and see what happens. I didn’t.  I was taught the same lesson about the imprecision of my wording several times in rapid succession.
  • Word questions and directions specifically.  My first published HITs returned information different from what I wanted, which taught me that I wasn’t helping MTurk Workers help me.
  • Pay to learn. I paid for tasks completed, even if the information wasn’t quite what I had in mind, because the Workers were answering my question rather than reading my mind.  I've rejected only two tasks, both because even my poorly worded questions were answered incorrectly.
  • Copy the results.  When a task is completed and the results are viewed for the first time, copy and paste the text of the results into a Word document or text file. I’m sure it’s an ID10T error on my part, but my attempts to view the results a second time have failed.  The "Export Results" tab generates an Excel file that contains some kind of data, but not the answer to the question I asked.
  • Let go of not knowing the person doing the work or being able to thank them for a job well done. MTurk interactions are as brief as Twitter tweets, but impersonal, anonymous and, by definition, mechanical. This feels incomplete to me, but I go with it.
  • Be embarrassed to offer pay that is too low. Anonymity still requires integrity.  I was horrified to learn my inexperience resulted in HITs that paid $2.00 an hour. I now weigh the complexity of the HIT and carefully rate it to pay what I would pay someone I know – at least minimum wage.
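
The pay arithmetic in that last suggestion can be sketched in a few lines of Python. This is only a rough sketch, not anything MTurk provides: the function names are made up for illustration, the 3-minute time estimate is a hypothetical figure, and $7.25 is the U.S. federal minimum wage, which you should swap for whatever floor applies to you.

```python
from decimal import Decimal, ROUND_UP

CENT = Decimal("0.01")


def effective_hourly_rate(reward_usd, minutes_per_hit):
    """Hourly pay a Worker would earn if every HIT took the estimated time."""
    reward = Decimal(reward_usd)
    minutes = Decimal(minutes_per_hit)
    return (reward * 60 / minutes).quantize(CENT)


def reward_for_target(hourly_usd, minutes_per_hit):
    """Per-HIT reward, rounded up to the cent, that meets an hourly target."""
    hourly = Decimal(hourly_usd)
    minutes = Decimal(minutes_per_hit)
    return (hourly * minutes / 60).quantize(CENT, rounding=ROUND_UP)


# Hypothetical HIT: a $0.10 reward for about 3 minutes of work.
print(effective_hourly_rate("0.10", "3"))  # 2.00 dollars an hour

# Assumed floor: U.S. federal minimum wage of $7.25 an hour.
print(reward_for_target("7.25", "3"))      # 0.37 dollars per HIT
```

The hard part is the time estimate, not the math. Doing the HIT yourself once, with a timer, is probably the most honest way to get that number.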

Feel free to use this template, which I've devised through MUCH trial-and-error learning.  The question uses variables in hopes of showing how asking the basic questions of journalism can help formulate a specific question.

How many x did y in this locale in this year?

Please answer with a NUMBER, not a percentage.

[Insert text box to collect Worker's answer to question.]
 
Provide the URL of the web site that links to the source of that answer.

Please provide a URL to an original source, not to a secondary source or aggregator of content.  The site you find needs to have links to other sources so I can prove that the number is reliable and accurate.

[Insert text box to collect Worker's link.]

Thank you very much!
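
For anyone who would rather fill in the template's variables with a script, here is a minimal Python sketch. The variable names (who, did_what, where, when) are my own, chosen to mirror the basic questions of journalism, and the example values are only an illustration. Nothing here talks to MTurk; it just builds the text to paste into a HIT.

```python
from string import Template

# The template question, with each variable named after the basic
# question of journalism it answers. These names are illustrative only.
QUESTION = Template("How many $who $did_what in $where in $when?")

INSTRUCTIONS = (
    "Please answer with a NUMBER, not a percentage.",
    "Provide the URL of the web site that links to the source of that answer.",
)


def build_question(who, did_what, where, when):
    """Fill in all four variables; each argument is required."""
    return QUESTION.substitute(
        who=who, did_what=did_what, where=where, when=when
    )


def build_hit_text(question):
    """Join the question and the instructions, separated by blank lines."""
    return "\n\n".join((question,) + INSTRUCTIONS)


question = build_question("women", "used a smartphone", "the U.S.", "2009")
print(build_hit_text(question))
```

Making every variable a required argument is deliberate: a question missing its "where" or "when" is exactly the kind of imprecise wording that cost me time.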

Using Amazon Mechanical Turk definitely feels like being on Star Trek at times, but here's an example of why I treasure it and believe I have only begun to use its power. 

The number of women smartphone users in the U.S. in 2009 seems to be a highly guarded marketing research secret.  Neither I nor my researchers could find the answer anywhere.  An Amazon Mechanical Turk Worker responded to my query:

26,486,000 = 26.486 million|number of women in US 2009 =

155.8 million from this link = http://www.infoplease.com/spot/womencensus1.html

and the users of smartphone 17% from this link =

http://www.marketingcharts.com/interactive/mobile-phones-beat-pcs-for-young-women-9608/srg-technology-most-impact-life-women-june-2009jpg

so " 155.8 million * 17% = 26.486 million "
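
The Worker's arithmetic can be double-checked with a short Python sketch. The two inputs are the figures quoted in the answer above, and the function name is hypothetical. It uses whole-number math so the result is exact.

```python
def estimate_from_percentage(population, percent):
    """Apply a whole-number percentage to a population using integer math."""
    return population * percent // 100


women_in_us_2009 = 155_800_000  # 155.8 million, as quoted by the Worker
smartphone_share = 17           # percent, as quoted by the Worker

estimate = estimate_from_percentage(women_in_us_2009, smartphone_share)
print(f"{estimate:,}")  # 26,486,000
```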

Whoever you are and wherever you are, Mechanical Turk Worker, receiving your intelligent answer felt like an act of humanity.

***

For further reading about Amazon Mechanical Turk:

The New Demographics of Mechanical Turk, Panos Ipeirotis
Some use Mechanical Turk to collect data for studies through surveys
The Ethics of Amazon.com's Mechanical Turk, The Chronicle of Higher Education
Views from a Mechanical Turk Worker and Turking4aLiving

Added 12/26/2010:

Using Mechanical Turk for Research, Michael Ewen, Carnegie Mellon University

Added 2/5/2011:
Mechanical Serfdom is Just That, Bloomberg Businessweek
