Friday, October 9, 2009

Batching Mechanical Turk Jobs at the Command Line

Here at Voxilate, we use Amazon Mechanical Turk for a lot of things - from crowdsourcing translations to gathering market research to deciding where to go for lunch! Some of the jobs are one-time deals that we don't want to spend a lot of time on.

Typically, you can submit, monitor, and approve Mechanical Turk jobs (HITs) using the API or the web interface. The web interface requires a lot of mouse-clicking and .csv file generation, and it really doesn't allow for full control. The API is complex and doesn't lend itself to quick experimentation. There is also a set of command-line tools provided by Amazon, but these just wrap the API and can be a little clunky.

Wouldn't it be nice if you could just type a single line and watch Workers' responses come back in real-time like this:

% mturk -j 3 -t "What is your favorite color?"
blue
blue
red

Or this:

% mturk -t "Who is in this picture?" albert.jpg bea.jpg
Albert Einstein
Bea Arthur

We thought it would be pretty cool, so we created a quick script using Python and the boto library.

How does it work?

Given a command like:

./mturk.py -p .50 -j 10 -t "Is there text in this graphic?" -D "If you see text in this graphic, type \"text.\" If not, type \"no text.\"" -w 300 -A IsThereText01.jpg IsThereText02.jpg

We'd create two distinct HITs, one for each file (JPEGs, in this case). The -j parameter controls the number of assignments we submit for each HIT; in this case, we'd request 10 assignments for each HIT, for a total of 20 assignments. Each assignment would pay out 50 cents (specified by -p .50), and because we used -A for auto-approval, you'd be out $10 if workers accepted and completed all of your HITs!
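To make the arithmetic above concrete, here's a minimal sketch of the worst-case payout calculation. The function name max_payout is our own for illustration (it is not part of the script), and the figure ignores any fees Mechanical Turk charges on top of worker payments.

```python
from decimal import Decimal


def max_payout(num_files, assignments_per_hit, price_per_assignment):
    """Upper bound on worker payments: one HIT per file, -j assignments each.

    Hypothetical helper for illustration; excludes any Mechanical Turk fees.
    """
    # Decimal avoids float rounding surprises when dealing with money.
    price = Decimal(str(price_per_assignment))
    return num_files * assignments_per_hit * price


# The example command: 2 files, -j 10, -p .50
total = max_payout(num_files=2, assignments_per_hit=10, price_per_assignment="0.50")
print("Assignments: %d, maximum payout: $%.2f" % (2 * 10, total))
```

You only hit that ceiling if every assignment is accepted, completed, and approved.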

By default, the script uses the Mechanical Turk sandbox so that you can verify that your usage is correct before spending real money. The -w option makes the script wait up to the specified number of seconds for HIT results, print them to standard out, and then exit.
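If you're curious how flags like these might be wired up, here's a rough sketch using Python's argparse. This is not the script's actual source: the dest names and defaults are our own guesses for illustration, and only flags whose meaning is described in this post are included.

```python
import argparse


def build_parser():
    """Sketch of a parser for the flags described in this post.

    The dest names and default values are assumptions, not the real script's.
    """
    parser = argparse.ArgumentParser(prog="mturk.py")
    parser.add_argument("-p", dest="price", type=float, default=0.0,
                        help="payment per assignment, in dollars")
    parser.add_argument("-j", dest="assignments", type=int, default=1,
                        help="number of assignments per HIT")
    parser.add_argument("-t", dest="title", help="title text for the HIT")
    parser.add_argument("-D", dest="description", help="longer instructions")
    parser.add_argument("-w", dest="wait", type=int, default=0,
                        help="seconds to wait for results before exiting")
    parser.add_argument("-A", dest="auto_approve", action="store_true",
                        help="auto-approve submitted assignments")
    parser.add_argument("-l", dest="live", action="store_true",
                        help="use the live site instead of the sandbox")
    parser.add_argument("files", nargs="*", help="one HIT is created per file")
    return parser


args = build_parser().parse_args(
    ["-p", ".50", "-j", "10", "-t", "Is there text in this graphic?",
     "-w", "300", "-A", "IsThereText01.jpg", "IsThereText02.jpg"])
print(args.price, args.assignments, args.live, args.files)
```

Leaving live as an opt-in flag is what makes the sandbox the safe default.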

In our example, we used JPEG files, but the input files themselves can be:
  • Blank - You don't want to pass any data, you just want to ask a question with a single answer.
  • Image, Audio, Video - Show the worker a picture, video, or sound and have them return a response.
  • HTML - This gives you full control over the HIT. Any <form> in the HTML file posts its results to Mechanical Turk. The HTML files are hosted on Amazon S3 as an "external question" and appear in the IFRAME of the HIT. (More on this in a future blog post.)
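One plausible way to tell those input kinds apart is by file extension. The sketch below uses Python's standard mimetypes module; the function classify_input and its return labels are hypothetical, and the real script may decide differently.

```python
import mimetypes


def classify_input(path):
    """Map an input path to one of the input kinds listed above.

    Hypothetical helper: returns "blank", "image", "audio", "video",
    "html", or "unknown". Detection is by file extension only.
    """
    if not path:
        return "blank"  # no file: just ask the question
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        return "unknown"
    if mime_type == "text/html":
        return "html"
    top_level = mime_type.split("/")[0]
    if top_level in ("image", "audio", "video"):
        return top_level
    return "unknown"


for name in (None, "albert.jpg", "survey.html"):
    print(name, "->", classify_input(name))
```

Guessing from the extension keeps the command line simple, at the cost of trusting that files are named honestly.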

We've found this really useful - for instance, we've used this script as part of a Makefile to implement a complicated workflow. Let's run through an example.

First, let's check our status:

$ ./mturk.py
You are in test mode.
Funds remaining: [$10,000.00]
There are 0 jobs active.

Note that we see $10,000 because we're using the sandbox. If you had active jobs, you'd see the jobs listed by HIT ID and filename. Running mturk.py in live mode (./mturk.py -l) will show you how much money you really have:

$ ./mturk.py -l
You are in LIVE MODE.
Funds remaining: [$41.38]
There are 0 jobs active.
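The sandbox/live switch boils down to which Mechanical Turk endpoint you talk to. Here's a minimal sketch of that choice; the two hostnames are the legacy endpoints that boto's MTurk support used, while the function names are ours and only illustrative.

```python
# Legacy Mechanical Turk endpoints (the ones boto's MTurk support used).
SANDBOX_HOST = "mechanicalturk.sandbox.amazonaws.com"
LIVE_HOST = "mechanicalturk.amazonaws.com"


def mturk_host(live=False):
    """Pick the endpoint; the sandbox is the default so mistakes are free."""
    return LIVE_HOST if live else SANDBOX_HOST


def mode_banner(live=False):
    """Banner text matching the status output shown above."""
    return "You are in LIVE MODE." if live else "You are in test mode."


for live in (False, True):
    print(mode_banner(live), "->", mturk_host(live))
```

With boto, the chosen host would typically be passed as the host argument when creating the MTurk connection.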


Next, let's submit a job. In this batch, we're using the sandbox (no -l!), and "paying" 4 cents for each assignment (-p .04), with two assignments for each of two graphics to be tagged (-j 2). Submissions are auto-approved 360 seconds after completion (-a 360), and the script waits 400 seconds before exiting (-w 400).

Here's what we run and see on the command line:

$ ./mturk.py -p .04 -j 2 -t "Tag this graphic" -D "Type keywords, separated by spaces, that you think best describe the image. Please enter at least 2 and no more than 5 keywords." -k "voxilate tagging" -d 120 -e 300 -a 360 -w 400 Namazu.jpg SydneyOperaHouse.jpg
You are in test mode.
Uploaded to http://com.voxilate.mturk.s3.amazonaws.com/home/jen/Turkpipe/Namazu.jpg
/home/jen/Turkpipe/Namazu.jpg: Created HIT 936MY1Y9KY4ZTJH4ZXTZ.
Uploaded to http://com.voxilate.mturk.s3.amazonaws.com/home/jen/Turkpipe/SydneyOperaHouse.jpg
/home/jen/Turkpipe/SydneyOperaHouse.jpg: Created HIT M1N2Y04MX34RW21CGY6Z.
/home/jen/Turkpipe/Namazu.jpg: 0/2 assignments completed.
/home/jen/Turkpipe/SydneyOperaHouse.jpg: 0/2 assignments completed.


And here's what we see in the Mechanical Turk Sandbox:



Next, we'll play worker and grab some of these HITs:



And the other:



We'll then see changes at the console, as it polls for completion:

/home/jen/Turkpipe/Namazu.jpg: 1/2 assignments completed.
/home/jen/Turkpipe/SydneyOperaHouse.jpg: 1/2 assignments completed.
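The polling shown above can be sketched as a simple loop with a deadline. This is an illustration rather than the script's code: get_completed stands in for whatever call asks Mechanical Turk how many assignments are done, and every name here is hypothetical.

```python
import time


def wait_for_completion(get_completed, total, timeout, interval=10,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until `total` assignments are done or `timeout` seconds pass.

    get_completed is a caller-supplied function returning the current
    count of completed assignments. Returns True if everything finished
    before the deadline, False otherwise.
    """
    deadline = clock() + timeout
    last_seen = None
    while True:
        done = get_completed()
        if done != last_seen:
            # Only print when the count changes, to keep the console quiet.
            print("%d/%d assignments completed." % (done, total))
            last_seen = done
        if done >= total:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated progress instead of real API calls.
progress = iter([0, 1, 1, 2])
finished = wait_for_completion(lambda: next(progress), total=2,
                               timeout=400, interval=0)
print("finished:", finished)
```

Passing sleep and clock in as parameters is just a convenience so the loop can be exercised without actually waiting.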

Because we requested two assignments per HIT, Mechanical Turk won't let a single worker do *all* of the tasks, so we wait for other workers to jump in and complete them (on the Sandbox, bug a friend or use another account). When everything is complete, the script exits, dumping the following to the console:

/home/jen/workspace/Turkpipe/Namazu.jpg: 2/2 assignments completed.
/home/jen/workspace/Turkpipe/SydneyOperaHouse.jpg: 2/2 assignments completed.
catfish japanese art
japanese catfish painting
sydney australia opera house
Sydney opera hall Australia water

And that's it - Turking at the command line, successful! Add a -l to the command (and an -o output_file to save the results to a file) and you're ready to go live!

9 comments:

  1. Yeah guys where's your code?

  2. Sorry - never updated the blog post! It's at http://code.google.com/p/turkpipe/. Last tested with boto 1.8d.

  3. Thanks for posting this code. Since it targets boto 1.8, I'm running this inside virtualenv. I did have one issue with the creation of an s3 bucket, but the following svn diff takes care of that:

    Index: turkpipe.py
    ===================================================================
    --- turkpipe.py (revision 2)
    +++ turkpipe.py (working copy)
    @@ -45,8 +45,9 @@
     description = None

     s3conn = Connection()
    -bucket = s3conn.get_bucket(bucketname)
    -if not bucket:
    +try:
    +    bucket = s3conn.get_bucket(bucketname)
    +except boto.exception.S3ResponseError:
         print "The S3 bucket '%s' has been created." % (bucketname)
         bucket = s3conn.create_bucket(bucketname)

  4. Couple of updates:
    1. Porting this to boto 2.0 is non-trivial (I ended up just using 1.8d)
    2. The s3 bucket used gives me 'access denied'. Not sure why (don't have any experience w/ s3). Renaming the bucket to something of my own fixed that problem.

  5. Non-trivial's likely an understatement at this point. ;)

    I've been remiss in not adding idm's patch to trunk (thanks so much, idm!) - just added it!

  6. OK, I've finally got it working in 1.8d. I had to change boto.mturk.connection.create_hit to add ('HIT', HIT) to the list of marker_elems (i.e., which parameters boto cares about when parsing the response). That got everything working.

    A couple of other problems: GDMS kind of sucks (I had to remove the temporary files each time it crashed).

    Anyways: now it works, and I'm happy to fix this - should I just submit patches to the google-code project?

    I'll be using this code for a semester-long project I'm just starting, so happy to submit some patches (at least to get 1.8d working).

  7. Hi, Alexey - we can add you as a committer if you'd like! Drop me a line @ jen-at-voxilate.com and we'll get you set up.

  8. Wonderful blog & good post.Its really helpful for me, awaiting for more new post. Keep Blogging!
