Simon Willison’s Weblog

Items tagged amazonwebservices in Apr, 2009


Finding similar items with Amazon Elastic MapReduce, Python, and Hadoop streaming. Tutorial for running Hadoop jobs on Elastic MapReduce using Python and the 2005 Audioscrobbler dataset. # 7th April 2009, 9:19 am

Amazon Elastic MapReduce (via) Hadoop as a service. Basically a web-based GUI around Hadoop: you could roll this yourself on EC2, but for a small markup on regular EC2 prices you get to avoid the extra work of setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded into S3 before firing off the job. # 2nd April 2009, 10:25 am
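
To give a feel for what one of those Python data processing scripts might look like, here is a minimal sketch of a Hadoop streaming style mapper and reducer. Streaming steps read lines on stdin and write tab-separated key/value lines to stdout, with the framework sorting by key between the two steps. The input layout (whitespace-separated `user artist plays` lines) and the function names are assumptions made for illustration, not something taken from the linked tutorial.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit 'artist<TAB>plays' for each well-formed input line.

    Assumed (hypothetical) input layout: 'user artist plays'.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or not parts[2].isdigit():
            continue  # skip malformed records rather than failing the job
        _user, artist, plays = parts
        yield f"{artist}\t{plays}"


def reducer(lines):
    """Sum plays per artist; expects input already sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for artist, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        yield f"{artist}\t{total}"


# Run as 'script.py map' or 'script.py reduce', reading from stdin.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

You can try the same flow locally without Hadoop by sorting the mapper output yourself, for example `list(reducer(sorted(mapper(open("sample.txt")))))`, where `sample.txt` is a hypothetical file in the assumed layout.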