7 items tagged “mapreduce”
BashReduce. Map/Reduce in Bash is no longer a joke project (if it ever was)—Richard Crowley is extending it and using it for analysis at OpenDNS.
28th June 2009, 3:03 pm
Finding similar items with Amazon Elastic MapReduce, Python, and Hadoop streaming. Tutorial for running Hadoop jobs on Elastic MapReduce using Python and the 2005 Audioscrobbler dataset.
7th April 2009, 9:19 am
Amazon Elastic MapReduce. Hadoop as a service. Basically a web-based GUI around Hadoop: you could roll this yourself on EC2, but for a small markup on regular EC2 prices you get to avoid the extra work of setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded into S3 before firing off the job.
2nd April 2009, 10:25 am
Cascading. A Java API abstraction layer over Hadoop that lets developers think in terms of pipes and filters rather than map/reduce. The Cascading developers claim that this model is easier to understand and less error-prone.
1st October 2008, 1:22 pm
Python + Hadoop = Flying Circus Elephant. Last.fm have released Dumbo, a Python module that lets you easily write Hadoop map/reduce tasks using Python and generators.
31st May 2008, 2:14 pm
Writing An Hadoop MapReduce Program In Python. Hadoop (the open source map/reduce framework) can interact with any program that reads from stdin and outputs on stdout—so it’s trivial to drop in Python scripts for the map and reduce steps.
9th October 2007, 11:33 am
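To make the stdin/stdout contract in the entry above concrete, here is a minimal word-count sketch in Python. It is an illustration, not the code from the linked tutorial: the function names (`run_mapper`, `run_reducer`, `simulate`) are hypothetical, and `simulate` stands in for Hadoop by sorting the mapper output locally before handing it to the reducer. It assumes the default streaming convention of one tab-separated key/value pair per line.

```python
import io
from itertools import groupby


def run_mapper(lines):
    """Yield one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def run_reducer(sorted_lines):
    """Sum the counts for each key.

    Assumes the input is already sorted by key, which is what Hadoop
    provides between the map and reduce steps.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def simulate(text):
    """Hypothetical local stand-in for a job: map, sort, then reduce."""
    mapped = run_mapper(io.StringIO(text))
    shuffled = sorted(mapped)  # stands in for Hadoop's sort between steps
    return list(run_reducer(shuffled))
```

In a real streaming job the two functions would live in separate scripts, each reading `sys.stdin` and printing its lines to stdout.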
CouchDB: Thinking beyond the RDBMS. CouchDB is a fascinating project: an Erlang-powered non-relational database with a JSON API that lets you define “views” (really computed tables) based on JavaScript functions that execute using map/reduce. Damien Katz, the main developer, currently works for MySQL and used to work on Lotus Notes.
3rd September 2007, 9:48 am
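The idea of a view as a computed table, as described in the CouchDB entry above, can be sketched in a few lines. CouchDB views are written in JavaScript; the Python below is only an illustration of the map/reduce shape, not CouchDB's API. The names `map_doc`, `reduce_values`, and `build_view` are hypothetical, as is the `tags` field on the sample documents.

```python
from collections import defaultdict


def map_doc(doc):
    """Hypothetical map step: emit a (tag, 1) pair for each tag on a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_values(values):
    """Hypothetical reduce step: collapse all values emitted for one key."""
    return sum(values)


def build_view(docs):
    """Run the map step over every document, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    # Keys are sorted so the result reads like an ordered, computed table.
    return {key: reduce_values(values) for key, values in sorted(grouped.items())}
```

Calling `build_view` on a list of JSON-like dictionaries returns one row per emitted key, which is the sense in which a view behaves like a table derived from the documents.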