Hadoop

Framework for running applications on large clusters built of commodity hardware (scalability)

http://lucene.apache.org/hadoop/

mainly for Java

  • there's an example using Jython

  • there are also several ways to incorporate non-Java code: Hadoop Streaming permits any shell command to be used as a map or reduce function, and Hadoop is also developing C and C++ APIs and a SWIG-compatible pipes API.
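
A rough sketch of what the Streaming option could look like from Python, assuming the usual Streaming convention that a mapper and reducer exchange tab-separated "key<TAB>value" text lines and that the reducer sees its input sorted by key. The function names (map_words, reduce_counts, run_locally) are made up for illustration and are not part of Hadoop; run_locally only imitates the map, sort, reduce sequence in one process so the logic can be checked without a cluster.

```python
from itertools import groupby


def map_words(lines):
    """Mapper step: emit one 'word<TAB>1' line for every word seen."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_counts(sorted_lines):
    """Reducer step: sum the counts of each run of identical keys.

    Assumes the lines arrive sorted by key, as the sort phase between
    map and reduce is expected to guarantee.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Hypothetical helper: imitate map -> sort -> reduce in-process."""
    return list(reduce_counts(sorted(map_words(lines))))
```

In an actual Streaming job each step would presumably live in its own small script that reads sys.stdin and prints its output lines, with the two scripts handed to Hadoop as the map and reduce commands.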

Yahoo is backing it heavily

http://highscalability.com/product-hadoop

http://developer.yahoo.net/blog/archives/2007/07/yahoo-hadoop.html

Bill De Hora: http://www.dehora.net/journal/2008/07/06/3-12-minutes-to-sort-a-terabyte-hadoops-code-structure/

Python use?

