Tuesday, February 7, 2012

Open source technologies for distributed web application development

Apache Hadoop : Reliable, scalable distributed computing
Apache Hive : SQL-like language and metadata repository
Apache Pig : High-level language for expressing data analysis programs
Apache HBase : Hadoop database for random, real-time read/write access
Apache ZooKeeper : Highly reliable distributed coordination service
Apache Flume : Distributed service for collecting and aggregating log and event data
Apache Sqoop : Tool for transferring data between Hadoop and relational databases (RDBMS)
Apache Mahout : Library of machine learning algorithms for Hadoop
Apache Whirr : Library for running Hadoop in the cloud
Apache Oozie : Server-based workflow engine for Hadoop activities
Fuse-DFS : Module within Hadoop for mounting HDFS as a traditional file system
Hue : Browser-based desktop interface for interacting with Hadoop