Distribution Details

Components        CDH2            CDH3            Description
Apache Hadoop     v0.20.1 +169*   v0.20.2 +923*   Reliable, scalable distributed computing
Apache Hive       v0.4.1 +14*     v0.7.1 +42*     SQL-like language and metadata repository
Apache Pig        v0.5.0 +11*     v0.8.1 +28*     High-level language for expressing data analysis programs
Apache HBase      N/A             v0.90.4 +49*    Hadoop database for random, real-time read/write access
Apache Zookeeper  N/A             v3.3.3 +12*     Highly-reliable distributed coordination service
Apache Flume      N/A             v0.9.4 +25*     Distributed service for collecting and aggregating log and event data
Apache Sqoop      N/A             v1.3.0 +5*      Integrating Hadoop with RDBMS
Apache Mahout     N/A             v0.5 +9*        Library of machine learning algorithms for Hadoop
Apache Whirr      N/A             v0.5.0 +4*      Library for running Hadoop in the cloud
Apache Oozie      N/A             v2.3.2 +27*     Server-based workflow engine for Hadoop activities
Fuse-DFS          v0.20.1 +169*   v0.20.2 +923*   Module within Hadoop for mounting HDFS as a traditional file system
Hue               N/A             v1.2.0.0 +114*  Browser-based desktop interface for interacting with Hadoop

Supported Operating Systems  CDH2                                    CDH3
CentOS                       CentOS 5                                CentOS 5, CentOS 6
Debian                       N/A                                     Squeeze, Lenny
Oracle                       N/A                                     Oracle Linux 5.6 with Unbreakable Enterprise Kernel
Red Hat                      RHEL 5                                  RHEL 5, RHEL 6
SUSE                         N/A                                     SLES 11
Ubuntu                       Hardy, Intrepid, Lenny, Jaunty, Karmic  Lucid, Maverick

Supported Build Infrastructure  CDH2  CDH3
Apache Maven                    N/A   Yes

Supported Cloud Platforms  CDH2                              CDH3
Public Cloud               Rackspace, Amazon EC2, SoftLayer  Rackspace, Amazon EC2, SoftLayer

*What’s in a version?

v0.X.X = the underlying upstream release. +XX = patches added (backported) on top of that release.
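
As an illustration of the version notation, the following Python sketch splits a version string into its underlying release and its patch figure. The names CdhVersion and parse_cdh_version are hypothetical and not part of any CDH tooling. The sketch assumes that the number after "+" is a patch count, that the trailing "*" is only the footnote marker, and that "N/A" means the component is not included in that distribution.

```python
import re
from typing import NamedTuple, Optional


class CdhVersion(NamedTuple):
    """Hypothetical container for one version cell from the tables."""
    release: str  # underlying upstream release, e.g. "0.20.2"
    patches: int  # figure after "+", assumed to be a patch count


# "v" + dotted release, then "+" and digits, then an optional footnote "*".
# Whitespace around "+" is optional, since the tables use both styles.
_VERSION_RE = re.compile(
    r"^v(?P<release>\d+(?:\.\d+)*)\s*\+\s*(?P<patches>\d+)\*?$"
)


def parse_cdh_version(text: str) -> Optional[CdhVersion]:
    """Parse a cell such as "v0.20.2 + 923*"; return None for "N/A"."""
    text = text.strip()
    if text == "N/A":
        return None
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return CdhVersion(match.group("release"), int(match.group("patches")))


if __name__ == "__main__":
    for cell in ("v0.20.1 + 169*", "v0.20.2 + 923*", "N/A"):
        print(cell, "->", parse_cdh_version(cell))
```

Read this way, the Apache Hadoop entry for CDH3 is upstream release 0.20.2 with 923 patches on top, and the CDH2 entry is release 0.20.1 with 169.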