CDH 4.4.0

Cloudera’s 100% Open Source Hadoop Platform

CDH is Cloudera's open source software distribution and consists of Apache Hadoop and additional key open source projects to ensure you get the most out of Hadoop and your data.

It is the only Hadoop solution to offer unified querying options (including batch processing, interactive SQL, text search, and machine learning) and necessary enterprise security features (such as role-based access controls).

Please note: CDH requires manual installation from the command line.
For a faster, automated installation, download Cloudera Manager.

CDH Packaging Information

Each CDH release series is made up of a collection of CDH project packages that are known to work together. The package version numbers of the CDH projects in each CDH release are listed in the following table.


CDH Version 4.4.0 Packaging

To view the overall release notes for CDH Version 4.4.0 (CDH4.4.0), click here.

CDH4 Project                Package Version                   Tarball Version                  Release Notes  Changes File
--------------------------  --------------------------------  -------------------------------  -------------  ------------
Apache Hadoop 2.0           hadoop-2.0.0+1475                 hadoop-2.0.0-cdh4.4.0.tar.gz     here           here
DataFu                      pig-udf-datafu-0.0.4+22           datafu-0.0.4-cdh4.4.0.tar.gz     here           here
Apache Flume (1)            flume-ng-1.4.0+23                 flume-ng-1.4.0-cdh4.4.0.tar.gz   here           here
Apache HBase                hbase-0.94.6+132                  hbase-0.94.6-cdh4.4.0.tar.gz     here           here
Apache Hive                 hive-0.10.0+198                   hive-0.10.0-cdh4.4.0.tar.gz      here           here
Apache HCatalog             hcatalog-0.5.0+13                 hcatalog-0.5.0-cdh4.4.0.tar.gz   here           here
Hue                         hue-2.5.0+139                     hue-2.5.0-cdh4.4.0.tar.gz        here           here
Apache Mahout               mahout-0.7+21                     mahout-0.7-cdh4.4.0.tar.gz       here           here
Apache MapReduce (MRv1)     hadoop-0.20-mapreduce-2.0.0+1475  (none)                           (none)         (none)
Apache Oozie                oozie-3.3.2+92                    oozie-3.3.2-cdh4.4.0.tar.gz      here           here
Apache Sentry (incubating)  sentry-1.1.0                      sentry-1.1.0-cdh4.4.0.tar.gz     here           here
Apache Pig                  pig-0.11.0+33                     pig-0.11.0-cdh4.4.0.tar.gz       here           here
Apache Sqoop                sqoop-1.4.3+62                    sqoop-1.4.3-cdh4.4.0.tar.gz      here           here
Apache Sqoop 2              sqoop2-1.99.2+85                  sqoop2-1.99.2-cdh4.4.0.tar.gz    here           here
Apache Whirr                whirr-0.8.2+15                    whirr-0.8.2-cdh4.4.0.tar.gz      here           here
Apache ZooKeeper            zookeeper-3.4.5+23                zookeeper-3.4.5-cdh4.4.0.tar.gz  here           here

(1) Flume 0.9.4 is not included in the CDH4 distribution. It has been replaced by Flume 1.x, and you are encouraged to move to the new version. However, for a limited time, a CDH4-compatible Flume 0.9.4 package is available:

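CDH4 Project                Package Version                   Tarball Version                  Release Notes  Changes File
--------------------------  --------------------------------  -------------------------------  -------------  ------------
Apache Flume                flume-0.9.4+25.52                 flume-0.9.4-cdh4.1.3.tar.gz      here           here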
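
The package and tarball names listed above follow a regular pattern, so they can be split into their parts when checking an installation. The following is a minimal sketch, not a Cloudera tool: the function and type names are hypothetical, and because this page does not explain the "+" suffix (for example "+1475"), the code keeps it as an opaque string.

```python
import re
from typing import NamedTuple, Optional


class PackageVersion(NamedTuple):
    name: str              # e.g. "hadoop"
    upstream: str          # e.g. "2.0.0"
    suffix: Optional[str]  # text after "+", e.g. "1475"; None if absent


class Tarball(NamedTuple):
    name: str      # e.g. "hadoop"
    upstream: str  # e.g. "2.0.0"
    cdh: str       # e.g. "4.4.0"


# "<name>-<upstream version>" with an optional "+<suffix>".
_PACKAGE_RE = re.compile(
    r"^(?P<name>.+)-(?P<upstream>\d+(?:\.\d+)*)(?:\+(?P<suffix>[\d.]+))?$"
)

# "<name>-<upstream version>-cdh<CDH release>.tar.gz".
_TARBALL_RE = re.compile(
    r"^(?P<name>.+)-(?P<upstream>\d+(?:\.\d+)*)-cdh(?P<cdh>\d+(?:\.\d+)*)\.tar\.gz$"
)


def parse_package(text: str) -> PackageVersion:
    """Split a package version string such as "hadoop-2.0.0+1475"."""
    match = _PACKAGE_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized package version: %r" % text)
    return PackageVersion(match["name"], match["upstream"], match["suffix"])


def parse_tarball(text: str) -> Tarball:
    """Split a tarball name such as "hadoop-2.0.0-cdh4.4.0.tar.gz"."""
    match = _TARBALL_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized tarball name: %r" % text)
    return Tarball(match["name"], match["upstream"], match["cdh"])


def same_upstream(package: str, tarball: str) -> bool:
    """Return True if both names carry the same upstream version."""
    return parse_package(package).upstream == parse_tarball(tarball).upstream
```

Only the upstream version is compared, because the project name can differ between the two columns (the DataFu package is named pig-udf-datafu while its tarball is named datafu).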

What's New in CDH4.4.0

Oracle JDK 7 Support

With Cloudera Manager 4.7 and CDH4.4, Cloudera now supports users running applications compiled with Oracle JDK 7 (JDK 1.7), with the following restrictions:

  • All CDH components must run on the same major JDK version (that is, all deployed on JDK 6 or all deployed on JDK 7). For example, you cannot run Hadoop on JDK 6 while running Sqoop on JDK 7.
  • All nodes in the cluster must run the same major JDK version. Cloudera does not support mixed environments (some nodes on JDK 6 and others on JDK 7).

To make sure everything works correctly, create a symbolic link that points to the directory where you installed the JDK: /usr/java/default on Red Hat and similar systems, or /usr/lib/jvm/default-java on Ubuntu and Debian systems.
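
The two restrictions and the symbolic link step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a Cloudera utility: the function names and dictionary keys are hypothetical, and the version strings are assumed to look like the output of java -version (for example "1.7.0_25").

```python
import os
import re

# Link locations taken from the text above; the dictionary keys are
# hypothetical labels chosen for this sketch.
DEFAULT_JDK_LINKS = {
    "redhat": "/usr/java/default",
    "debian": "/usr/lib/jvm/default-java",
}


def jdk_major(version):
    """Return the major JDK number, e.g. 7 for "1.7.0_25" or "7"."""
    match = re.match(r"(?:1\.)?(\d+)", version.strip())
    if match is None:
        raise ValueError("unrecognized JDK version: %r" % version)
    return int(match.group(1))


def group_by_jdk_major(versions):
    """Map each major JDK number to the sorted names that report it.

    `versions` maps a node or component name to its JDK version string.
    More than one key in the result means a mixed (unsupported) setup.
    """
    groups = {}
    for name, version in versions.items():
        groups.setdefault(jdk_major(version), []).append(name)
    return {major: sorted(names) for major, names in groups.items()}


def link_default_jdk(jdk_home, link_path):
    """Make link_path a symbolic link that points to jdk_home."""
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a link left by an earlier JDK
    elif os.path.exists(link_path):
        raise FileExistsError("%s exists and is not a symbolic link" % link_path)
    os.makedirs(os.path.dirname(link_path), exist_ok=True)
    os.symlink(jdk_home, link_path)
```

On a real host, a call such as link_default_jdk("/usr/java/jdk1.7.0_25", DEFAULT_JDK_LINKS["redhat"]) would need root privileges; the JDK directory in that call is only an example path.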

Apache Flume

  • The new Morphline sink provides a heavyweight ETL (Extract, Transform, Load) framework using Cloudera Morphlines, and can write events out to Apache Solr (FLUME-2070).
  • The File Channel Integrity tool can now verify the integrity of individual events in the File Channel and remove corrupt events (FLUME-1586).
  • Communication between Avro Sink and Source can be secured using SSL (FLUME-997).
  • The File Channel now performs group commits, which can improve performance in some cases (FLUME-1917).

Apache Hive

As of CDH4.4.0, HiveServer2 supports secure impersonation for JDBC clients and BeeLine. See Using Kerberos Authentication with HiveServer2. See also Apache Sentry (incubating).

Hive also includes the following improvements:

Hue

  • Search application: You can search Solr, Solr Cloud, and Cloudera Search, and customize the results with your own style and facets. Multiple indexes are supported, as well as query highlighting and sorting.
  • HBase Browser: The HBase Browser application allows you to quickly browse huge tables and access any content. You can also create new tables, add data, modify existing cells, and filter data with the auto-completing search bar.
  • Sqoop2 application: The Sqoop2 application allows you to import and export data between databases and HDFS easily and in a scalable way. The Job Wizard hides the complexity of creating Sqoop jobs, and the dashboard provides a live progress indicator and log access.

Hue also includes the following improvements:

Apache Oozie

As of CDH4.4, the Oozie Hive action can be configured to work with HiveServer2 using BeeLine. For more information, see the Hive Action documentation.

Apache Sentry (incubating)

CDH4.4 includes Sentry, which enables role-based, fine-grained authorization for HiveServer2 and Cloudera Impala. It provides classic database-style authorization for Hive and Impala.

Apache Sqoop

As of CDH4.4, Sqoop provides integration with HCatalog. See the section on HCatalog in the Sqoop User Guide.

CDH 4.x Requirements and Supported Versions

Supported Operating Systems

CDH4 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

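Operating System                                   Version                                    Packages
-------------------------------------------------  -----------------------------------------  --------------
Red Hat compatible
  Red Hat Enterprise Linux (RHEL)                  5.7                                        64-bit
  Red Hat Enterprise Linux (RHEL)                  6.2                                        64-bit, 32-bit
  Red Hat Enterprise Linux (RHEL)                  6.4                                        64-bit
  CentOS                                           5.7                                        64-bit
  CentOS                                           6.2                                        64-bit, 32-bit
  CentOS                                           6.4                                        64-bit
  Oracle Linux with Unbreakable Enterprise Kernel  5.6                                        64-bit
  Oracle Linux with Unbreakable Enterprise Kernel  6.4                                        64-bit
SLES
  SUSE Linux Enterprise Server (SLES)              11 with Service Pack 1 or later            64-bit
Ubuntu/Debian
  Ubuntu                                           Lucid (10.04) - Long-Term Support (LTS)    64-bit
  Ubuntu                                           Precise (12.04) - Long-Term Support (LTS)  64-bit
  Debian                                           Squeeze (6.0.3)                            64-bit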

  Note:
  • For production environments, 64-bit packages are recommended. Except as noted above, CDH4 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera's packages, you can also download source tarballs from Downloads.
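
As a rough illustration, the list of supported operating systems above can be encoded as a lookup table for a pre-installation check. This is a sketch only: the short keys ("rhel", "oracle", and so on) and the function name are hypothetical, and the SLES entry does not verify the service pack level.

```python
# Supported (operating system, version) pairs and their package
# architectures, transcribed from the list above. The short keys are
# hypothetical labels for this sketch.
SUPPORTED_PACKAGES = {
    ("rhel", "5.7"): {"64-bit"},
    ("rhel", "6.2"): {"64-bit", "32-bit"},
    ("rhel", "6.4"): {"64-bit"},
    ("centos", "5.7"): {"64-bit"},
    ("centos", "6.2"): {"64-bit", "32-bit"},
    ("centos", "6.4"): {"64-bit"},
    ("oracle", "5.6"): {"64-bit"},
    ("oracle", "6.4"): {"64-bit"},
    # Service Pack 1 or later is required; that is not checked here.
    ("sles", "11"): {"64-bit"},
    ("ubuntu", "10.04"): {"64-bit"},
    ("ubuntu", "12.04"): {"64-bit"},
    ("debian", "6.0.3"): {"64-bit"},
}


def is_supported(os_name, version, arch="64-bit"):
    """Return True if CDH4 packages are listed for this combination."""
    archs = SUPPORTED_PACKAGES.get((os_name.lower(), version), set())
    return arch in archs
```

For example, is_supported("rhel", "6.2", "32-bit") is true, while the same call for version 6.4 is false because only 64-bit packages are listed for that release.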

Supported Databases

Supported JDK versions

Supported Internet Protocol