CDH 4.7.0

Cloudera’s 100% Open Source Hadoop Platform

CDH is Cloudera's open source software distribution and consists of Apache Hadoop and additional key open source projects to ensure you get the most out of Hadoop and your data.

It is the only Hadoop solution to offer unified querying options (including batch processing, interactive SQL, text search, and machine learning) and necessary enterprise security features (such as role-based access controls).

Please note: CDH requires manual installation from the command line.
For a faster, automated installation, download Cloudera Manager.

CDH Packaging and Tarball Information

Each CDH release series is made up of a collection of CDH project packages that are known to work together. The package version numbers of the CDH projects in each CDH release are listed in the following table.
  Note: To see the details of all the changes and bug fixes for a given component in a given release, read the Changes file as well as the Release Notes, following the links in the table below.

CDH Version 4.7.0 Packaging and Tarballs

For an overview of the release as a whole, see the overall release notes for CDH 4.7.0.

Component                   Package Version           Tarball   Release Notes   Changes File
--------------------------  ------------------------  --------  --------------  ------------
DataFu                      pig-udf-datafu-0.0.4+11   Tarball   Release notes   Changes
Apache Flume                flume-ng-1.4.0+97         Tarball   Release notes   Changes
Apache Hadoop               hadoop-2.0.0+1603         Tarball   Release notes   Changes
Apache HBase                hbase-0.94.15+113         Tarball   Release notes   Changes
Apache HCatalog             hcatalog-0.5.0+13         Tarball   Release notes   Changes
Apache Hive                 hive-0.10.0+258           Tarball   Release notes   Changes
Hue                         hue-2.5.0+240             Tarball   Release notes   Changes
Apache Mahout               mahout-0.7+15             Tarball   Release notes   Changes
Apache Oozie                oozie-3.3.2+102           Tarball   Release notes   Changes
Parquet                     parquet-1.2.5+71          Tarball   Release notes   Changes
Parquet-format              parquet-format-1.0.0+4    Tarball   Release notes   Changes
Apache Pig                  pig-0.11.0+43             Tarball   Release notes   Changes
Apache Sentry (incubating)  sentry-1.1.0+22           Tarball   Release notes   Changes
Apache Sqoop                sqoop-1.4.3+94            Tarball   Release notes   Changes
Apache Sqoop2               sqoop2-1.99.2+99          Tarball   Release notes   Changes
Apache Whirr                whirr-0.8.2+15            Tarball   Release notes   Changes
Apache ZooKeeper            zookeeper-3.4.5+25        Tarball   Release notes   Changes
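
The package version strings listed above (for example, hadoop-2.0.0+1603) can be split into their parts when comparing releases in a script. The following is a minimal Python sketch, not a Cloudera tool. It assumes that the string has the form name-upstream_version+number and that the number after the plus sign is a Cloudera-specific patch level on top of the upstream release. The names PackageVersion and parse_package_version are hypothetical.

```python
import re
from typing import NamedTuple


class PackageVersion(NamedTuple):
    name: str         # component package name, e.g. "hadoop"
    upstream: str     # upstream release the package is based on, e.g. "2.0.0"
    patch_level: int  # number after "+", assumed to be a patch level


# A name, a hyphen, a dotted numeric version, a plus sign, and an integer.
# The name itself may contain hyphens (as in "pig-udf-datafu").
_PATTERN = re.compile(r"^(?P<name>.+)-(?P<upstream>\d+(?:\.\d+)*)\+(?P<patch>\d+)$")


def parse_package_version(text: str) -> PackageVersion:
    """Split a string such as 'hadoop-2.0.0+1603' into its three parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized package version: {text!r}")
    return PackageVersion(match["name"], match["upstream"], int(match["patch"]))
```

For example, parse_package_version("hbase-0.94.15+113").upstream gives the upstream HBase release that the CDH package is based on.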

What's New in CDH 4.7.0

This is a maintenance release that fixes bugs and provides a new Hue capability.

Hue

Hue now supports Active Directory, which allows nested groups. Before CDH4.7, Hue supported only LDAP.

Bug Fixes

This release incorporates a number of upstream bug fixes, including the following.

Flume
  • FLUME-2357 - HDFS sink should retry closing files that previously had close errors
Hadoop
HBase
  • HBASE-10514 - Forward port HBASE-10466, possible data loss when failed flushes
  • HBASE-10257 - Master aborts due to assignment race
  • HBASE-8912 - AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
Hive
  • HIVE-6005 - BETWEEN is broken after using KRYO
  • HIVE-4222 - Timestamp type constants cannot be deserialized in JDK 1.6 or less
  • HIVE-5380 - Non-default OI constructors should be supported for backwards compatibility
  • HIVE-5263 - Query Plan cloning time could be improved by using Kryo
Hue
  • HUE-1962 - Support int row key from Hive table
  • HUE-2060 - LDAP import commands carry incorrect import statements
  • HUE-1992 - Username to lowercase switch for RemoteUserDjangoBackend
  • HUE-1873 - Result data not HTML encoded
  • HUE-1897 - Workflow ids have double trailing slashes
HDFS
  • HDFS-6289 - HA failover can fail if there are pending DN messages for DNs which no longer exist
  • HDFS-5944 - LeaseManager:findLeaseWithPrefixPath can't handle path like /a/b/ right and cause SecondaryNameNode failed do checkpoint
  • HDFS-6191 - Disable quota checks when replaying edit log
  • HDFS-4943 - WebHdfsFileSystem does not work when original file path has encoded chars
  • HDFS-5064 - Standby checkpoints should not block concurrent readers
  • HDFS-6160 - TestSafeMode occasionally fails
  • HDFS-5496 - Make replication queue initialization asynchronous
  • HDFS-5438 - Flaws in block report processing can cause data loss
  • HDFS-5255 - Distcp job fails with hsftp when https is enabled in insecure cluster
  • HDFS-5074 - Allow starting up from an fsimage checkpoint in the middle of a segment
  • HDFS-4879 - Add "blocked ArrayList" collection to avoid CMS full GCs
MapReduce
  • MAPREDUCE-5877 - Inconsistency between JT/TT for tasks taking a long time to launch
Oozie
  • OOZIE-1699 - Some of the commands submitted to Oozie internal queue are never executed

CDH 4.7.0 Requirements and Supported Versions

Supported Operating Systems

CDH4 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                      Version                                    Packages

Red Hat compatible
Red Hat Enterprise Linux (RHEL)       5.7                                        64-bit
                                      6.2                                        64-bit, 32-bit
                                      6.4                                        64-bit
CentOS                                5.7                                        64-bit
                                      6.2                                        64-bit, 32-bit
                                      6.4                                        64-bit
Oracle Linux with default kernel      5.6                                        64-bit
  and Unbreakable Enterprise Kernel
                                      6.4                                        64-bit

SLES
SUSE Linux Enterprise Server (SLES)   11 with Service Pack 1 or later            64-bit

Ubuntu/Debian
Ubuntu                                Lucid (10.04) - Long-Term Support (LTS)    64-bit
                                      Precise (12.04) - Long-Term Support (LTS)  64-bit
Debian                                Squeeze (6.0.3)                            64-bit

  Note:
  • For production environments, 64-bit packages are recommended. Except as noted above, CDH4 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera's packages, you can also download source tarballs from Downloads.

Supported Databases

Component  MySQL        SQLite    PostgreSQL   Oracle        Derby
Oozie      5.5          –         8.4          10.2, 11gR2   Default
Flume      –            –         –            –             Default (for the JDBC Channel only)
Hue        5.5          Default   8.4          11gR2         –
Hive       5.5          –         8.4          10.2, 11gR2   Default
Sqoop      See Note 2   –         See Note 2   See Note 2    –
Sqoop 2    See Note 3   –         See Note 3   See Note 3    Default

Notes

  1. Cloudera's recommendations are:
    • For Red Hat and similar systems:
      • Use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries on Red Hat 5 and similar systems.
      • Use MySQL server version 5.1 (or higher) and version 5.1 client shared libraries on Red Hat 6 and similar systems.

      If you use a higher server version than recommended here (for example, if you use 5.5) make sure you install the corresponding client libraries.

    • For SLES systems, use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries.
    • For Ubuntu systems:
      • Use MySQL server version 5.1 (or higher) and version 5.0 client shared libraries on Lucid (10.04).
      • Use MySQL server version 5.5 (or higher) and version 5.0 client shared libraries on Precise (12.04).
  2. For connectivity purposes only, Sqoop supports MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, Teradata 13.1, and Netezza TwinFin 5.0. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  3. Sqoop 2 can transport data to and from MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, and Microsoft SQL Server 2012. The Sqoop 2 repository is supported only on Derby.

Supported JDK versions

CDH4 is supported with Oracle JDK.
  Important: JDK 1.7
As of Cloudera Manager 4.7 and CDH4.4, Cloudera supports users running applications compiled with Oracle JDK 7 (JDK 1.7), with the following restrictions:
  • All CDH cluster nodes and services must be running a supported JDK 7 version.
  • All CDH cluster nodes and services must be running the same major version (that is, all deployed on JDK 6 or all deployed on JDK 7). For example, you cannot run Hadoop on JDK 6 while running Sqoop on JDK 7.
To make sure everything works correctly, symbolically link the directory where you install the JDK to /usr/java/default on Red Hat and similar systems, or to /usr/lib/jvm/default-java on Ubuntu and Debian systems.
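
The symbolic link described above can be created with a few lines of Python. This is a minimal sketch for a Linux host, not part of CDH. The function name link_default_jdk is hypothetical, and the JDK installation directory in the usage note is only an example; substitute the directory where your JDK is actually installed.

```python
import os


def link_default_jdk(jdk_home: str, link_path: str) -> None:
    """Make link_path a symbolic link to the JDK installed at jdk_home."""
    if not os.path.isdir(jdk_home):
        raise FileNotFoundError(f"JDK directory not found: {jdk_home}")
    if os.path.islink(link_path):
        # Replace a stale link left over from an earlier JDK installation.
        os.remove(link_path)
    elif os.path.exists(link_path):
        # Refuse to overwrite a real file or directory.
        raise FileExistsError(f"{link_path} exists and is not a symbolic link")
    os.symlink(jdk_home, link_path)
```

On a Red Hat compatible system, a call might look like link_default_jdk("/usr/java/jdk1.7.0_55", "/usr/java/default"), where the first path is an assumed install location. On Ubuntu and Debian systems the second argument would be /usr/lib/jvm/default-java. Writing under /usr normally requires root privileges.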
  • For JDK 1.6, CDH4 is certified with 1.6.0_31. The minimum supported version is 1.6.0_8.
  • For JDK 1.7, CDH4.4 and later are certified with 1.7.0_15. CDH4.5 and later are certified with 1.7.0_55.
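
Two of the rules above lend themselves to a quick scripted check: every node must run the same major JDK version, and a JDK 1.6 installation must be at least 1.6.0_8. The following Python sketch illustrates one way to check a list of collected version strings. It assumes versions are reported in the form 1.6.0_31; the function names and the host names in the usage note are hypothetical, and the sketch does not collect versions from a cluster itself.

```python
import re
from typing import Dict, List, Tuple

MIN_JDK6 = (1, 6, 0, 8)  # minimum supported JDK 1.6 version, 1.6.0_8


def parse_jdk_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.6.0_31' into the tuple (1, 6, 0, 31)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version: {text!r}")
    return tuple(int(part) for part in match.groups())


def check_cluster_jdks(versions_by_node: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    parsed = {node: parse_jdk_version(v) for node, v in versions_by_node.items()}

    # All nodes must be on JDK 6 or all on JDK 7, never a mix.
    majors = {version[:2] for version in parsed.values()}
    if len(majors) > 1:
        problems.append("nodes run different major JDK versions")

    # JDK 1.6 installations must not be older than the stated minimum.
    for node, version in sorted(parsed.items()):
        if version[:2] == (1, 6) and version < MIN_JDK6:
            problems.append(f"{node}: JDK {versions_by_node[node]} is below 1.6.0_8")
    return problems
```

For example, check_cluster_jdks({"node1": "1.6.0_31", "node2": "1.7.0_55"}) reports the mixed major versions, while a cluster that runs 1.6.0_31 everywhere produces an empty list.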

Supported Internet Protocol

CDH requires IPv4. IPv6 is not supported.