CDH 5.0.0

Cloudera’s 100% Open Source Hadoop Platform

CDH 5.0.0 Packaging and Tarballs

To view the overall release notes for CDH 5, see CDH 5 Release Notes.

| Component | Package Version | Tarball | Release Notes | Changes File |
| --- | --- | --- | --- | --- |
| Apache Avro | avro-1.7.5+cdh5.0.0+16 | Tarball | Release notes | Changes |
| Apache Crunch | crunch-0.9.0+cdh5.0.0+21 | Tarball | Release notes | Changes |
| DataFu | pig-udf-datafu-1.1.0+cdh5.0.0+7 | Tarball | Release notes | Changes |
| Apache Flume | flume-ng-1.4.0+cdh5.0.0+111 | Tarball | Release notes | Changes |
| Apache Hadoop | hadoop-2.3.0+cdh5.0.0+548 | Tarball | Release notes | Changes |
| Apache HBase | hbase-0.96.1.1+cdh5.0.0+60 | Tarball | Release notes | Changes |
| HBase-Solr | hbase-solr-1.3+cdh5.0.0+38 | Tarball | Release notes | Changes |
| Apache Hive | hive-0.12.0+cdh5.0.0+308 | Tarball | Release notes | Changes |
| Hue | hue-3.5.0+cdh5.0.0+365 | Tarball | Release notes | Changes |
| Cloudera Impala | impala-1.3.0+cdh5.0.0+0 | (none) | Release notes | Changes |
| Kite SDK | kite-0.10.0+cdh5.0.0+79 | Tarball | Release notes | Changes |
| Llama | llama-1.0.0+cdh5.0.0+0 | Tarball | Release notes | Changes |
| Apache Mahout | mahout-0.8+cdh5.0.0+27 | Tarball | Release notes | Changes |
| Apache Oozie | oozie-4.0.0+cdh5.0.0+174 | Tarball | Release notes | Changes |
| Parquet | parquet-1.2.5+cdh5.0.0+91 | Tarball | Release notes | Changes |
| Parquet-format | parquet-format-1.0.0+cdh5.0.0+3 | Tarball | Release notes | Changes |
| Apache Pig | pig-0.12.0+cdh5.0.0+27 | Tarball | Release notes | Changes |
| Cloudera Search | search-1.0.0+cdh5.0.0+0 | Tarball | Release notes | Changes |
| Apache Sentry (incubating) | sentry-1.2.0+cdh5.0.0+71 | Tarball | Release notes | Changes |
| Apache Solr | solr-4.4.0+cdh5.0.0+178 | Tarball | Release notes | Changes |
| Apache Spark | spark-0.9.0+cdh5.0.0+31 | Tarball | Release notes | Changes |
| Apache Sqoop | sqoop-1.4.4+cdh5.0.0+43 | Tarball | Release notes | Changes |
| Apache Sqoop2 | sqoop2-1.99.3+cdh5.0.0+26 | Tarball | Release notes | Changes |
| Apache Whirr | whirr-0.9.0+cdh5.0.0+4 | Tarball | Release notes | Changes |
| Apache ZooKeeper | zookeeper-3.4.5+cdh5.0.0+28 | Tarball | Release notes | Changes |
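
The package version strings above all follow the same visible shape: a component name, an upstream version, a `+cdh5.0.0` marker, and a trailing number. As a minimal sketch, the Python snippet below splits such a string into its parts. The function name `parse_package_version` and the field names are hypothetical, and treating the trailing number as a generic counter is an assumption; this page does not define what it measures.

```python
import re
from typing import NamedTuple


class PackageVersion(NamedTuple):
    component: str   # e.g. "hadoop" or "pig-udf-datafu"
    upstream: str    # upstream project version, e.g. "2.3.0"
    cdh: str         # CDH release the package belongs to, e.g. "5.0.0"
    counter: int     # trailing number; its meaning is an assumption here


# component - upstream version + "cdh" release + trailing number
_PATTERN = re.compile(
    r"^(?P<component>.+)-(?P<upstream>\d[^+]*)\+cdh(?P<cdh>[^+]+)\+(?P<counter>\d+)$"
)


def parse_package_version(text: str) -> PackageVersion:
    """Split a string such as 'hadoop-2.3.0+cdh5.0.0+548' into its parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized package version: {text!r}")
    return PackageVersion(
        component=match["component"],
        upstream=match["upstream"],
        cdh=match["cdh"],
        counter=int(match["counter"]),
    )


if __name__ == "__main__":
    print(parse_package_version("hadoop-2.3.0+cdh5.0.0+548"))
    print(parse_package_version("pig-udf-datafu-1.1.0+cdh5.0.0+7"))
```

Because the component name may itself contain hyphens (for example `hbase-solr`), the pattern anchors the upstream version on the last hyphen that is followed by a digit.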

What's New in CDH 5.0.0

The following topics describe new features introduced in CDH 5.0.0.


Apache Hadoop

HDFS

New Features:
  • HDFS-5776 - Hedged reads in HDFS for improved HBase MTTR.
  • HDFS-4685 - Implementation of extended file access control lists in HDFS.
Notable Bug Fixes:
  • HDFS-5339 - WebHDFS URI does not accept logical nameservices when security is enabled.
  • HDFS-5898 - Allow NFS gateway to login/relogin from its Kerberos keytab.
  • HDFS-5921 - "Browse filesystem" on the Namenode UI doesn't work if any directory has the sticky bit set.
  • HDFS and Hive replication between different Kerberos realms now works.
  • HDFS-5922 - DataNode heartbeat thread can get stuck in a tight loop.
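
The hedged reads feature listed above (HDFS-5776) is easier to picture with a small simulation. The general idea is that a client starts a read against one replica and, if nothing comes back within a threshold, starts a second read against another replica and uses whichever finishes first. The Python sketch below imitates that pattern with a thread pool. It is not HDFS client code: `hedged_read`, the replica names, and the timing values are all hypothetical. In HDFS itself the behavior is governed by client settings such as `dfs.client.hedged.read.threadpool.size` and `dfs.client.hedged.read.threshold.millis`.

```python
import concurrent.futures as cf
import time


def hedged_read(replicas, threshold_s, pool):
    """Return (replica_name, data) from the first replica read that succeeds.

    replicas:    list of (name, read_fn) pairs in preference order (hypothetical).
    threshold_s: seconds to wait on outstanding reads before starting another.
    pool:        a concurrent.futures executor used to run the reads.
    """
    remaining = list(replicas)
    pending = set()
    names = {}
    while remaining or pending:
        if remaining:
            # Start a read against the next replica in preference order.
            name, read_fn = remaining.pop(0)
            future = pool.submit(read_fn)
            names[future] = name
            pending.add(future)
        # Wait only up to the threshold while another replica is still available.
        timeout = threshold_s if remaining else None
        done, pending = cf.wait(pending, timeout=timeout,
                                return_when=cf.FIRST_COMPLETED)
        for future in done:
            if future.exception() is None:
                return names[future], future.result()
        # Any reads in `done` failed; loop to try the next replica or keep waiting.
    raise IOError("all replica reads failed")


if __name__ == "__main__":
    def slow_replica():
        time.sleep(0.5)          # simulates a struggling DataNode
        return b"block data"

    def fast_replica():
        return b"block data"

    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        winner, _ = hedged_read(
            [("dn1", slow_replica), ("dn2", fast_replica)], 0.05, pool)
    print("served by", winner)
```

The trade-off is extra read load in exchange for lower tail latency, which is why the threshold matters: a hedge is started only when the first read is already slow.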

MapReduce & YARN

New Feature:
  • FairScheduler supports moving running applications between queues.
Notable Bug Fixes:
  • Several critical fixes to stabilize ResourceManager HA - Web UI, unmanaged ApplicationMasters and secure-cluster support.
  • Support for large values of mapreduce.task.io.sort.mb.
  • JobHistory Server has information on failed MapReduce jobs.

Apache HBase

New Features:
  • HBASE-10436 - Restore RegionServer lists removed from HBase 0.96.0 JMX.

    Many of the metrics exposed in CDH 4/0.94 were removed when metrics were refactored in CDH 5/0.96. This patch restores the lists of live and dead RegionServers. In 0.94 this was a large nested structure, as shown below, which included the RegionServer lists and metrics from each region.

    {
        "name" : "hadoop:service=Master,name=Master",
        "modelerType" : "org.apache.hadoop.hbase.master.MXBeanImpl",
        "ZookeeperQuorum" : "localhost:2181",
        ....
        "RegionsInTransition" : [ ],
        "RegionServers" : [ {
            "key" : "localhost,48346,1390857257246",
            "value" : {
                "load" : 2,
        ....

    CDH 5 Beta 1 and Beta 2 did not contain this list; they displayed only the counts of live and dead RegionServers. As of CDH 5.0.0, the list is presented in a semicolon-separated field as follows:

    {
        "name" : "Hadoop:service=HBase,name=Master,sub=Server",
        "modelerType" : "Master,sub=Server",
        "tag.Context" : "master",
        "tag.liveRegionServers" : "localhost,56196,1391992019130",
        "tag.deadRegionServers" : "localhost,40010,1391035309673;localhost,41408,1391990380724;localhost,38682,1390950017735",
        ...
    }
  • Assorted usability and compatibility improvements as well as improvements to exporting snapshots.
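
For monitoring scripts that consume the restored JMX fields (HBASE-10436), the semicolon-separated values have to be split back into individual servers. The Python sketch below shows one way to do that, using the sample values from the JSON above. It assumes each entry has the form host,port,start code, as the samples suggest; the names `ServerName` and `parse_region_servers` are hypothetical helpers, not part of HBase.

```python
from typing import List, NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    start_code: int  # assumed to be the third comma-separated value


def parse_region_servers(field: str) -> List[ServerName]:
    """Split a 'host,port,code;host,port,code' field into ServerName tuples."""
    servers = []
    for entry in field.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty field or a trailing semicolon
        host, port, start_code = entry.split(",")
        servers.append(ServerName(host, int(port), int(start_code)))
    return servers


if __name__ == "__main__":
    # Sample bean content copied from the CDH 5.0.0 example above.
    bean = {
        "tag.liveRegionServers": "localhost,56196,1391992019130",
        "tag.deadRegionServers": (
            "localhost,40010,1391035309673;"
            "localhost,41408,1391990380724;"
            "localhost,38682,1390950017735"
        ),
    }
    live = parse_region_servers(bean["tag.liveRegionServers"])
    dead = parse_region_servers(bean["tag.deadRegionServers"])
    print(len(live), "live;", len(dead), "dead")
```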

Apache Flume

New Feature:
  • The HBase Sink now supports coalescing multiple Increment RPCs into one (FLUME-2338).
Changed Behavior:
  • File Channel Write timeout has been removed and the configuration parameter is now ignored (FLUME-2307).
  • Syslog UDP source can now accept larger messages (FLUME-2130).
  • AsyncHBase Sink is now fully functional (FLUME-2334).
  • Use standard lookup to find queue/topic in JMS Source (FLUME-2311).
Notable Bug Fixes:
  • Deadlock fixed in Dataset sink (FLUME-2320).
  • FileChannel Dual Checkpoint Backup Thread is now released on application stop (FLUME-2328).
  • Spool Dir source now checks interrupt flag before writing to channel (FLUME-2283).
  • Morphline sink increments eventDrainAttemptCount when it takes event from channel (FLUME-2323).
  • BucketWriter is now permanently closed only on idle and roll timeouts (FLUME-2325).
  • BucketWriter#close now cancels idleFuture (FLUME-2305).
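
The increment coalescing mentioned under New Feature above (FLUME-2338) can be sketched in a few lines: several increments that target the same cell within a batch are summed so that only one call per cell needs to be sent. The Python code below illustrates the idea only; it is not Flume's implementation, and the function name `coalesce_increments` and the `(row, column, amount)` tuple layout are hypothetical.

```python
from collections import OrderedDict


def coalesce_increments(increments):
    """Merge increments that target the same (row, column) into one.

    increments: iterable of (row, column, amount) tuples (hypothetical layout).
    Returns a list with one tuple per distinct cell, in first-seen order.
    """
    totals = OrderedDict()
    for row, column, amount in increments:
        key = (row, column)
        totals[key] = totals.get(key, 0) + amount
    return [(row, column, amount) for (row, column), amount in totals.items()]


if __name__ == "__main__":
    batch = [
        ("page:/home", "hits", 1),
        ("page:/home", "hits", 1),
        ("page:/about", "hits", 1),
        ("page:/home", "hits", 1),
    ]
    # Four increments collapse into two, one per distinct cell.
    for row, column, amount in coalesce_increments(batch):
        print(row, column, amount)
```

The benefit grows with how often a batch repeats the same counter, since each distinct cell costs one remote call regardless of how many events touched it.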

Cloudera Search

CDH 5.x System Requirements:

Supported Operating Systems

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.


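| Operating System | Version | Packages |
| --- | --- | --- |
| Red Hat compatible | | |
| Red Hat Enterprise Linux (RHEL) | 5.7 | 64-bit |
| Red Hat Enterprise Linux (RHEL) | 6.2 | 64-bit |
| Red Hat Enterprise Linux (RHEL) | 6.4 | 64-bit |
| CentOS | 5.7 | 64-bit |
| CentOS | 6.2 | 64-bit |
| CentOS | 6.4 | 64-bit |
| Oracle Linux with Unbreakable Enterprise Kernel | 5.6 | 64-bit |
| Oracle Linux with Unbreakable Enterprise Kernel | 6.4 | 64-bit |
| SLES | | |
| SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 1 or later | 64-bit |
| Ubuntu/Debian | | |
| Ubuntu | Precise (12.04) - Long-Term Support (LTS) | 64-bit |
| Debian | Wheezy (7.0, 7.1) | 64-bit |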


  Note:
  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera's packages, you can also download source tarballs from Downloads.

Supported JDK Versions

CDH 5 is supported with Oracle JDK 1.7.

Table 1. Supported JDK 1.7 Versions
| Latest Certified Version | Minimum Supported Version | Exceptions |
| --- | --- | --- |
| 1.7.0_45 | 1.7.0_25 | None |
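
A host's JDK version can be compared against the values in Table 1 with a short script. The Python sketch below parses version strings of the form `1.7.0_45` and classifies them relative to the minimum supported and latest certified versions. The function names and the category labels are hypothetical, and how to treat versions newer than the latest certified one is left as a separate category because this page does not say.

```python
import re

_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)_(\d+)")


def parse_jdk_version(text):
    """Turn '1.7.0_45' into the comparable tuple (1, 7, 0, 45)."""
    match = _VERSION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version: {text!r}")
    return tuple(int(part) for part in match.groups())


# Values taken from Table 1.
MINIMUM_SUPPORTED = parse_jdk_version("1.7.0_25")
LATEST_CERTIFIED = parse_jdk_version("1.7.0_45")


def classify_jdk(text):
    """Classify a JDK version string against Table 1 (labels are illustrative)."""
    version = parse_jdk_version(text)
    if version[:2] != (1, 7):
        return "not JDK 1.7"
    if version < MINIMUM_SUPPORTED:
        return "below minimum"
    if version <= LATEST_CERTIFIED:
        return "within certified range"
    return "newer than latest certified"


if __name__ == "__main__":
    for candidate in ("1.6.0_45", "1.7.0_21", "1.7.0_25", "1.7.0_45", "1.7.0_51"):
        print(candidate, "->", classify_jdk(candidate))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `1.7.0_9` would sort after `1.7.0_25` as text.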

Supported Databases

| Component | MySQL | SQLite | PostgreSQL | Oracle | Derby - see Note 4 |
| --- | --- | --- | --- | --- | --- |
| Oozie | 5.5 | | 8.4 | 10.2, 11gR2 | Default |
| Flume | | | | | Default (for the JDBC Channel only) |
| Hue | 5.0+ See Note 1 | Default | 8.4 | 11gR2 | |
| Hive | 5.5 | | 8.4 | 10.2, 11gR2 | Default |
| Sqoop 1 | See Note 2 | – | See Note 2 | See Note 2 | |
| Sqoop 2 | See Note 3 | – | See Note 3 | See Note 3 | Default |

Notes

  1. Cloudera's recommendations are:
    • For Red Hat and similar systems:
      • Use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries on Red Hat 5 and similar systems.
      • Use MySQL server version 5.1 (or higher) and version 5.1 client shared libraries on Red Hat 6 and similar systems.

      If you use a higher server version than recommended here (for example, if you use 5.5) make sure you install the corresponding client libraries.

    • For SLES systems, use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries.
    • For Ubuntu systems:
      • Use MySQL server version 5.5 (or higher) and version 5.0 client shared libraries on Precise (12.04).
  2. For connectivity purposes only, Sqoop 1 supports MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, Teradata 13.1, and Netezza TwinFin 5.0. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  3. Sqoop 2 can transport data to and from MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, and Microsoft SQL Server 2012. The Sqoop 2 repository is supported only on Derby.
  4. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the CDH 5 Installation Guide for recommendations.