Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

Thank you for choosing CDH. Your download instructions are below.

CDH 5.0.0 Packaging and Tarballs

To view the overall release notes for CDH 5, see the CDH 5 Release Notes.

| Component | Package Version | Tarball | Release Notes | Changes File |
|---|---|---|---|---|
| Apache Avro | avro-1.7.5+cdh5.0.0+16 | Tarball | Release notes | Changes |
| Apache Crunch | crunch-0.9.0+cdh5.0.0+21 | Tarball | Release notes | Changes |
| DataFu | pig-udf-datafu-1.1.0+cdh5.0.0+7 | Tarball | Release notes | Changes |
| Apache Flume | flume-ng-1.4.0+cdh5.0.0+111 | Tarball | Release notes | Changes |
| Apache Hadoop | hadoop-2.3.0+cdh5.0.0+548 | Tarball | Release notes | Changes |
| Apache HBase | hbase-0.96.1.1+cdh5.0.0+60 | Tarball | Release notes | Changes |
| HBase-Solr | hbase-solr-1.3+cdh5.0.0+38 | Tarball | Release notes | Changes |
| Apache Hive | hive-0.12.0+cdh5.0.0+308 | Tarball | Release notes | Changes |
| Hue | hue-3.5.0+cdh5.0.0+365 | Tarball | Release notes | Changes |
| Cloudera Impala | impala-1.3.0+cdh5.0.0+0 | (none) | Release notes | Changes |
| Kite SDK | kite-0.10.0+cdh5.0.0+79 | Tarball | Release notes | Changes |
| Llama | llama-1.0.0+cdh5.0.0+0 | Tarball | Release notes | Changes |
| Apache Mahout | mahout-0.8+cdh5.0.0+27 | Tarball | Release notes | Changes |
| Apache Oozie | oozie-4.0.0+cdh5.0.0+174 | Tarball | Release notes | Changes |
| Parquet | parquet-1.2.5+cdh5.0.0+91 | Tarball | Release notes | Changes |
| Parquet-format | parquet-format-1.0.0+cdh5.0.0+3 | Tarball | Release notes | Changes |
| Apache Pig | pig-0.12.0+cdh5.0.0+27 | Tarball | Release notes | Changes |
| Cloudera Search | search-1.0.0+cdh5.0.0+0 | Tarball | Release notes | Changes |
| Apache Sentry (incubating) | sentry-1.2.0+cdh5.0.0+71 | Tarball | Release notes | Changes |
| Apache Solr | solr-4.4.0+cdh5.0.0+178 | Tarball | Release notes | Changes |
| Apache Spark | spark-0.9.0+cdh5.0.0+31 | Tarball | Release notes | Changes |
| Apache Sqoop | sqoop-1.4.4+cdh5.0.0+43 | Tarball | Release notes | Changes |
| Apache Sqoop2 | sqoop2-1.99.3+cdh5.0.0+26 | Tarball | Release notes | Changes |
| Apache Whirr | whirr-0.9.0+cdh5.0.0+4 | Tarball | Release notes | Changes |
| Apache ZooKeeper | zookeeper-3.4.5+cdh5.0.0+28 | Tarball | Release notes | Changes |
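
The package version strings listed above appear to follow a common pattern: a component name, the upstream version, the CDH release, and a trailing number. The following is a minimal Python sketch for splitting such a string into its parts. The pattern is inferred from the listed strings only, and the names `CdhPackageVersion`, `parse_package_version`, and `patch_level` are hypothetical; in particular, reading the trailing number as a patch level is an assumption, not something this page states.

```python
import re
from typing import NamedTuple


class CdhPackageVersion(NamedTuple):
    component: str         # e.g. "pig-udf-datafu"
    upstream_version: str  # e.g. "1.1.0"
    cdh_version: str       # e.g. "5.0.0"
    patch_level: int       # trailing number; its meaning is an assumption


# Pattern inferred from the strings on this page, not from an official spec.
_PACKAGE_RE = re.compile(
    r"^(?P<component>[a-z0-9-]+?)"   # component name, may contain hyphens
    r"-(?P<upstream>\d[\w.]*)"       # upstream version starts with a digit
    r"\+cdh(?P<cdh>\d+(?:\.\d+)*)"   # CDH release, e.g. +cdh5.0.0
    r"\+(?P<patch>\d+)$"             # trailing number
)


def parse_package_version(text: str) -> CdhPackageVersion:
    """Split a string such as 'avro-1.7.5+cdh5.0.0+16' into its parts."""
    match = _PACKAGE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized package version: {text!r}")
    return CdhPackageVersion(
        component=match.group("component"),
        upstream_version=match.group("upstream"),
        cdh_version=match.group("cdh"),
        patch_level=int(match.group("patch")),
    )


if __name__ == "__main__":
    print(parse_package_version("hbase-0.96.1.1+cdh5.0.0+60"))
```

The component group is matched non-greedily so that hyphenated names such as `flume-ng` or `parquet-format` stay intact while the first hyphen followed by a digit starts the upstream version.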

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

 

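| Operating System | Version | Packages |
|---|---|---|
| Red Hat compatible | | |
| Red Hat Enterprise Linux (RHEL) | 5.7 | 64-bit |
| Red Hat Enterprise Linux (RHEL) | 6.2 | 64-bit |
| Red Hat Enterprise Linux (RHEL) | 6.4 | 64-bit |
| CentOS | 5.7 | 64-bit |
| CentOS | 6.2 | 64-bit |
| CentOS | 6.4 | 64-bit |
| Oracle Linux with Unbreakable Enterprise Kernel | 5.6 | 64-bit |
| Oracle Linux with Unbreakable Enterprise Kernel | 6.4 | 64-bit |
| SLES | | |
| SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 1 or later | 64-bit |
| Ubuntu/Debian | | |
| Ubuntu | Precise (12.04) - Long-Term Support (LTS) | 64-bit |
| Debian | Wheezy (7.0, 7.1) | 64-bit |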

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera's packages, you can also download source tarballs from Downloads.

 


| Component | MySQL | SQLite | PostgreSQL | Oracle | Derby - see Note 4 |
|---|---|---|---|---|---|
| Oozie | 5.5 | – | 8.4 | 10.2, 11gR2 | Default |
| Flume | – | – | – | – | Default (for the JDBC Channel only) |
| Hue | 5.0+ See Note 1 | Default | 8.4 | 11gR2 | – |
| Hive | 5.5 | – | 8.4 | 10.2, 11gR2 | Default |
| Sqoop 1 | See Note 2 | – | See Note 2 | See Note 2 | – |
| Sqoop 2 | See Note 3 | – | See Note 3 | See Note 3 | Default |

Notes

  1. Cloudera's recommendations are:
    • For Red Hat and similar systems:
      • Use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries on Red Hat 5 and similar systems.
      • Use MySQL server version 5.1 (or higher) and version 5.1 client shared libraries on Red Hat 6 and similar systems.

      If you use a higher server version than recommended here (for example, if you use 5.5) make sure you install the corresponding client libraries.

    • For SLES systems, use MySQL server version 5.0 (or higher) and version 5.0 client shared libraries.
    • For Ubuntu systems:
      • Use MySQL server version 5.5 (or higher) and version 5.0 client shared libraries on Precise (12.04).
  2. For connectivity purposes only, Sqoop 1 supports MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, Teradata 13.1, and Netezza TwinFin 5.0. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  3. Sqoop 2 can transport data to and from MySQL 5.1, PostgreSQL 9.1.4, Oracle 10.2, and Microsoft SQL Server 2012. The Sqoop 2 repository is supported only on Derby.
  4. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the CDH 5 Installation Guide for recommendations.

CDH 5 is supported with Oracle JDK 1.7.

Table 1. Supported JDK 1.7 Versions

| Latest Certified Version | Minimum Supported Version | Exceptions |
|---|---|---|
| 1.7.0_45 | 1.7.0_25 | None |
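
As an illustration of how the JDK bounds in Table 1 could be checked in a pre-installation script, here is a small Python sketch. It assumes version strings of the form `1.7.0_NN`; the function names and the status labels are hypothetical. The page does not say how builds newer than the latest certified version are treated, so the sketch only flags them rather than calling them supported or unsupported.

```python
import re

# Bounds taken from Table 1 on this page.
MIN_SUPPORTED = (1, 7, 0, 25)
LATEST_CERTIFIED = (1, 7, 0, 45)


def parse_jdk_version(text):
    """Turn a string such as '1.7.0_45' into the tuple (1, 7, 0, 45)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version: {text!r}")
    return tuple(int(part) for part in match.groups())


def jdk_status(text):
    """Classify a JDK version against the bounds in Table 1.

    The returned labels are hypothetical names chosen for this sketch.
    """
    version = parse_jdk_version(text)
    if version[:2] != (1, 7):
        return "not-jdk-1.7"
    if version < MIN_SUPPORTED:
        return "below-minimum"
    if version > LATEST_CERTIFIED:
        # Not covered by the table; flagged rather than judged.
        return "newer-than-certified"
    return "within-certified-range"


if __name__ == "__main__":
    for candidate in ("1.7.0_21", "1.7.0_25", "1.7.0_45", "1.7.0_51"):
        print(candidate, jdk_status(candidate))
```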


What's New in CDH 5.0.0

The following topics describe new features introduced in CDH 5.0.0.


Apache Hadoop

HDFS

New Features:

  • HDFS-5776 - Hedged reads in HDFS for improved HBase MTTR.
  • HDFS-4685 - Implementation of extended file access control lists in HDFS.
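
To illustrate the general idea behind a hedged read (HDFS-5776 above), the following Python sketch starts a read against one replica and, if it has not completed within a threshold, starts a second read against another replica and returns whichever finishes first. This is a conceptual sketch only and not the HDFS client implementation; `hedged_read`, `slow_replica`, and `fast_replica` are hypothetical names, and error handling is left out.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def hedged_read(primary_read, backup_read, threshold_s, pool):
    """Run primary_read; if it is still pending after threshold_s seconds,
    also run backup_read and return the result of whichever finishes first."""
    primary = pool.submit(primary_read)
    done, _ = wait([primary], timeout=threshold_s)
    if done:
        # The primary answered in time, so no hedge is needed.
        return primary.result()
    # The primary is slow: issue the hedge and take the first to complete.
    backup = pool.submit(backup_read)
    done, _ = wait([primary, backup], return_when=FIRST_COMPLETED)
    return next(iter(done)).result()


def slow_replica():
    """Hypothetical replica that takes a long time to answer."""
    time.sleep(0.3)
    return "data from slow replica"


def fast_replica():
    """Hypothetical replica that answers immediately."""
    return "data from fast replica"


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(hedged_read(slow_replica, fast_replica, 0.05, pool))
```

The trade-off in this pattern is extra read load in exchange for lower tail latency, which is why the hedge is only issued after the threshold has passed.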

Notable Bug Fixes:

  • HDFS-5339 - WebHDFS URI does not accept logical nameservices when security is enabled.
  • HDFS-5898 - Allow NFS gateway to login/relogin from its Kerberos keytab.
  • HDFS-5921 - "Browse filesystem" on the Namenode UI doesn't work if any directory has the sticky bit set.
  • HDFS and Hive replication between different Kerberos realms now works.
  • HDFS-5922 - DataNode heartbeat thread can get stuck in a tight loop.

MapReduce & YARN

New Feature:

  • FairScheduler supports moving running applications between queues.

Notable Bug Fixes:

  • Several critical fixes to stabilize ResourceManager HA - Web UI, unmanaged ApplicationMasters and secure-cluster support.
  • Support for large values of mapreduce.task.io.sort.mb.
  • JobHistory Server has information on failed MapReduce jobs.

Apache HBase

New Features:

  • HBASE-10436 - Restore RegionServer lists removed from HBase 0.96.0 JMX.

    Many of the metrics exposed in CDH 4/0.94 were removed when metrics were refactored in CDH 5/0.96. This patch restores the lists of live and dead RegionServers. In 0.94 this was a large nested structure, as shown below, which included the RegionServer lists and metrics from each region.

    {
    "name" : "hadoop:service=Master,name=Master",
    "modelerType" : "org.apache.hadoop.hbase.master.MXBeanImpl",
    "ZookeeperQuorum" : "localhost:2181",
    ....
    "RegionsInTransition" : [ ],
    "RegionServers" : [ {
    "key" : "localhost,48346,1390857257246",
    "value" : {
    "load" : 2,
    ....

    CDH 5 Beta 1 and Beta 2 did not contain these lists; they displayed only the counts of live and dead RegionServers. As of CDH 5.0.0, each list is presented as a semicolon-separated field, as follows:

    {
    "name" : "Hadoop:service=HBase,name=Master,sub=Server",
    "modelerType" : "Master,sub=Server",
    "tag.Context" : "master",
    "tag.liveRegionServers" : "localhost,56196,1391992019130",
    "tag.deadRegionServers" : "localhost,40010,1391035309673;localhost,41408,1391990380724;localhost,38682,1390950017735",
    ...
    }

  • Assorted usability and compatibility improvements as well as improvements to exporting snapshots.
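
The semicolon-separated RegionServer fields restored by HBASE-10436 can be split back into individual servers with a few lines of code. The Python sketch below assumes each entry has the form `host,port,startcode`, as in the sample output above. The function name `parse_region_servers` is hypothetical, and the sample bean contains only the fields shown on this page.

```python
import json


def parse_region_servers(field):
    """Split a 'host,port,startcode;host,port,startcode' string into tuples."""
    servers = []
    for entry in field.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty field or a trailing semicolon
        host, port, start_code = entry.rsplit(",", 2)
        servers.append((host, int(port), int(start_code)))
    return servers


# Sample bean restricted to the fields shown in the CDH 5.0.0 output above.
SAMPLE_BEAN = json.loads("""
{
  "name" : "Hadoop:service=HBase,name=Master,sub=Server",
  "modelerType" : "Master,sub=Server",
  "tag.Context" : "master",
  "tag.liveRegionServers" : "localhost,56196,1391992019130",
  "tag.deadRegionServers" : "localhost,40010,1391035309673;localhost,41408,1391990380724;localhost,38682,1390950017735"
}
""")

if __name__ == "__main__":
    live = parse_region_servers(SAMPLE_BEAN["tag.liveRegionServers"])
    dead = parse_region_servers(SAMPLE_BEAN["tag.deadRegionServers"])
    print("live:", live)
    print("dead:", dead)
```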

Apache Flume

New Feature:

  • The HBase Sink now supports coalescing multiple Increment RPCs into one (FLUME-2338).
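
As a conceptual illustration of coalescing increments, the Python sketch below merges several increments that target the same row and column into a single increment carrying the summed amount, so that fewer calls would need to be sent. It is not the Flume HBase Sink implementation; `coalesce_increments` and the `(row, column, amount)` tuple layout are hypothetical.

```python
from collections import Counter


def coalesce_increments(increments):
    """Merge increments that share a (row, column) into one summed increment.

    `increments` is an iterable of (row, column, amount) tuples; the result
    keeps the order in which each (row, column) pair was first seen.
    """
    totals = Counter()
    for row, column, amount in increments:
        totals[(row, column)] += amount
    return [(row, column, amount) for (row, column), amount in totals.items()]


if __name__ == "__main__":
    batch = [
        ("page:/home", "stats:hits", 1),
        ("page:/home", "stats:hits", 1),
        ("page:/about", "stats:hits", 1),
        ("page:/home", "stats:hits", 1),
    ]
    # Four increments collapse into two, one per distinct (row, column) pair.
    print(coalesce_increments(batch))
```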

Changed Behavior:

  • File Channel Write timeout has been removed and the configuration parameter is now ignored (FLUME-2307).
  • Syslog UDP source can now accept larger messages (FLUME-2130).
  • AsyncHBase Sink is now fully functional (FLUME-2334).
  • Use standard lookup to find queue/topic in JMS Source (FLUME-2311).

Notable Bug Fixes:

  • Deadlock fixed in Dataset sink (FLUME-2320).
  • FileChannel Dual Checkpoint Backup Thread is now released on application stop (FLUME-2328).
  • Spool Dir source now checks interrupt flag before writing to channel (FLUME-2283).
  • Morphline sink increments eventDrainAttemptCount when it takes event from channel (FLUME-2323).
  • BucketWriter is now permanently closed only on idle and roll timeouts (FLUME-2325).
  • BucketWriter#close now cancels idleFuture (FLUME-2305).

Cloudera Search

