Download CDH 5.3.3

Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components become standards, you can build long-term architecture on them with confidence.



With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.


CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                                   | Version                              | Packages
Red Hat Enterprise Linux (RHEL)-compatible         |                                      |
  Red Hat Enterprise Linux                         | 5.7                                  | 64-bit
                                                   | 6.2                                  | 64-bit
                                                   | 6.4                                  | 64-bit
                                                   | 6.4 in SELinux mode                  | 64-bit
                                                   | 6.5                                  | 64-bit
  CentOS                                           | 5.7                                  | 64-bit
                                                   | 6.2                                  | 64-bit
                                                   | 6.4                                  | 64-bit
                                                   | 6.4 in SELinux mode                  | 64-bit
                                                   | 6.5                                  | 64-bit
  Oracle Linux with default kernel and             | 5.6 (UEK R2)                         | 64-bit
  Unbreakable Enterprise Kernel                    | 6.4 (UEK R2)                         | 64-bit
                                                   | 6.5 (UEK R2, UEK R3)                 | 64-bit
SUSE Linux Enterprise Server (SLES)                | 11 with Service Pack 2 or later      | 64-bit
Ubuntu                                             | Precise (12.04) - Long-Term Support  | 64-bit
                                                   | Trusty (14.04) - Long-Term Support   | 64-bit
Debian                                             | Wheezy (7.0, 7.1)                    | 64-bit


  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.


Component   | MySQL                 | SQLite  | PostgreSQL                      | Oracle     | Derby (see Note 5)
Oozie       | 5.5, 5.6              | -       | 8.4, 9.1, 9.2, 9.3 (see Note 2) | 11gR2      | Default
Flume       | -                     | -       | -                               | -          | Default (for the JDBC Channel only)
Hue         | 5.5, 5.6 (see Note 1) | Default | 8.4, 9.1, 9.2, 9.3 (see Note 2) | 11gR2      | -
Hive/Impala | 5.5, 5.6 (see Note 1) | -       | 8.4, 9.1, 9.2, 9.3 (see Note 2) | 11gR2      | Default
Sentry      | 5.5, 5.6 (see Note 1) | -       | 8.4, 9.1, 9.2, 9.3 (see Note 2) | 11gR2      | -
Sqoop 1     | See Note 3            | -       | See Note 3                      | See Note 3 | -
Sqoop 2     | See Note 4            | -       | See Note 4                      | See Note 4 | Default


  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later.
  2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
  3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby.
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
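To illustrate Note 3, a Sqoop 1 import from MySQL into HDFS might look like the sketch below. The host, database, table, user, and target directory are placeholders, not values from this document; the command is built as a string and echoed so the sketch can run without a cluster.

```shell
# Hypothetical Sqoop 1 import of a MySQL table into HDFS.
# All connection details below are placeholders.
SQOOP_CMD="sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /user/etl/orders \
  --num-mappers 4"

# Print the command instead of running it, since no cluster is assumed here.
echo "$SQOOP_CMD"
```

On a real cluster you would run the command directly (and the MySQL JDBC driver JAR must be on Sqoop's classpath).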

CDH 5 is supported with the JDK versions shown in the table that follows.

Table 1. Supported JDK Versions

Latest Certified Version | Minimum Supported Version | Exceptions
1.7.0_67                 | 1.7.0_67                  | None
1.8.0_11                 | 1.8.0_11                  | None


CDH requires IPv4. IPv6 is not supported.
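Since CDH requires IPv4, it can be worth confirming that each cluster hostname resolves to an IPv4 address before installing. A minimal sketch, using `localhost` as a stand-in for a real cluster hostname:

```shell
# Check that a host resolves over IPv4 (CDH does not support IPv6).
# "localhost" is a placeholder; substitute each cluster node's FQDN.
host=localhost
if getent ahostsv4 "$host" >/dev/null; then
  echo "$host resolves over IPv4"
else
  echo "WARNING: $host has no IPv4 address" >&2
fi
```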

See also Configuring Network Names.


Known Issues Fixed in CDH 5.3.3


Upstream Issues Fixed


The following upstream issues are fixed in CDH 5.3.3:

  • HADOOP-11722 - Some Instances of Services using ZKDelegationTokenSecretManager go down when old token cannot be deleted
  • HADOOP-11469 - KMS should skip default.key.acl and whitelist.key.acl when loading key acl
  • HADOOP-11710 - Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
  • HADOOP-11674 - oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
  • HADOOP-11445 - Bzip2Codec: Data block is skipped when position of newly created stream is equal to start of split
  • HADOOP-11620 - Add support for load balancing across a group of KMS for HA
  • HDFS-6830 - BlockInfo.addStorage fails when DN changes the storage for a block replica
  • HDFS-7961 - Trigger full block report after hot swapping disk
  • HDFS-7960 - The full block report should prune zombie storages even if they're not empty
  • HDFS-7575 - Upgrade should generate a unique storage ID for each volume
  • HDFS-7596 - NameNode should prune dead storages from storageMap
  • HDFS-7579 - Improve log reporting during block report rpc failure
  • HDFS-7208 - NN doesn't schedule replication when a DN storage fails
  • HDFS-6899 - Allow changing MiniDFSCluster volumes per DN and capacity per volume
  • HDFS-6878 - Change MiniDFSCluster to support StorageType configuration for individual directories
  • HDFS-6678 - MiniDFSCluster may still be partially running after initialization fails.
  • YARN-3351 - AppMaster tracking URL is broken in HA
  • YARN-3242 - Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client
  • YARN-2865 - Application recovery continuously fails with "Application with id already present. Cannot duplicate"
  • MAPREDUCE-6275 - Race condition in FileOutputCommitter v2 for user-specified task output subdirs
  • MAPREDUCE-4815 - Speed up FileOutputCommitter#commitJob for many output files
  • HBASE-13131 - ReplicationAdmin leaks connections if there's an error in the constructor
  • HIVE-10086 - Hive throws error when accessing Parquet file schema using field name match
  • HIVE-10098 - HS2 local task for map join fails in KMS encrypted cluster
  • HIVE-7426 - ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
  • HIVE-7737 - Hive logs full exception for table not found
  • HIVE-9749 - ObjectStore schema verification logic is incorrect
  • HIVE-9788 - Make double quote optional in tsv/csv/dsv output
  • HIVE-9755 - Hive built-in "ngram" UDAF fails when a mapper has no matches.
  • HIVE-9770 - Beeline ignores --showHeader for non-tabular output formats, i.e. csv, tsv, dsv
  • HIVE-8688 - serialized plan OutputStream is not being closed
  • HIVE-9716 - Map job fails when table's LOCATION does not have scheme
  • HIVE-5857 - Reduce tasks do not work in uber mode in YARN
  • HIVE-8938 - Compiler should save the transform URI as input entity
  • HUE-2569 - [home] Delete project is broken
  • HUE-2529 - Increase the character limit of 'Name' Textfield in Useradmin Ldap Sync Groups
  • HUE-2506 - [search] Marker map does not display with HTML widget
  • HUE-1663 - [core] Option to either follow or not LDAP referrals for auth
  • HUE-2198 - [core] Reduce noise such as "handle_other(): Mutual authentication unavailable on 200 response"
  • SENTRY-683 - HDFS service client should ensure the kerberos ticket validity before new service connection
  • SENTRY-654 - Calls to append_partition fail when Sentry is enabled
  • SENTRY-664 - After Namenode is restarted, Path updates remain unsynched
  • SENTRY-665 - PathsUpdate.parsePath needs to handle special characters
  • SENTRY-652 - Sentry fails to parse spaces when HDFS ACL sync enabled
  • SOLR-7092 - Stop the HDFS lease recovery retries on HdfsTransactionLog on close and try to avoid lease recovery on closed files.
  • SOLR-7141 - RecoveryStrategy: Raise time that we wait for any updates from the leader before they saw the recovery state to have finished.
  • SOLR-7113 - Multiple calls to UpdateLog#init is not thread safe with respect to the HDFS FileSystem client object usage.
  • SOLR-7134 - Replication can still cause index corruption.
  • SQOOP-1764 - Numeric Overflow when getting extent map
  • IMPALA-1658 - Add compatibility flag for Hive-Parquet-Timestamps
  • IMPALA-1820 - Start with small pages for hash tables during repartitioning
  • IMPALA-1897 - Fixes for old hash join and agg
  • IMPALA-1894 - Fix old aggregation node hash table cleanup
  • IMPALA-1863 - Avoid deadlock across fragment instances
  • IMPALA-1915 - Fix query hang in BufferedBlockMgr:FindBlock()
  • IMPALA-1890 - Fixing a race between ~BufferedBlockMgr() and the WriteComplete() call
  • IMPALA-1738 - Use snprintf() instead of lexical_cast() in float-to-string casts
  • IMPALA-1865 - Fix partition spilling cleanup when new stream OOMs
  • IMPALA-1835 - Keep the fragment alive for TransmitData()
  • IMPALA-1805 - Impala's ACL check does not consider all group ACLs, only the first one
  • IMPALA-1794 - Fix infinite loop opening or closing file with invalid metadata
  • IMPALA-1801 - external-data-source-executor leaking global jni refs
  • IMPALA-1712 - Unexpected remote bytes read counter was not being reset properly
  • IMPALA-1636 - Generalize index-based partition pruning to allow constant expressions


Published Known Issues Fixed


As a result of the above fixes, the following issues, previously published as Known Issues in CDH 5, are also fixed.


After upgrade from a release earlier than CDH 5.2.0, storage IDs may no longer be unique


As of CDH 5.2, each storage volume on a DataNode should have its own unique storageID, but in clusters upgraded from CDH 4, or CDH 5 releases earlier than CDH 5.2.0, each volume on a given DataNode shares the same storageID, because the HDFS upgrade does not properly update the IDs to reflect the new naming scheme. This causes problems with load balancing. The problem affects only clusters upgraded from CDH 5.1.x and earlier to CDH 5.2 or later. Clusters that are new as of CDH 5.2.0 or later do not have the problem.

Bug: HDFS-7575

Severity: Medium

Workaround: Upgrade to a later or patched version of CDH.
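An affected DataNode can be spotted by comparing the `storageID` recorded in each volume's `current/VERSION` file; before the HDFS-7575 fix, upgraded volumes share one ID. The sketch below demonstrates the check against fabricated VERSION files in a temporary directory, since no live DataNode is assumed; on a real node you would point the glob at the directories listed in `dfs.datanode.data.dir`.

```shell
# Sketch: detect duplicate storage IDs across DataNode volumes.
# Demo data dirs stand in for dfs.datanode.data.dir volumes.
TMP=$(mktemp -d)
mkdir -p "$TMP/data1/current" "$TMP/data2/current"
echo "storageID=DS-1234" > "$TMP/data1/current/VERSION"
echo "storageID=DS-1234" > "$TMP/data2/current/VERSION"  # duplicate: pre-fix symptom

# Collect storageID lines from every volume and report any that repeat.
dups=$(grep -h '^storageID=' "$TMP"/data*/current/VERSION | sort | uniq -d)
if [ -n "$dups" ]; then
  echo "Duplicate storage IDs found: $dups"
else
  echo "All storage IDs unique"
fi
rm -rf "$TMP"
```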

