Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components have become standards, you can build long-term architecture on them with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.

 

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

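Operating System | Version | Packages

Red Hat Enterprise Linux (RHEL)-compatible
Red Hat Enterprise Linux | 5.7 | 64-bit
Red Hat Enterprise Linux | 5.10 | 64-bit
Red Hat Enterprise Linux | 6.4 | 64-bit
Red Hat Enterprise Linux | 6.5 | 64-bit
Red Hat Enterprise Linux | 6.5 in SELinux mode | 64-bit
Red Hat Enterprise Linux | 6.6 | 64-bit
CentOS | 5.7 | 64-bit
CentOS | 5.10 | 64-bit
CentOS | 6.4 | 64-bit
CentOS | 6.5 | 64-bit
CentOS | 6.5 in SELinux mode | 64-bit
CentOS | 6.6 | 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 5.6 (UEK R2) | 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 6.4 (UEK R2) | 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 6.5 (UEK R2, UEK R3) | 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 6.6 (UEK R3) | 64-bit

SLES
SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 2 | 64-bit
SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 3 | 64-bit

Ubuntu/Debian
Ubuntu | Precise (12.04) - Long-Term Support (LTS) | 64-bit
Ubuntu | Trusty (14.04) - Long-Term Support (LTS) | 64-bit
Debian | Wheezy (7.0) | 64-bit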

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
Component | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 5)
Oozie | 5.5, 5.6 | - | 8.4, 9.2, 9.3 (see Note 2) | 11gR2 | Default
Flume | - | - | - | - | Default (for the JDBC Channel only)
Hue | 5.5, 5.6 (see Note 1) | Default | 8.4, 9.2, 9.3 (see Note 2) | 11gR2 | -
Hive/Impala | 5.5, 5.6 (see Note 1) | - | 8.4, 9.2, 9.3 (see Note 2) | 11gR2 | Default
Sentry | 5.5, 5.6 (see Note 1) | - | 8.4, 9.2, 9.3 (see Note 2) | 11gR2 | -
Sqoop 1 | See Note 3 | - | See Note 3 | See Note 3 | -
Sqoop 2 | See Note 4 | - | See Note 4 | See Note 4 | Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.
  2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
  3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
CDH 5.4.x is supported with the JDK versions shown in the following table:

Minimum Supported Version | Recommended Version | Notes
1.7.0_55 | 1.7.0_67 or 1.7.0_75 | None
1.8.0_60 | 1.8.0_60 | None
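
The minimum versions in the table above lend themselves to an automated pre-install check. The following Python sketch is illustrative only and is not part of any Cloudera tooling: the names MINIMUM_SUPPORTED, parse_jdk_version, and is_supported_jdk are hypothetical, and it assumes the legacy "1.<major>.0_<update>" version string format used in the table. It only compares update numbers against the listed minimums; it does not know whether a particular intermediate update has known issues.

```python
import re

# Minimum supported update number per JDK major line, copied from the table above.
# Assumption: any major line not listed here is treated as unsupported.
MINIMUM_SUPPORTED = {7: 55, 8: 60}

# Matches legacy JDK version strings such as "1.7.0_67".
_VERSION_RE = re.compile(r"^1\.(\d+)\.0_(\d+)$")


def parse_jdk_version(version):
    """Split a string such as '1.7.0_67' into (major, update) integers."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported_jdk(version):
    """Return True if the version meets the table's minimum for its major line."""
    major, update = parse_jdk_version(version)
    minimum = MINIMUM_SUPPORTED.get(major)
    return minimum is not None and update >= minimum
```

For example, is_supported_jdk("1.7.0_67") returns True, while is_supported_jdk("1.7.0_45") returns False because update 45 is below the listed minimum of 55.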

CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.


What's New in CDH 5.4.1

Cloudera Search

  • Beginning with CDH 5.4.1, Search for CDH supports configurable replication levels for transaction logs stored in HDFS.

    Configure the replication factor by modifying the tlogDfsReplication setting in solrconfig.xml. tlogDfsReplication is a new setting in the updateLog section. The following excerpt from solrconfig.xml shows where the transaction log replication factor is set:

    <updateHandler class="solr.DirectUpdateHandler2">

      <!-- Enables a transaction log, used for real-time get, durability,
           and solr cloud replica recovery. The log can grow as big as
           uncommitted changes to the index, so use of a hard autoCommit
           is recommended (see below).
           "dir" - the target directory for transaction logs, defaults to the
           solr data directory. -->
      <updateLog>
        <str name="dir">${solr.ulog.dir:}</str>
        <int name="tlogDfsReplication">3</int>
      </updateLog>

You might want to increase the replication level from the default level of 1 to some higher value such as 3. Increasing the transaction log replication level can:

  • Reduce the chance of data loss, especially when the system is otherwise configured to have single replicas of shards. For example, having single replicas of shards is reasonable when autoAddReplicas is enabled, but without additional transaction log replicas, the risk of data loss during a node failure would increase.
  • Facilitate rolling upgrade of HDFS while Search is running. If you have multiple copies of the log, when a node with the transaction log becomes unavailable during the rolling upgrade process, another copy of the log can continue to collect transactions.
  • Facilitate HDFS write lease recovery.

Initial testing shows no significant performance regression for common use cases.
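
If you manage several collections, the tlogDfsReplication change described above can be scripted rather than edited by hand. The following Python sketch is one possible approach under stated assumptions, not a Cloudera-provided tool: the function name set_tlog_replication is hypothetical, and it assumes the configuration contains an updateLog element somewhere below the root, as in the excerpt. Note that the standard library parser used here discards XML comments, so review the result before replacing a hand-maintained solrconfig.xml.

```python
import xml.etree.ElementTree as ET


def set_tlog_replication(solrconfig_xml, replication):
    """Return solrconfig XML text with tlogDfsReplication set to `replication`.

    Hypothetical helper: updates the existing <int name="tlogDfsReplication">
    element inside <updateLog>, or adds one if it is missing.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")

    root = ET.fromstring(solrconfig_xml)

    # Locate the updateLog section anywhere below the document root.
    update_log = root.find(".//updateLog")
    if update_log is None:
        raise ValueError("no <updateLog> element found in the configuration")

    setting = update_log.find("int[@name='tlogDfsReplication']")
    if setting is None:
        # The setting is absent, so create it alongside the other updateLog entries.
        setting = ET.SubElement(update_log, "int", {"name": "tlogDfsReplication"})
    setting.text = str(replication)

    return ET.tostring(root, encoding="unicode")
```

A typical use would be to read the file contents, call set_tlog_replication(text, 3), and write the returned text to a new file so that the change can be compared with the original before it is deployed.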

