Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support, you do not need to upgrade if you are already using the latest 5.5.x release.

 

Important: In order to be covered by Cloudera Support:

  • All CDH hosts in a logical cluster must run on the same major OS release.
  • Cloudera Manager must run on the same OS release as one of the CDH clusters it manages.

Cloudera recommends running the same minor OS release on all hosts in a cluster. However, the risk caused by running different minor OS releases is considered lower than the risk of running different major OS releases.
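The OS rules above can be checked mechanically before contacting Support. The following is a minimal sketch, not a Cloudera tool: the function name `check_os_consistency` and the shape of the `hosts` input are assumptions made for illustration. It treats a mismatch in major release as an error and a mismatch in minor release as a warning, mirroring the risk ranking described above. It also assumes that hosts on different distributions count as different major releases, which the text does not state explicitly.

```python
import re
from collections import defaultdict


def check_os_consistency(hosts):
    """Check the hosts of one logical cluster against the OS release rules.

    hosts maps a hostname to a (distribution, version) pair,
    for example {"node1": ("RHEL", "7.2")} (hypothetical input shape).
    Returns (errors, warnings) as two lists of strings.
    """
    by_major = defaultdict(list)
    by_release = defaultdict(list)
    for name, (distro, version) in sorted(hosts.items()):
        version = version.strip()
        # "7.2" -> "7", "12 SP2" -> "12", "16.04" -> "16"
        major = re.split(r"[.\s]", version)[0]
        by_major[(distro, major)].append(name)
        by_release[(distro, version)].append(name)

    errors, warnings = [], []
    if len(by_major) > 1:
        found = ", ".join(f"{d} {m}" for d, m in sorted(by_major))
        errors.append(f"different major OS releases in one cluster: {found}")
    elif len(by_release) > 1:
        found = ", ".join(f"{d} {v}" for d, v in sorted(by_release))
        warnings.append(f"different minor OS releases in one cluster: {found}")
    return errors, warnings
```

For example, a cluster mixing RHEL 7.2 and RHEL 6.8 hosts yields an error, while a cluster mixing RHEL 7.2 and RHEL 7.3 yields only a warning.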

 

Gateway hosts may use RHEL/CentOS 7.2, subject to some restrictions. See Operating System Support for Gateway Hosts (CDH 5.11 and higher only).

 

Other disclaimers:

  • RHEL / CentOS / OEL 7.0 is not supported.
  • Red Hat only supports specific upgrades from RHEL 6 to 7. Contact your OS vendor and review What are the supported use cases for upgrading to RHEL 7?
  • SLES hosts running Cloudera Manager agents must use SLES SDK 11 SP1.
  • Cloudera does not support CDH cluster deployments in Docker containers.
  • Cloudera Enterprise (without Cloudera Navigator) is supported on platforms with Security-Enhanced Linux (SELinux) enabled.

Important: Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS support provider.

 

Supported operating systems and versions:

Red Hat Enterprise Linux-compatible

  • RHEL / CentOS (Max SELinux support: 7.2)
    • 7.3, 7.2, 7.1
    • 6.8, 6.7, 6.6, 6.5, 6.4
    • 5.11, 5.10, 5.7
  • Oracle Enterprise Linux (OEL)
    • 7.3, 7.2, 7.1 (UEK default)
    • 6.8, 6.7, 6.6 (UEK R3)
    • 6.5 (UEK R2, UEK R3)
    • 6.4 (UEK R2)
    • 5.11, 5.10, 5.7 (UEK R2)

SUSE Linux Enterprise Server

  • SLES
    • 12 SP2, 12 SP1
    • 11 SP4, 11 SP3, 11 SP2

Ubuntu/Debian

  • Ubuntu
    • 16.04 LTS (Xenial)
    • 14.04 LTS (Trusty)
    • 12.04 LTS (Precise)
  • Debian
    • 8.2, 8.4 (Jessie)
    • 7.0, 7.1, 7.8 (Wheezy)

 

Operating System Support for Gateway Hosts (CDH 5.11 and higher only)

A Gateway host that is dedicated to running Cloudera Data Science Workbench can use RHEL/CentOS 7.2 even if the remaining hosts in your cluster are running any of the other supported operating systems. All hosts must run the same version of the Oracle JDK.


Please see Cloudera Manager Supported Databases for a full list of supported databases for each version of Cloudera Manager.

 

Cloudera Manager and CDH come packaged with an embedded PostgreSQL database, but Cloudera recommends configuring your cluster with custom external databases, especially in production.

 

In most cases (but not all), Cloudera supports versions of MariaDB, MySQL and PostgreSQL that are native to each supported Linux distribution.

 

After installing a database, upgrade to the latest patch and apply appropriate updates. Available updates may be specific to the operating system on which it is installed.

 

Notes:

  • Use UTF8 encoding for all custom databases.
  • Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  • Hue requires the default MySQL/MariaDB version (if used) of the operating system on which it is installed. See Hue Databases.
  • Both the Community and Enterprise versions of MySQL are supported, as well as MySQL configured by the AWS RDS service.
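Two of the notes above (UTF8 encoding and no GTID-based replication) can be verified from the server variables of a MySQL or MariaDB instance. The sketch below assumes you have already collected those variables into a dictionary, for example from the output of `SHOW VARIABLES`. The function name `database_setting_problems` is hypothetical. `gtid_mode` and `character_set_database` are standard MySQL variable names, but whether variants such as `utf8mb4` are acceptable is not stated in the text, so the check only requires the character set name to start with `utf8`.

```python
def database_setting_problems(variables):
    """Return a list of problems found in a dict of MySQL server variables.

    variables is assumed to map variable names to values, e.g.
    {"gtid_mode": "OFF", "character_set_database": "utf8"}.
    """
    problems = []

    # Cloudera Manager installation fails if GTID-based replication is enabled.
    gtid_mode = str(variables.get("gtid_mode", "OFF")).upper()
    if gtid_mode == "ON":
        problems.append("gtid_mode is ON: GTID-based replication is enabled")

    # Custom databases should use UTF8 encoding (assumption: a "utf8" prefix).
    charset = str(variables.get("character_set_database", "")).lower()
    if not charset.startswith("utf8"):
        problems.append(
            f"database character set is {charset or 'unknown'}, expected UTF8"
        )
    return problems
```

An empty result means neither of the two conditions was detected. It does not cover the other notes, such as the Hue requirement on the default database version of the operating system.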

Important: When you restart processes, the configuration for each of the services is redeployed using information saved in the Cloudera Manager database. If this information is not available, your cluster does not start or function correctly. You must schedule and maintain regular backups of the Cloudera Manager database to recover the cluster in the event of the loss of this database.


CDH and Cloudera Manager Supported JDK Versions

Only 64-bit JDKs from Oracle are supported. Oracle JDK 7 is supported across all versions of Cloudera Manager 5 and CDH 5. Oracle JDK 8 is supported in C5.3.x and higher.

 

A supported minor JDK release will remain supported throughout a Cloudera major release lifecycle, from the time of its addition forward, unless specifically excluded.

 

Warning: JDK 1.8u40 and JDK 1.8u60 are excluded from support. Also, the Oozie Web Console returns a 500 error when the Oozie server runs on JDK 8u75 or higher.

 

Running CDH nodes within the same cluster on different JDK releases is not supported. The JDK release must match across the cluster, down to the patch level.

  • All nodes in your cluster must run the same Oracle JDK version.
  • All services must be deployed on the same Oracle JDK version.
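The JDK rules above (one version across all nodes, with 1.8u40 and 1.8u60 excluded) lend themselves to a simple pre-flight check. This is a sketch under stated assumptions: the function names are hypothetical, and version strings are assumed to use the `1.<major>.0_<update>` form, as in `1.7.0_67`. It does not verify that a JDK is 64-bit or from Oracle.

```python
import re

# Updates excluded from support according to the warning above: (major, update).
EXCLUDED_UPDATES = {(8, 40), (8, 60)}


def parse_jdk_version(version):
    """Parse "1.8.0_60" into (8, 60); raise ValueError on other formats."""
    match = re.fullmatch(r"1\.(\d+)\.0_(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def jdk_problems(node_versions):
    """node_versions maps a node name to its JDK version string."""
    problems = []
    distinct = sorted(set(v.strip() for v in node_versions.values()))
    if len(distinct) > 1:
        problems.append("nodes run different JDK versions: " + ", ".join(distinct))
    for version in distinct:
        if parse_jdk_version(version) in EXCLUDED_UPDATES:
            problems.append(f"JDK {version} is excluded from support")
    return problems
```

A cluster where every node reports `1.7.0_67` produces no problems, while any node on `1.8.0_40` or `1.8.0_60`, or any mix of versions, is reported.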

 

The Cloudera Manager repository is packaged with Oracle JDK 1.7.0_67 (for example) and can be automatically installed during a new installation or an upgrade.

 

For a full list of supported JDK Versions please see CDH and Cloudera Manager Supported JDK Versions.


Hue

Hue works with the two most recent LTS (long term support) or ESR (extended support release) browsers. Cookies and JavaScript must be on.

Hue can display in older or other browsers, but you might not have access to all of its features.

Important: To see all icons in the Hue Web UI, users with IE and HTTPS must add a Load Balancer.

CDH requires IPv4. IPv6 is not supported.
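As a small illustration of the IPv4 requirement, the sketch below flags any host address that is not IPv4, using the `ipaddress` module from the Python standard library. The function name and the input list are assumptions for the example. It checks address literals only and does not inspect how a host's network stack is configured.

```python
import ipaddress


def non_ipv4_addresses(addresses):
    """Return the addresses from the input that are not IPv4 literals.

    Raises ValueError if an entry is not a valid IP address at all.
    """
    return [a for a in addresses if ipaddress.ip_address(a).version != 4]
```

For instance, `non_ipv4_addresses(["10.0.0.5", "fe80::1"])` returns the single IPv6 entry, which would need an IPv4 address before it could be used for CDH.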

 

See also Configuring Network Names.

Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.

 

Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.

 


The following components support the indicated versions of Transport Layer Security (TLS):

 

Components Supported by TLS

| Component | Role | Name | Port | Version |
|---|---|---|---|---|
| Cloudera Manager | Cloudera Manager Server | | 7182 | TLS 1.2 |
| Cloudera Manager | Cloudera Manager Server | | 7183 | TLS 1.2 |
| Flume | | | 9099 | TLS 1.2 |
| Flume | | Avro Source/Sink | | TLS 1.2 |
| Flume | | Flume HTTP Source/Sink | | TLS 1.2 |
| HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2 |
| HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2 |
| HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2 |
| HDFS | HttpFS | REST Port | 14000 | TLS 1.1, TLS 1.2 |
| Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2 |
| Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2 |
| Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2 |
| Spark | History Server | | 18080 | TLS 1.2 |
| YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2 |
| YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2 |
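To confirm that one of the ports listed above negotiates TLS 1.2, you can connect with a client that is pinned to that protocol version. The following is a minimal sketch using the Python standard library `ssl` module (Python 3.7 or later). The function names are hypothetical, and the host and port you pass in are your own. The default context verifies the server certificate against the system trust store, so a cluster that uses an internal CA would first need that CA loaded with `load_verify_locations`.

```python
import socket
import ssl


def tls12_only_context():
    """Build a client context that accepts TLS 1.2 and nothing else."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name.

    Raises ssl.SSLError if the server cannot negotiate TLS 1.2,
    or OSError if the connection itself fails.
    """
    context = tls12_only_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

A call such as `negotiated_tls_version("cm-host.example.com", 7183)` (hypothetical hostname) either returns the negotiated protocol name or raises an exception when the handshake fails.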

 


What's New In CDH 5.11.0

The following sections describe new features introduced in 5.11.0.

 

Apache Hadoop

  • Supported Apache Tomcat TLS ciphers for HttpFS are configurable using the HTTPFS_SSL_CIPHERS environment variable.
  • Supported Apache Tomcat TLS ciphers for the KMS are configurable using the KMS_SSL_CIPHERS environment variable.
  • Amazon S3 Consistency with Metadata Caching (S3Guard)

    Data written to Amazon S3 buckets is subject to the "eventual consistency" guarantee provided by Amazon Web Services (AWS), which means that data written to S3 may not be immediately available for queries and listing operations. This can cause failures in multi-step ETL workflows, where data from a previous step is not available to the next step. To mitigate these consistency issues you can now configure metadata caching for data stored in Amazon S3 using S3Guard. S3Guard requires that you provision a DynamoDB database from Amazon Web Services and configure S3Guard using the Cloudera Manager Admin Console or command-line tools. See Configuring and Managing S3Guard.

  • Amazon S3 Server-side Encryption with SSE-KMS

    Clusters that use Amazon S3 storage can now use Amazon Server-Side Encryption with AWS KMS–Managed Keys (SSE-KMS) to encrypt data, so you now have two choices for data-at-rest encryption on Amazon S3 (SSE-S3, SSE-KMS). Use Cloudera Manager Admin Console to configure the cluster to use this new feature as detailed in How to Configure Encryption for Amazon S3.
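For the S3Guard item above, a command-line configuration ultimately comes down to a few Hadoop properties. The sketch below renders such properties as a Hadoop-style XML fragment. The property names `fs.s3a.metadatastore.impl` and `fs.s3a.s3guard.ddb.table` and the `DynamoDBMetadataStore` class name are taken from upstream Apache Hadoop S3Guard and are an assumption here. Confirm them against Configuring and Managing S3Guard for your CDH release. The function names and the table name are hypothetical.

```python
import xml.etree.ElementTree as ET


def s3guard_properties(table_name):
    """Return S3Guard settings as a dict (upstream Hadoop names, assumed)."""
    return {
        "fs.s3a.metadatastore.impl":
            "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
        "fs.s3a.s3guard.ddb.table": table_name,
    }


def to_hadoop_xml(properties):
    """Render a dict as a <configuration> element with <property> children."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `to_hadoop_xml(s3guard_properties("my-s3guard-table"))` produces a fragment that could be merged into a client configuration. On a managed cluster, the Cloudera Manager Admin Console remains the documented way to apply these settings.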

 

Apache Hive

 

Hue

  • Integrate Navigator with Hue: Phase 1, Metadata Discovery

    • Search and tag partitions, databases, views, tables, columns.
    • Off by default. Check both "Enable" fields in Hue > Configuration > Cloudera Navigator.
    • See How to Enable and Use Navigator in Hue.
  • Embed new create table wizard within Editor and Assist

    • Safely import multiple formats such as Kudu, Parquet, JSON, and CSV.
    • More easily create table partitions.
  • Continued SQL improvements
    • Visually more pleasant colors and text.
    • No more hanging spinner in the Editor.
  • HUE-5742: Allow non-public PostgreSQL schemas.

  • HUE-5608: Add ability to DESC table without TABLE level privilege

 

Apache Impala (incubating)

See What's New in Apache Impala (incubating).

 

Apache Oozie

  • Supported TLS ciphers for Apache Tomcat are configurable using the OOZIE_HTTPS_CIPHERS environment variable.

 

Apache Spark

Blacklisting. This feature reduces the chance of application failure by not scheduling work on hosts that are experiencing intermittent disk failures. See this blog post for background information.

You can enable Kerberos authentication and TLS/SSL encryption for the Spark History Server through Cloudera Manager configuration settings, rather than including the password in clear text in an Advanced Configuration Snippet field. See these settings in the Cloudera Manager user interface:

  • history_server_spnego_enabled - for Kerberos authentication
  • history_server_admin_users
  • spark.ssl.historyServer.enabled
  • spark.ssl.historyServer.protocol
  • spark.ssl.historyServer.port
  • spark.ssl.historyServer.enabledAlgorithms
  • spark.ssl.historyServer.keyStore
  • spark.ssl.historyServer.keyStorePassword

With authentication enabled, only Kerberos-authorized users can read data from the Spark History Server, and non-admin users can only see information about their own jobs.

With TLS/SSL enabled, you provide the location of the keystore and its password, similar to the security configuration for other components.
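The dependency described above, where enabling TLS/SSL requires a keystore location and password, can be expressed as a small validation step. This is an illustrative sketch only: the function name is hypothetical, the settings are assumed to be available as a plain dictionary keyed by the names listed above, and it does not reflect how Cloudera Manager validates its own configuration.

```python
REQUIRED_WHEN_TLS_ENABLED = (
    "spark.ssl.historyServer.keyStore",
    "spark.ssl.historyServer.keyStorePassword",
)


def missing_history_server_tls_settings(settings):
    """Return the keystore settings that are missing while TLS is enabled.

    settings is assumed to map setting names to string values.
    An empty list means nothing is missing (or TLS is not enabled).
    """
    enabled = str(settings.get("spark.ssl.historyServer.enabled", "false"))
    if enabled.strip().lower() != "true":
        return []
    return [key for key in REQUIRED_WHEN_TLS_ENABLED if not settings.get(key)]
```

With `spark.ssl.historyServer.enabled` set to `true` and only the keystore path supplied, the function reports `spark.ssl.historyServer.keyStorePassword` as missing.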

Navigator lineage. The former Spark lineage extractor that was enabled through a safety valve is superseded by a more robust lineage collection mechanism. See Apache Spark Known Issues for some limitations and restrictions with this feature.

Support for Azure Data Lake Store (ADLS) as a secondary filesystem. You can use Spark jobs to read and write data stored on ADLS. Hive-on-Spark and Spark with Kudu are not currently supported for ADLS data.

 

Cloudera Search

  • Supported TLS ciphers for Apache Tomcat are configurable using the SOLR_CIPHERS_CONFIG environment variable.

 

ZooKeeper

Server-Server Mutual Authentication

All ZooKeeper servers in an ensemble can now be configured to support quorum peer (server-server) mutual authentication, mitigating risk of spoofing by a rogue server on an unsecured network. The feature leverages Kerberos authentication through the SASL framework, so Kerberos is required.

You can enable this feature using the Cloudera Manager Admin Console. See Enabling Server-Server Mutual Authentication in the ZooKeeper Authentication page of Cloudera Security for details.

