Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.

 

Note: Clusters with mixed operating system types and versions are supported; however, using the same version of the same operating system on all cluster hosts is strongly recommended.

CDH 5 provides 64-bit packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as listed below.

 

Operating System Version Packages
Red Hat Enterprise Linux (RHEL)-compatible
RHEL (+ SELinux mode in available versions) 5.7 64-bit
  5.10 64-bit
  6.4 64-bit
  6.5 64-bit
  6.6 64-bit
  6.7 64-bit
  7.1 64-bit
  7.2 64-bit
CentOS (+ SELinux mode in available versions) 5.7 64-bit
  5.10 64-bit
  6.4 64-bit
  6.5 64-bit
  6.6 64-bit
  6.7 64-bit
  7.1 64-bit
  7.2 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel 5.7 (UEK R2) 64-bit
  5.10 64-bit
  5.11 64-bit
  6.4 (UEK R2) 64-bit
  6.5 (UEK R2, UEK R3) 64-bit
  6.6 (UEK R3) 64-bit
  6.7 (UEK R3) 64-bit
  7.1 64-bit
  7.2 64-bit
SLES
SUSE Linux Enterprise Server (SLES) 11 with Service Pack 2 64-bit
SUSE Linux Enterprise Server (SLES) 11 with Service Pack 3 64-bit
SUSE Linux Enterprise Server (SLES) 11 with Service Pack 4 64-bit
Ubuntu/Debian
Ubuntu Precise 12.04 - Long-Term Support (LTS) 64-bit
  Trusty 14.04 - Long-Term Support (LTS) 64-bit
Debian Wheezy 7.0, 7.1, and 7.8 64-bit
 
Important: Cloudera supports RHEL 7 with the following limitations:
 
Note:
  • Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS provider.
  • CDH 5.7 DataNode hosts with EMC® DSSD™ D5™ are supported by RHEL 6.6, 7.1, and 7.2. CDH 5.6 DataNode hosts with EMC® DSSD™ D5™ are only supported by RHEL 6.6.
Component | MariaDB | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 5)
Oozie | 5.5 | 5.1, 5.5, 5.6, 5.7 | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Flume | - | - | - | - | - | Default (for the JDBC Channel only)
Hue | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 6) | Default | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Hive/Impala | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 1) | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Sentry | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 1) | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Sqoop 1 | 5.5 | See Note 4 | - | See Note 4 | See Note 4 | -
Sqoop 2 | 5.5 | - | - | - | - | Default
 
  Note:
  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
  2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
  4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
  6. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
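
The version pairing in Note 3 can be expressed as a small lookup. The following is a minimal sketch, not a Cloudera tool: the names `POSTGRES_MIN_CDH` and `postgres_supported` are hypothetical, and the mapping covers only the three PostgreSQL versions that Note 3 mentions.

```python
# Minimum CDH release (major, minor) for each PostgreSQL version named in Note 3.
# Versions not named in Note 3 are deliberately left out of this sketch.
POSTGRES_MIN_CDH = {"9.2": (5, 1), "9.3": (5, 2), "9.4": (5, 5)}


def postgres_supported(pg_version, cdh_version):
    """Return True if cdh_version (e.g. '5.7.2') meets the Note 3 minimum
    for pg_version (e.g. '9.4')."""
    required = POSTGRES_MIN_CDH.get(pg_version)
    if required is None:
        raise ValueError(f"PostgreSQL {pg_version} is not covered by Note 3")
    # Compare only the major and minor parts of the CDH version.
    major_minor = tuple(int(part) for part in cdh_version.split(".")[:2])
    return major_minor >= required
```

For example, `postgres_supported("9.4", "5.4.0")` is false because Note 3 lists PostgreSQL 9.4 as supported on CDH 5.5 and higher.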
  Important: JDK 1.6 is not supported on any CDH 5 release (even though the libraries of CDH 5.0-CDH 5.4 are compatible). Applications using CDH libraries must run a supported version of JDK 1.7 or higher, and one that also matches the JDK version of your CDH cluster.
 
CDH 5.7.x is supported with the versions shown in the following table:
Minimum Supported Version Recommended Version Exceptions
1.7.0_55 1.7.0_67, 1.7.0_75, 1.7.0_80 None
1.8.0_31 1.8.0_60 Cloudera recommends that you not use JDK 1.8.0_40.
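
A version string can be checked against the minimums in the table above with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical and are not part of CDH or Cloudera Manager, and the version string is assumed to use the `1.x.0_NN` form shown in the table.

```python
import re

# Minimum supported update level per JDK line, taken from the table above.
MINIMUM_UPDATE = {(1, 7, 0): 55, (1, 8, 0): 31}
# Versions the table advises against using.
NOT_RECOMMENDED = {((1, 8, 0), 40)}


def parse_jdk_version(text):
    """Parse a string such as '1.7.0_67' into ((1, 7, 0), 67)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {text!r}")
    major, minor, patch, update = (int(group) for group in match.groups())
    return (major, minor, patch), update


def is_supported_jdk(text):
    """Return True if the version meets the minimum listed for its JDK line."""
    line, update = parse_jdk_version(text)
    minimum = MINIMUM_UPDATE.get(line)
    return minimum is not None and update >= minimum


def is_discouraged_jdk(text):
    """Return True if the table lists this exact version as an exception."""
    return parse_jdk_version(text) in NOT_RECOMMENDED
```

Keeping the exception check separate from the minimum check mirrors the table, where 1.8.0_40 meets the minimum but is listed as not recommended.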

Hue

Hue works with the two most recent versions of the following browsers. Cookies and JavaScript must be on.

  • Chrome
  • Firefox
  • Safari (not supported on Windows)
  • Internet Explorer

Hue might work in older versions of these browsers and in other browsers, but you might not have access to all of its features.


 

CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
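
One way to confirm that a cluster host name resolves over IPv4 is sketched below with the Python standard library. The helper names are hypothetical and this is not a Cloudera-provided check; it only reports what the local resolver returns.

```python
import ipaddress
import socket


def ipv4_addresses(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to.

    Raises socket.gaierror if the name has no IPv4 address.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_ipv4(address):
    """Return True if the literal address string is IPv4 rather than IPv6."""
    return ipaddress.ip_address(address).version == 4
```

Calling `ipv4_addresses` with each host name in the cluster and treating a `socket.gaierror` as a failure gives a quick pre-installation check.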

 


The following components are supported by the indicated versions of Transport Layer Security (TLS):

 

Table 1. Components Supported by TLS

Component Role Name Port Version
Flume   Avro Source/Sink   TLS 1.2
Flume   Flume HTTP Source/Sink   TLS 1.2
HBase Master HBase Master Web UI Port 60010 TLS 1.2
HDFS NameNode Secure NameNode Web UI Port 50470 TLS 1.2
HDFS Secondary NameNode Secure Secondary NameNode Web UI Port 50495 TLS 1.2
HDFS HttpFS REST Port 14000 TLS 1.1, TLS 1.2
Hive HiveServer2 HiveServer2 Port 10000 TLS 1.2
Hue Hue Server Hue HTTP Port 8888 TLS 1.2
Cloudera Impala Impala Daemon Impala Daemon Beeswax Port 21000 TLS 1.2
Cloudera Impala Impala Daemon Impala Daemon HiveServer2 Port 21050 TLS 1.2
Cloudera Impala Impala Daemon Impala Daemon Backend Port 22000 TLS 1.2
Cloudera Impala Impala Daemon Impala Daemon HTTP Server Port 25000 TLS 1.2
Cloudera Impala Impala StateStore StateStore Service Port 24000 TLS 1.2
Cloudera Impala Impala StateStore StateStore HTTP Server Port 25010 TLS 1.2
Cloudera Impala Impala Catalog Server Catalog Server HTTP Server Port 25020 TLS 1.2
Cloudera Impala Impala Catalog Server Catalog Server Service Port 26000 TLS 1.2
Oozie Oozie Server Oozie HTTPS Port 11443 TLS 1.1, TLS 1.2
Solr Solr Server Solr HTTP Port 8983 TLS 1.1, TLS 1.2
Solr Solr Server Solr HTTPS Port 8985 TLS 1.1, TLS 1.2
YARN ResourceManager ResourceManager Web Application HTTP Port 8090 TLS 1.2
YARN JobHistory Server MRv1 JobHistory Web Application HTTP Port 19890 TLS 1.2
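
To see which TLS version one of the ports in Table 1 actually negotiates, a client can connect and report the protocol. The following is a minimal sketch using the Python standard library; the function names are hypothetical, and the host and port you pass are your own. It assumes the server certificate is trusted by the system certificate store.

```python
import socket
import ssl


def tls12_context():
    """Build a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name,
    for example 'TLSv1.2'."""
    context = tls12_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

If the cluster uses certificates signed by a private CA, passing `cafile=` to `ssl.create_default_context` is one way to make verification succeed.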

Issues Fixed in CDH 5.7.2

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.7.2:

 

  • FLUME-1899 - Make SpoolDir work with subdirectories
  • FLUME-2910 - AsyncHBaseSink: Failure callbacks should log the exception that caused them
  • FLUME-2918 - Speed up TaildirSource on directories with many files
  • HADOOP-8934 - Shell command ls should include sort options
  • HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
  • HADOOP-11409 - FileContext.getFileContext can stack overflow if default fs misconfigured
  • HADOOP-11432 - Fix SymlinkBaseTest#testCreateLinkUsingPartQualPath2.
  • HADOOP-12787 - KMS SPNEGO sequence does not work with WebHDFS
  • HADOOP-12841 - Update s3-related properties in core-default.xml.
  • HADOOP-12901 - Add warning log when KMSClientProvider cannot create a connection to the KMS server.
  • HADOOP-12963 - Allow using path style addressing for accessing the S3 endpoint.
  • HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
  • HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
  • HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
  • HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
  • HADOOP-13255 - KMSClientProvider should check and renew TGT when doing delegation token operations
  • HDFS-8581 - ContentSummary on / skips further counts on yielding lock
  • HDFS-8829 - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
  • HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
  • HDFS-9259 - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.
  • HDFS-9365 - Balancer does not work with the HDFS-6376 HA setup.
  • HDFS-9405 - Warmup NameNode EDEK caches in background thread
  • HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
  • HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
  • HDFS-9805 - Add server-side configuration for enabling TCP_NODELAY for DataTransferProtocol and default it to true
  • HDFS-10360 - DataNode may format directory and lose blocks if current/VERSION is missing
  • HDFS-10381 - DataStreamer DataNode exclusion log message should be warning
  • HDFS-10396 - Using -diff option with DistCp may get "Comparison method violates its general contract" exception
  • HDFS-10481 - HTTPFS server should correctly impersonate as end user to open file
  • HDFS-10516 - Fix bug when warming up EDEK cache of more than one encryption zone
  • HDFS-10525 - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
  • MAPREDUCE-6558 - multibyte delimiters with compressed input files generate duplicate records
  • MAPREDUCE-6577 - MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
  • MAPREDUCE-6635 - Unsafe long to int conversion in UncompressedSplitLineReader and IndexOutOfBoundsException
  • MAPREDUCE-6701 - Application Master log unavailable when clicking JobHistory's AM logs link
  • YARN-2605 - [RM HA] Rest API endpoints doing redirect incorrectly.
  • YARN-4812 - TestFairScheduler#testContinuousScheduling fails intermittently.
  • YARN-5048 - DelegationTokenRenewer#skipTokenRenewal may throw NPE
  • HBASE-11625 - Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum
  • HBASE-13532 - Make UnknownScannerException less scary by giving more information in the exception string.
  • HBASE-14644 - Region in transition metric is broken
  • HBASE-14818 - user_permission does not list namespace permissions
  • HBASE-15236 - Inconsistent cell reads over multiple bulk-loaded HFiles
  • HBASE-15439 - getMaximumAllowedTimeBetweenRuns in ScheduledChore ignores the TimeUnit
  • HBASE-15465 - userPermission returned by getUserPermission() for the selected namespace does not have namespace set
  • HBASE-15496 - Throw RowTooBigException only for user scan/get
  • HBASE-15698 - Increment TimeRange not serialized to server
  • HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
  • HBASE-15791 - Improve javadoc around ScheduledChore
  • HBASE-15811 - Batch Get after batch Put does not fetch all cells
  • HBASE-15872 - Split TestWALProcedureStore
  • HBASE-15873 - ACL for snapshot restore / clone is not enforced
  • HBASE-15925 - provide default values for hadoop compat module related properties that match default hadoop profile.
  • HBASE-16034 - Fix ProcedureTestingUtility#LoadCounter.setMaxProcId()
  • HBASE-16056 - Procedure v2 - fix master crash for FileNotFound
  • HBASE-16093 - Fix splits failed before creating daughter regions leave meta inconsistent
  • HIVE-7443 - Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
  • HIVE-9486 - Use session classloader instead of application loader
  • HIVE-9499 - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
  • HIVE-10685 - Alter table concatenate operator will cause duplicate data
  • HIVE-10925 - Non-static threadlocals in metastore code can potentially cause memory leak
  • HIVE-11031 - ORC concatenation of old files can fail while merging column statistics
  • HIVE-11243 - Changing log level in Utilities.getBaseWork
  • HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
  • HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
  • HIVE-12742 - NULL table comparison within CASE does not work as previous hive versions
  • HIVE-12958 - Make embedded Jetty server more configurable
  • HIVE-13285 - ORC concatenation may drop old files from moving to final path
  • HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
  • HIVE-13590 - Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
  • HIVE-13736 - View's input/output formats are TEXT by default.
  • HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
  • HIVE-13953 - Issues in HiveLockObject equals method
  • HIVE-13991 - Union All on view fail with no valid permission on underneath table
  • HIVE-14006 - Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.
  • HIVE-14015 - SMB MapJoin failed for Hive on Spark when kerberized
  • HIVE-14098 - Logging task properties and environment variables might contain passwords
  • HIVE-14118 - Make the alter partition exception more meaningful
  • HUE-2678 - [jobbrowser] Read Spark job data from Spark History Server API
  • HUE-3197 - [oozie] Decision node support in external Workflow graph
  • HUE-3520 - [jb] Use impersonation to access JHS if security is enabled
  • HUE-3521 - [core] Provide a force_username_uppercase option
  • HUE-3526 - [useradmin] Fix LDAP tests for force_username_uppercase
  • HUE-3688 - [oozie] Fix TestEditor.test_workflow_dependencies unit test
  • HUE-3700 - [core] Support force_username_lowercase and ignore_username_case for all Auth backends
  • HUE-3802 - [oozie] Fix HS2 action on SSL enabled cluster
  • HUE-3805 - [oozie] Add support for oozie schema 0.4 in dashboard graph for external workflows
  • HUE-3808 - [core] Do not trigger any call on initialization
  • HUE-3808 - [core] Offer to live turn on/off debug level
  • HUE-3821 - [pig] Logs are never returned on running script
  • HUE-3822 - [pig] Display logs when found
  • HUE-3861 - [core] Upgrade Django Axes to 1.5
  • HUE-3866 - [core] Hue CPU reaches ~100% usage while uploading files with SSL to HTTPFS/WebHDFS
  • HUE-3908 - [useradmin] Ignore (objectclass=*) filter when searching for LDAP users
  • HUE-3923 - [core] Simplify force debug logic option
  • HUE-4005 - [oozie] Remove oozie.coord.application.path from properties when rerunning workflow
  • HUE-4006 - [oozie] Create new deployment directory when coordinator or bundle is copied
  • HUE-4007 - [oozie] Fix deployement_dir for the bundle in oozie example fixtures
  • HUE-4021 - [libsolr] Allow customization of the Solr path in ZooKeeper
  • HUE-4023 - [useradmin] update AuthenticationForm to allow activated users to login
  • HUE-4061 - [jb] Job attempt logs not appearing for running jobs
  • HUE-4087 - [jobbrowser] Unable to kill jobs with Resource Manager HA enabled
  • HUE-4092 - [security] Can't type any / in the HDFS ACLs path input
  • HUE-4113 - [Pig] Hue breaks when user has only access to pig app
  • HUE-4134 - [liboozie] Avoid logging truststore credentials
  • HUE-4202 - [jb] Enable offset param for fetching jobbrowser logs
  • HUE-4215 - [yarn] Reset API_CACHE on logout
  • HUE-4227 - [yarn] Fix unittest for MR API Cache
  • HUE-4238 - [doc2] Ignore history docs in find_jobs_with_no_doc during sync documents
  • HUE-4252 - [core] Handle 307 redirect from YARN upon standby failover
  • HUE-4258 - [jb] Close and pool Spark History Server connections
  • IMPALA-1928 - Fix Thrift client transport wrapping order
  • IMPALA-2660 - Respect auth_to_local configs from hdfs configs
  • IMPALA-3276 - Consistently handle pin failure in BTS::PrepareForRead()
  • IMPALA-3369 - Add ALTER TABLE SET COLUMN STATS statement.
  • IMPALA-3441 - Impala should not crash for invalid avro serialized data
  • IMPALA-3499 - Split catalog update
  • IMPALA-3502 - Fix race in the coordinator while updating filter routing table
  • IMPALA-3633 - Cancel fragment if coordinator is gone
  • IMPALA-3732 - Handle string length overflow in Avro files
  • IMPALA-3745 - Corrupt encoded values in parquet files can cause crashes
  • IMPALA-3751 - Fix clang build errors and warnings
  • IMPALA-3754 - Fix TestParquet.test_corrupt_rle_counts flakiness
  • OOZIE-2314 - Unable to kill old instance child job by workflow or coord rerun by Launcher
  • OOZIE-2329 - Make handling yarn restarts configurable
  • OOZIE-2330 - Spark action should take the global jobTracker and nameNode configs by default and allow file and archive elements
  • OOZIE-2345 - Parallel job submission for forked actions
  • OOZIE-2436 - Fork/join workflow fails with oozie.action.yarn.tag must not be null
  • OOZIE-2481 - Add YARN_CONF_DIR in the Shell action
  • OOZIE-2504 - Create a log4j.properties under HADOOP_CONF_DIR in Shell Action
  • OOZIE-2511 - SubWorkflow missing variable set from option if config-default is present in parent workflow
  • OOZIE-2533 - Oozie Web UI gives Error 500 with Java 8u91
  • SENTRY-1175 - Improve usability of URI privileges when granting URIs
  • SENTRY-1201 - Sentry ignores database prefix for MSCK statement
  • SENTRY-1252 - grantServerPrivilege and revokeServerPrivilege should treat "*" and "ALL" as synonyms when action is not explicitly specified
  • SENTRY-1265 - Sentry service should not require a TGT as it is not talking to other kerberos services as a client
  • SENTRY-1292 - Reorder DBModelAction EnumSet
  • SENTRY-1293 - Avoid converting string permission to Privilege object
  • SENTRY-1311 - Improve usability of URI privileges by supporting mixed use of URIs with and without scheme
  • SENTRY-1320 - truncate table db_name.table_name fails
  • SOLR-7178 - OverseerAutoReplicaFailoverThread compares Integer objects using ==
  • SOLR-8451 - We should not call method.abort in HttpSolrClient and HttpSolrCall#remoteQuery should not close streams
  • SOLR-8497 - Merge indexes should mark its directories as done rather than keep them around in the directory cache.
  • SOLR-8691 - Cache index fingerprints per searcher
  • SOLR-9053 - Upgrade commons-fileupload to 1.3.1, fixing a potential vulnerability
  • SPARK-13278 - [CORE] Launcher fails to start with JDK 9 EA
  • SPARK-14391 - [LAUNCHER] Fix launcher communication test
  • SPARK-15067 - [YARN] YARN executors are launched with fixed perm gen size
  • SPARK-15165 - [SPARK-15205] [SQL] Introduce place holder for comments in generated code
  • SQOOP-2846 - Sqoop Export with update-key failing for avro data file
  • SQOOP-2864 - ClassWriter chokes on column names containing double quotes
  • SQOOP-2920 - Sqoop performance deteriorates significantly on wide datasets; sqoop 100% on CPU

 

 
