Long-Term Component Architecture
As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architectures on them with confidence.
With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support, you do not need to upgrade if you are already using the latest 5.5.x release.
- System Requirements
- What's New
- Supported Operating Systems
- Supported Databases
- Supported JDK Versions
- Supported Browsers
- Supported Internet Protocol
- Supported Transport Layer Security Versions
Supported Databases
| Component | MariaDB | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 5) |
|---|---|---|---|---|---|---|
| Cloudera Manager | 5.5, 10 | 5.6, 5.5, 5.1 | – | 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 | 12c, 11gR2 | – |
| Oozie | 5.5, 10 | 5.6, 5.5, 5.1 | – | 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3) | 12c, 11gR2 | Default |
| Flume | – | – | – | – | – | Default (for the JDBC Channel only) |
| Hue | 5.5, 10 | 5.6, 5.5, 5.1 (see Note 6) | Default | 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3) | 12c, 11gR2 | – |
| Hive/Impala | 5.5, 10 | 5.6, 5.5, 5.1 (see Note 1) | – | 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3) | 12c, 11gR2 | Default |
| Sentry | 5.5, 10 | 5.6, 5.5, 5.1 (see Note 1) | – | 9.4, 9.3, 9.2, 9.1, 8.4, 8.3, 8.1 (see Note 3) | 12c, 11gR2 | – |
| Sqoop 1 | 5.5, 10 | See Note 4 | – | See Note 4 | See Note 4 | – |
| Sqoop 2 | 5.5, 10 | See Note 9 | – | – | – | Default |
Cloudera supports the databases listed above, provided they are supported by the underlying operating system on which they run.

Notes:
1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server (a verification sketch follows these notes).
2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x version).
5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation guide for recommendations.
6. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
7. When installing a JDBC driver, only the ojdbc6.jar file is supported for both Oracle 11g R2 and Oracle 12c; the ojdbc7.jar file is not supported.
8. MariaDB 10 is supported only on CDH 5.9 and higher.
9. Sqoop 2 lacks some of the features of Sqoop 1. Cloudera recommends that you use Sqoop 1; use Sqoop 2 only if it contains all the features required for your use case.
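Notes 1 and 2 describe preconditions that can be checked before pointing Cloudera Manager or CDH services at a MySQL database. The following is a minimal sketch, not Cloudera tooling: it assumes MySQL Connector/J is on the classpath, and the JDBC URL, user, and password are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckMysqlPrereqs {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; substitute your database host and credentials.
        String url = "jdbc:mysql://dbhost.example.com:3306/";
        try (Connection conn = DriverManager.getConnection(url, "root", "secret");
             Statement stmt = conn.createStatement()) {

            // Note 1: the InnoDB storage engine must be enabled.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT SUPPORT FROM INFORMATION_SCHEMA.ENGINES WHERE ENGINE = 'InnoDB'")) {
                System.out.println("InnoDB support: "
                        + (rs.next() ? rs.getString(1) : "absent")); // expect YES or DEFAULT
            }

            // Note 2: GTID-based replication must not be enabled.
            // (The variable exists only on MySQL 5.6 and higher.)
            try (ResultSet rs = stmt.executeQuery("SHOW VARIABLES LIKE 'gtid_mode'")) {
                System.out.println("gtid_mode: "
                        + (rs.next() ? rs.getString(2) : "not present")); // expect OFF
            }
        }
    }
}
```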
Supported JDK Versions
Once added, a supported minor JDK release remains supported for the remainder of the Cloudera major release lifecycle, unless specifically excluded.
Warning: JDK 1.8u40 and JDK 1.8u60 are excluded from support. Also, the Oozie Web Console returns a 500 error when the Oozie server runs on JDK 8u75 or higher.
Running CDH nodes within the same cluster on different JDK releases is not supported; the JDK release must match across the cluster, down to the patch level:
- All nodes in your cluster must run the same Oracle JDK version.
- All services must be deployed on the same Oracle JDK version.
The Cloudera Manager repository is packaged with an Oracle JDK (Oracle JDK 1.7.0_67, for example), which can be installed automatically during a new installation or an upgrade.
For a full list of supported JDK versions, see CDH and Cloudera Manager Supported JDK Versions.
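Because the JDK release must match across the cluster down to the patch level, one low-tech way to audit it is to run the same version probe on every node (for example, over ssh) and compare the output. A minimal sketch; this is not Cloudera tooling:

```java
public class PrintJdkVersion {
    public static void main(String[] args) {
        // Prints vendor and exact release, e.g. "Oracle Corporation 1.7.0_67",
        // so output from all nodes can be diffed for patch-level mismatches.
        System.out.println(System.getProperty("java.vendor") + " "
                + System.getProperty("java.version"));
    }
}
```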
Supported Browsers
- Safari (not supported on Windows)
- Internet Explorer
Hue may display in older versions of these browsers, and even in other browsers, but you might not have access to all of its features.
Supported Internet Protocol
CDH requires IPv4. IPv6 is not supported.
See also Configuring Network Names.
Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.
Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.
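Because CDH requires IPv4, it is worth confirming that your cluster host names resolve to IPv4 addresses. A minimal sketch, with a hypothetical host name:

```java
import java.net.Inet4Address;
import java.net.InetAddress;

public class CheckIpv4 {
    public static void main(String[] args) throws Exception {
        // Substitute one of your cluster host names.
        for (InetAddress addr : InetAddress.getAllByName("node1.example.com")) {
            System.out.println(addr.getHostAddress()
                    + (addr instanceof Inet4Address ? " (IPv4)" : " (IPv6 - not supported by CDH)"));
        }
    }
}
```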
Supported Transport Layer Security Versions
The following components support the indicated versions of Transport Layer Security (TLS):
| Component | Role | Name | Port | TLS Version(s) |
|---|---|---|---|---|
| Cloudera Manager | Cloudera Manager Server | – | 7182 | TLS 1.2 |
| Cloudera Manager | Cloudera Manager Server | – | 7183 | TLS 1.2 |
| Flume | – | Avro Source/Sink | – | TLS 1.2 |
| Flume | – | Flume HTTP Source/Sink | – | TLS 1.2 |
| HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2 |
| HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2 |
| HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2 |
| HDFS | HttpFS | REST Port | 14000 | TLS 1.1, TLS 1.2 |
| Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2 |
| Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2 |
| Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2 |
| Spark | History Server | – | 18080 | TLS 1.2 |
| YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2 |
| YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2 |
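To illustrate what the table means in practice, the sketch below opens a client connection to the Cloudera Manager Server HTTPS port (7183 above) while restricting the client to TLS 1.2. The host name is a placeholder, the server certificate is assumed to already be in the local JVM's trust store, and this is an illustration rather than Cloudera tooling:

```java
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class TlsProbe {
    public static void main(String[] args) throws Exception {
        // On JDK 8, this property restricts client-side handshakes to TLS 1.2.
        System.setProperty("jdk.tls.client.protocols", "TLSv1.2");

        // Placeholder host; 7183 is the Cloudera Manager Server HTTPS port listed above.
        HttpsURLConnection conn = (HttpsURLConnection)
                new URL("https://cm-host.example.com:7183/").openConnection();
        conn.connect();
        System.out.println("HTTP status: " + conn.getResponseCode()
                + ", cipher suite: " + conn.getCipherSuite());
        conn.disconnect();
    }
}
```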
Issues Fixed in CDH 5.9.3
The following upstream issues are fixed in CDH 5.9.3:
- FLUME-2798 - Malformed Syslog messages can lead to OutOfMemoryException
- FLUME-3080 - Close failure in HDFS Sink might cause data loss
- FLUME-3085 - HDFS Sink can skip flushing some BucketWriters, might lead to data loss
- HADOOP-11400 - GraphiteSink does not reconnect to Graphite after 'broken pipe'
- HADOOP-11599 - Client#getTimeout should use IPC_CLIENT_PING_DEFAULT when IPC_CLIENT_PING_KEY is not configured
- HADOOP-12672 - RPC timeout should not override IPC ping interval
- HADOOP-13503 - Improve SaslRpcClient failure logging
- HADOOP-13988 - KMSClientProvider does not work with WebHDFS and Apache Knox w/ProxyUser
- HADOOP-14029 - Fix KMSClientProvider for non-secure proxyuser use case
- HDFS-10715 - NPE when applying AvailableSpaceBlockPlacementPolicy
- HDFS-11445 - FSCK shows overall health status as corrupt even if one replica is corrupt
- YARN-6360 - Prevent FS state dump logger from cramming other log files
- YARN-6453 - fairscheduler-statedump.log gets generated regardless of service
- HBASE-15837 - Memstore size accounting is wrong if postBatchMutate() throws exception
- HBASE-16630 - Fragmentation in long running Bucket Cache
- HBASE-16739 - Timed out exception message should include encoded region name
- HBASE-16977 - VerifyReplication should log a printable representation of the row keys
- HBASE-17501 - guard against NPE while reading FileTrailer and HFileBlock
- HBASE-17673 - Monitored RPC Handler not shown in the WebUI
- HBASE-17688 - MultiRowRangeFilter not working correctly if given same start and stop RowKey
- HBASE-17710 - HBase in standalone mode creates directories with 777 permission
- HBASE-17717 - Explicitly use "sasl" ACL scheme for hbase superuser
- HBASE-17731 - Fractional latency reporting in MultiThreadedAction
- HBASE-17798 - RpcServer.Listener.Reader can abort due to CancelledKeyException
- HBASE-17970 - Set yarn.app.mapreduce.am.staging-dir when starting MiniMRCluster
- HBASE-18096 - Limit HFileUtil visibility and add missing annotations
- HIVE-9481 - allow column list specification in INSERT statement
- HIVE-9567 - JSON SerDe not escaping special chars when writing char/varchar data
- HIVE-11141 - Improve RuleRegExp when the Expression node stack gets huge
- HIVE-11418 - Dropping a database in an encryption zone with CASCADE and trash enabled fails
- HIVE-11428 - Performance: Struct IN() clauses are extremely slow
- HIVE-11671 - Optimize RuleRegExp in DPP codepath
- HIVE-11842 - Improve RuleRegExp by caching some internal data structures
- HIVE-13390 - Partial backport: only httpclient 4.5.2 and httpcore 4.4.4 were backported, to fix the Apache Hive SSL vulnerability
- HIVE-14178 - Hive::needsToCopy should reuse FileUtils::equalsFileSystem
- HIVE-14380 - Queries on tables with remote HDFS paths fail in "encryption" checks.
- HIVE-14564 - Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.
- HIVE-14943 - Partial Backport of HIVE-14943 - Base Implementation (of HIVE-10924)
- HIVE-16297 - Improving hive logging configuration variables
- HIVE-16394 - HoS does not support queue name change in middle of session
- HIVE-16413 - Create table as select does not check ownership of the location
- HIVE-16459 - Forward channelInactive to RpcDispatcher
- HIVE-16593 - SparkClientFactory.stop may prevent JVM from exiting
- HIVE-16646 - Alias in transform ... as clause shouldn't be case sensitive
- HIVE-16660 - Not able to add partition for views in hive when sentry is enabled
- HIVE-16693 - beeline "source" command freezes if you have a comment in it
- HUE-4897 - [core] Import document call should provide information about the import
- HUE-5225 - [core] Prevent Oozie and Job Designer duplicate example documents from being installed
- HUE-5303 - [editor] Avoid XSS in the viewmodel options
- HUE-5349 - [search] Query definitions can include js XSS injection
- HUE-5659 - [home] Ignore history dependencies when importing document from different cluster
- HUE-5816 - Change the default setting to "allowed_hosts=*"
- HUE-5850 - [sentry] Prevent creating roles with empty names
- HUE-6075 - [oozie] Remove email body for schema 0.1
- HUE-6109 - [core] Remove the restriction on Document2 invalid chars
- HUE-6115 - [core] Fix document paths for names with unicode characters
- HUE-6131 - [hive] Select partition values based on the actual datatypes of the partition column
- HUE-6133 - [home] Stricter check for activeEntry existence
- HUE-6133 - [home] Avoid search blinking
- HUE-6133 - [home] Typing on the search box crashes IE 11
- HUE-6144 - [oozie] Add generic XSL template to workflow graph parser
- HUE-6161 - [doc2] Log failures and continue while converting documents
- HUE-6193 - [converter] Retain last_executed time when creating doc2 object
- HUE-6197 - [impala] Fix XSS Vulnerability in the old editors' error messages
- HUE-6212 - [oozie] Prevent XSS injection from packets
- HUE-6212 - [oozie] Prevent XSS injection in coordinator cron frequency field
- HUE-6228 - [core] Disable touchscreen detection on Nicescroll
- HUE-6250 - [frontend] Losing # fragment of full URL on login redirect
- HUE-6261 - [oozie] Avoid JS error preventing workflow action status update
- HUE-6262 - [core] Converter should separate history docs from saved docs
- HUE-6263 - [converter] Delete Doc2 object in case of exception
- HUE-6264 - [converter] Decrease memory usage for users with a very large number of document1 objects
- HUE-6266 - [converter] Remove unnecessary call to document link
- HUE-6295 - [doc2] Avoid unrelated DB calls in sync_documents after import
- HUE-6310 - [doc2] Create missing doc1 links for delete and copy operations
- HUE-6407 - [pig] Play button doesn't come back after killing the running pig job
- HUE-6446 - [oozie] User can't edit shared coordinator or bundle
- HUE-6604 - [oozie] Fix timestamp conversion to server timezone
- HUE-6710 - [notebook] Raise Django 403 builtin
- HUE-6710 - [notebook] Application reachable directly by users without granted access
- IMPALA-3794 - Workaround for Breakpad ID conflicts
- IMPALA-4293 - query profile should include error log
- IMPALA-4383 - Ensure plan fragment report thread is always started
- IMPALA-4409 - respect lock order in QueryExecState::CancelInternal()
- IMPALA-4615 - Fix create_table.sql command order
- IMPALA-4787 - Optimize APPX_MEDIAN() memory usage
- IMPALA-5088 - Fix heap buffer overflow
- IMPALA-5193 - Initialize decompressor before finding first tuple
- IMPALA-5197 - Erroneous corrupted Parquet file message
- IMPALA-5252 - Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded
- IMPALA-5253 - Use appropriate transport for StatestoreSubscriber
- IMPALA-5355 - Fix the order of Sentry roles and privileges
- IMPALA-5469 - Fix exception when processing catalog update
- OOZIE-2739 - Remove property expansion pattern from ShellMain's log4j properties content
- OOZIE-2816 - Strip out the first command word from Sqoop action if it's "sqoop"
- OOZIE-2818 - Can't overwrite oozie.action.max.output.data on a per-workflow basis
- OOZIE-2844 - Increase stability of Oozie actions when log4j.properties is missing or not readable
- PARQUET-389 - Support predicate push down on missing columns.
- SENTRY-1422 - JDO deadlocks when processing a grant while a background thread processes notification logs
- SENTRY-1476 - SentryStore is subject to JDQL injection
- SENTRY-1505 - CommitContext isn't used by anything and should be removed
- SENTRY-1515 - Cleanup exception handling in SentryStore
- SENTRY-1517 - SentryStore should actually use function getMSentryRole to get roles
- SENTRY-1557 - getRolesForGroups(), getRoleNamesForGroups() make too many trips to the DB
- SENTRY-1594 - TransactionBlock should become generic
- SENTRY-1609 - DelegateSentryStore is subject to JDQL injection
- SENTRY-1615 - SentryStore should not allocate empty objects that are immediately returned
- SENTRY-1625 - PrivilegeOperatePersistence can use QueryParamBuilder
- SENTRY-1714 - MetastorePlugin.java should quietly return from renameAuthzObject() when both paths are null
- SENTRY-1759 - UpdatableCache leaks connections
- SOLR-8836 - Return 400 and a SolrException, rather than 500, when invalid JSON is provided to the update handler
- SOLR-9153 - Update Apache commons beanutils version to 1.9.2
- SOLR-9527 - Improve distribution of replicas when restoring a collection
- SOLR-9848 - Lower solr.cloud.wait-for-updates-with-stale-state-pause back down from 7 seconds.
- SOLR-10076 - Hiding keystore and truststore passwords from /admin/info/* outputs
- SOLR-10430 - Add ls command to ZkCLI for listing sub-dirs
- SPARK-14930 - Race condition in CheckpointWriter.stop()
- SPARK-16533 - Spark application not handling preemption messages
- SPARK-16845 - org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering grows beyond 64 KB
- SPARK-16873 - force spill NPE
- SPARK-17316 - Don't block StandaloneSchedulerBackend.executorRemoved
- SPARK-17485 - Failed remote cached block reads can lead to whole job failure
- SPARK-19019 - PySpark does not work with Python 3.6.0
- SPARK-19263 - DAGScheduler should avoid sending conflicting task set.
- SPARK-19537 - Move pendingPartitions to ShuffleMapStage.
- SPARK-19688 - Spark on Yarn Credentials File set to different application directory
- SPARK-20922 - Unsafe deserialization in Spark LauncherConnection
- SQOOP-3123 - Import from Oracle using OraOop with map-column-java to Avro fails if special characters are encountered in the table or column name
- ZOOKEEPER-2040 - Server to log underlying cause of SASL connection problems