Long-term component architecture
As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.
With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.
- System Requirements
- What's New
- Supported Operating Systems
- Supported Databases
- Supported JDK Versions
- Supported Browsers
- Supported Internet Protocol
- Supported Transport Layer Security Versions
Supported Operating Systems
Supported Databases
Please see Cloudera Manager Supported Databases for a full list of supported databases for each version of Cloudera Manager.
Cloudera Manager and CDH come packaged with an embedded PostgreSQL database, but Cloudera recommends configuring your cluster with custom external databases, especially in production.
In most cases (but not all), Cloudera supports versions of MariaDB, MySQL and PostgreSQL that are native to each supported Linux distribution.
After installing a database, upgrade to the latest patch and apply appropriate updates. Available updates may be specific to the operating system on which the database is installed.
- Use UTF8 encoding for all custom databases.
- Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
- Hue requires the default MySQL/MariaDB version (if used) of the operating system on which it is installed. See Hue Databases.
- Both the Community and Enterprise versions of MySQL are supported, as well as MySQL configured by the AWS RDS service.
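The first two requirements in the list above can be checked before installation. The following is a minimal sketch, not a Cloudera tool: it assumes you have already collected the MySQL server variables (for example from `SHOW VARIABLES`) into a dictionary. The function name `check_mysql_settings` is hypothetical, and treating any `utf8*` character set as acceptable and any `gtid_mode` beginning with `ON` as "enabled" are assumptions made for this example.

```python
def check_mysql_settings(variables):
    """Return a list of problems found in a dict of MySQL server variables.

    `variables` maps variable names (as reported by SHOW VARIABLES) to values.
    """
    problems = []

    # Assumption: any character set whose name starts with "utf8" counts as UTF8.
    charset = variables.get("character_set_server", "").lower()
    if not charset.startswith("utf8"):
        problems.append(
            f"character_set_server is {charset or 'unset'}; UTF8 is required"
        )

    # Assumption: a gtid_mode value beginning with "ON" means GTID-based
    # replication is enabled; a missing value is treated as OFF.
    gtid_mode = variables.get("gtid_mode", "OFF").upper()
    if gtid_mode.startswith("ON"):
        problems.append("gtid_mode is enabled; Cloudera Manager installation fails")

    return problems


if __name__ == "__main__":
    sample = {"character_set_server": "latin1", "gtid_mode": "ON"}
    for problem in check_mysql_settings(sample):
        print(problem)
```

An empty list means neither of the two conditions was detected; it does not confirm that the database version itself is supported.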
Important: When you restart processes, the configuration for each of the services is redeployed using information saved in the Cloudera Manager database. If this information is not available, your cluster does not start or function correctly. You must schedule and maintain regular backups of the Cloudera Manager database to recover the cluster in the event of the loss of this database.
Supported JDK Versions
A supported minor JDK release will remain supported throughout a Cloudera major release lifecycle, from the time of its addition forward, unless specifically excluded.
Warning: JDK 1.8u40 and JDK 1.8u60 are excluded from support. Also, the Oozie Web Console returns a 500 error when the Oozie server runs on JDK 8u75 or higher.
Running CDH nodes within the same cluster on different JDK releases is not supported. The JDK release must match across the cluster, down to the patch level.
- All nodes in your cluster must run the same Oracle JDK version.
- All services must be deployed on the same Oracle JDK version.
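The JDK rules above (one release across the cluster, matching to the patch level, with 1.8u40 and 1.8u60 excluded) can be expressed as a small check. This is an illustrative sketch only: it assumes you have gathered each host's version string in the `1.7.0_67` form yourself, and the names `parse_jdk_version` and `check_cluster_jdks` are hypothetical.

```python
import re

# Updates the text above names as excluded from support: JDK 1.8u40 and 1.8u60.
EXCLUDED_UPDATES = {(8, 40), (8, 60)}


def parse_jdk_version(version):
    """Parse a string such as '1.7.0_67' into (major, minor, update) integers."""
    match = re.fullmatch(r"1\.(\d+)\.(\d+)_(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def check_cluster_jdks(host_versions):
    """Return a list of problems for a mapping of host name -> JDK version string."""
    problems = []
    parsed = {host: parse_jdk_version(v) for host, v in host_versions.items()}

    # Every host must report exactly the same release, including the update number.
    if len(set(parsed.values())) > 1:
        problems.append("hosts do not all run the same JDK release")

    for host in sorted(parsed):
        major, _minor, update = parsed[host]
        if (major, update) in EXCLUDED_UPDATES:
            problems.append(f"{host}: JDK 1.{major}u{update} is excluded from support")

    return problems


if __name__ == "__main__":
    print(check_cluster_jdks({"node1": "1.7.0_67", "node2": "1.8.0_40"}))
```

The sketch does not cover the Oozie Web Console issue on JDK 8u75 or higher, which depends on where the Oozie server runs rather than on cluster-wide consistency.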
The Cloudera Manager repository is packaged with Oracle JDK 1.7.0_67 (for example) and can be automatically installed during a new installation or an upgrade.
For a full list of supported JDK versions, see CDH and Cloudera Manager Supported JDK Versions.
Supported Browsers
- Chrome: Version history
- Firefox: Version history
- Internet Explorer: Version history
- Safari (Mac only): Version history
Hue can display in older, and other, browsers, but you might not have access to all of its features.
Important: To see all icons in the Hue Web UI, users with IE and HTTPS must add a Load Balancer.
Supported Internet Protocol
CDH requires IPv4. IPv6 is not supported.
See also Configuring Network Names.
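As a quick illustration of the IPv4 requirement, the sketch below flags any host whose configured address is not IPv4. It assumes you supply the host-to-address mapping yourself; the function name `non_ipv4_hosts` is hypothetical and is not part of CDH or Cloudera Manager.

```python
import ipaddress


def non_ipv4_hosts(host_addresses):
    """Return the sorted host names whose address is not an IPv4 address.

    `host_addresses` maps host names to address strings. An address that is
    not a valid IP address at all raises ValueError from the ipaddress module.
    """
    return sorted(
        host
        for host, address in host_addresses.items()
        if ipaddress.ip_address(address).version != 4
    )


if __name__ == "__main__":
    print(non_ipv4_hosts({"node1": "10.0.0.5", "node2": "fe80::1"}))
```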
Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.
Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.
Supported Transport Layer Security Versions
The following components support the indicated versions of Transport Layer Security (TLS):
Components Supported by TLS
|Component|Role|Name|Port|Version|
|---|---|---|---|---|
|Cloudera Manager|Cloudera Manager Server||7182|TLS 1.2|
|Cloudera Manager|Cloudera Manager Server||7183|TLS 1.2|
|Flume|Avro Source/Sink|||TLS 1.2|
|Flume|Flume HTTP Source/Sink|||TLS 1.2|
|HBase|Master|HBase Master Web UI Port|60010|TLS 1.2|
|HDFS|NameNode|Secure NameNode Web UI Port|50470|TLS 1.2|
|HDFS|Secondary NameNode|Secure Secondary NameNode Web UI Port|50495|TLS 1.2|
|HDFS|HttpFS|REST Port|14000|TLS 1.1, TLS 1.2|
|Hive|HiveServer2|HiveServer2 Port|10000|TLS 1.2|
|Hue|Hue Server|Hue HTTP Port|8888|TLS 1.2|
|Impala|Impala Daemon|Impala Daemon Beeswax Port|21000|TLS 1.2|
|Impala|Impala Daemon|Impala Daemon HiveServer2 Port|21050|TLS 1.2|
|Impala|Impala Daemon|Impala Daemon Backend Port|22000|TLS 1.2|
|Impala|Impala StateStore|StateStore Service Port|24000|TLS 1.2|
|Impala|Impala Daemon|Impala Daemon HTTP Server Port|25000|TLS 1.2|
|Impala|Impala StateStore|StateStore HTTP Server Port|25010|TLS 1.2|
|Impala|Impala Catalog Server|Catalog Server HTTP Server Port|25020|TLS 1.2|
|Impala|Impala Catalog Server|Catalog Server Service Port|26000|TLS 1.2|
|Oozie|Oozie Server|Oozie HTTPS Port|11443|TLS 1.1, TLS 1.2|
|Solr|Solr Server|Solr HTTP Port|8983|TLS 1.1, TLS 1.2|
|Solr|Solr Server|Solr HTTPS Port|8985|TLS 1.1, TLS 1.2|
|Spark|History Server||18080|TLS 1.2|
|YARN|ResourceManager|ResourceManager Web Application HTTP Port|8090|TLS 1.2|
|YARN|JobHistory Server|MRv1 JobHistory Web Application HTTP Port|19890|TLS 1.2|
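To confirm which TLS version one of the ports in the table actually negotiates, you can probe it with Python's standard `ssl` module. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the host name in the usage example is a placeholder, and the default context verifies the server certificate, so a cluster that uses a private CA would first need `context.load_verify_locations(...)` with that CA file.

```python
import socket
import ssl


def tls12_client_context():
    """Build a client context that refuses anything older than TLS 1.2."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, context=None, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name.

    The return value is a string such as 'TLSv1.2'. The handshake raises
    ssl.SSLError if the server cannot meet the context's minimum version
    or if certificate verification fails.
    """
    context = context or tls12_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_socket:
        with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
            return tls_socket.version()


if __name__ == "__main__":
    # Placeholder host; 7183 is one of the Cloudera Manager Server ports listed above.
    print(negotiated_tls_version("cm-server.example.com", 7183))
```

For ports that the table lists with both TLS 1.1 and TLS 1.2, lowering `minimum_version` to `ssl.TLSVersion.TLSv1_1` would let you test the older protocol, provided the local OpenSSL build still permits it.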
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.11.1:
- FLUME-3080 - Close failure in HDFS Sink might cause data loss
- FLUME-3085 - HDFS Sink can skip flushing some BucketWriters, might lead to data loss
- HADOOP-11400 - GraphiteSink does not reconnect to Graphite after 'broken pipe'
- HADOOP-11599 - Client#getTimeout should use IPC_CLIENT_PING_DEFAULT when IPC_CLIENT_PING_KEY is not configured
- HADOOP-12672 - RPC timeout should not override IPC ping interval
- HADOOP-13503 - Improve SaslRpcClient failure logging
- HADOOP-13926 - S3Guard: S3AFileSystem::listLocatedStatus() to employ MetadataStore
- HADOOP-14019 - Fix some typos in the s3a docs
- HADOOP-14028 - S3A BlockOutputStreams doesn't delete temporary files in multipart uploads or handle part upload failures
- HADOOP-14051 - S3Guard: link docs from index, fix typos
- HADOOP-14059 - typo in s3a rename(self, subdir) error message
- HADOOP-14092 - Typo in hadoop-aws index.md
- HADOOP-14104 - Revert "Client should always ask namenode for kms provider path"
- HADOOP-14104 - Client should always ask namenode for kms provider path
- HADOOP-14144 - s3guard: CLI diff non-empty after import on new table
- HADOOP-14172 - S3Guard: import does not import empty directory
- HADOOP-14195 - CredentialProviderFactory$getProviders is not thread-safe
- HADOOP-14204 - S3A multipart commit failing, "UnsupportedOperationException at java.util.Collections$UnmodifiableList.sort".
- HADOOP-14215 - DynamoDB client should waitForActive on existing tables
- HADOOP-14236 - S3Guard: S3AFileSystem::rename() should move non-listed sub-directory entries
- HADOOP-14255 - Revert "S3A to delete unnecessary fake directory objects in mkdirs()"
- HADOOP-14255 - S3A to delete unnecessary fake directory objects in mkdirs()
- HADOOP-14256 - [S3A DOC] Correct the format for "Seoul" example
- HADOOP-14268 - Fix markdown itemization in hadoop-aws documents
- HADOOP-14282 - S3Guard: DynamoDBMetadata::prune() should self interrupt correctly
- HADOOP-14417 - Update default SSL cipher list for KMS
- HDFS-10715 - NPE when applying AvailableSpaceBlockPlacementPolicy
- HDFS-10797 - Revert "Disk usage summary of snapshots causes renamed blocks to get counted twice"
- HDFS-11499 - Decommissioning stuck because of failing recovery
- HDFS-11515 - Revert "-du throws ConcurrentModificationException"
- HDFS-11515 - -du throws ConcurrentModificationException
- HDFS-11689 - Revert "New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code"
- HDFS-11689 - New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
- HDFS-11816 - Update default SSL cipher list for HttpFS
- YARN-3251 - Fixed a deadlock in CapacityScheduler when computing absoluteMaxAvailableCapacity in LeafQueue
- YARN-6042 - Dump scheduler and queue state information into FairScheduler DEBUG log.
- YARN-6042 - Revert "Dump scheduler and queue state information into FairScheduler DEBUG log. (Yufei Gu via rchiang)"
- YARN-6042 - Dump scheduler and queue state information into FairScheduler DEBUG log.
- YARN-6360 - Prevent FS state dump logger from cramming other log files
- YARN-6432 - FairScheduler: Reserve preempted resources for corresponding applications.
- YARN-6433 - Only accessible cgroup mount directories should be selected for a controller.
- YARN-6453 - fairscheduler-statedump.log gets generated regardless of service
- YARN-6500 - Do not mount inaccessible cgroups directories in CgroupsLCEResourcesHandler.
- HBASE-15125 - Backport HBaseFsck's adoptHdfsOrphan function creates region with wrong end key boundary
- HBASE-15941 - HBCK repair should not unsplit healthy splitted region
- HBASE-15955 - Disable action in CatalogJanitor#setEnabled should wait for active cleanup scan to finish
- HBASE-16032 - Possible memory leak in StoreScanner
- HBASE-16238 - It's useless to catch SESSIONEXPIRED exception and retry in RecoverableZooKeeper
- HBASE-16350 - Undo server abort from HBASE-14968
- HBASE-16429 - FSHLog: deadlock if rollWriter called when ring buffer filled with appends
- HBASE-16663 - JMX ConnectorServer stopped when unauthorized user try to stop HM/RS/cluster
- HBASE-16721 - Concurrency issue in WAL unflushed seqId tracking
- HBASE-17265 - Region left unassigned in master failover when region failed to open
- HBASE-17328 - Properly dispose of looped replication peers
- HBASE-17460 - enable_table_replication can not perform cyclic replication of a table
- HBASE-17717 - Explicitly use "sasl" ACL scheme for hbase superuser
- HBASE-17779 - disable_table_replication returns misleading message and does not turn off replication
- HBASE-17792 - Use a shared thread pool for AtomicityWriter, AtomicGetReader, AtomicScanReader's connections in TestAcidGuarantees
- HIVE-11141 - Improve RuleRegExp when the Expression node stack gets huge
- HIVE-11428 - Performance: Struct IN() clauses are extremely slow
- HIVE-11671 - Optimize RuleRegExp in DPP codepath
- HIVE-11842 - Improve RuleRegExp by caching some internal data structures
- HIVE-12179 - Add option to not add spark-assembly.jar to Hive classpath
- HIVE-12768 - Thread safety: binary sortable serde decimal deserialization
- HIVE-14210 - ExecDriver should call jobclient.close() to trigger cleanup
- HIVE-14380 - Queries on tables with remote HDFS paths fail in "encryption" checks.
- HIVE-15282 - Different modification times are used when an index is built and when its staleness is checked
- HIVE-15782 - query on parquet table returns incorrect result when hive.optimize.index.filter is set to true
- HIVE-15879 - Fix HiveMetaStoreChecker.checkPartitionDirs method
- HIVE-15997 - Resource leaks when query is cancelled
- HIVE-16024 - MSCK Repair Requires nonstrict hive.mapred.mode
- HIVE-16047 - Shouldn't try to get KeyProvider unless encryption is enabled
- HIVE-16156 - FileSinkOperator should delete existing output target when renaming
- HIVE-16175 - Possible race condition in InstanceCache
- HIVE-16205 - Improving type safety in Objectstore
- HIVE-16297 - Improving hive logging configuration variables
- HIVE-16394 - HoS does not support queue name change in middle of session
- HIVE-16459 - Forward channelInactive to RpcDispatcher
- HIVE-16646 - Alias in transform ... as clause shouldn't be case sensitive
- HUE-6109 - [core] Remove the restriction on Document2 invalid chars
- HUE-6144 - [oozie] Add generic XSL template to workflow graph parser
- HUE-6154 - [core] Ace paste event shouldn't reset the cursor position
- HUE-6155 - [core] Remove table extender header throttling for Firefox
- HUE-6158 - [autocomplete] The autocompleter eats characters to the right of the cursor on insert
- HUE-6193 - [converter] Retain last_executed time when creating doc2 object
- HUE-6197 - [impala] Fix XSS Vulnerability in the old editors' error messages
- HUE-6207 - [editor] Avoid to always show the horizontal scroll bar when there's no scrolling needed
- HUE-6208 - [core] The scroll left anchor should reset the horizontal scrollbar position
- HUE-6212 - [oozie] Prevent XSS injection in coordinator cron frequency field
- HUE-6223 - [autocomplete] Fix issue where tables from subsequent statements appear in the autocomplete results
- HUE-6228 - [core] Disable touchscreen detection on Nicescroll
- HUE-6251 - [editor] Log warnings and continue on failed bulk delete and copy actions
- IMPALA-3641 - Fix catalogd RPC responses to DROP IF EXISTS.
- IMPALA-4088 - Assign fix values to the minicluster server ports
- IMPALA-4293 - query profile should include error log
- IMPALA-4544 - ASAN should ignore SEGV and leaks
- IMPALA-4615 - Fix create_table.sql command order
- IMPALA-4733 - Change HBase ports to non-ephemeral
- IMPALA-4787 - Optimize APPX_MEDIAN() memory usage
- IMPALA-4822 - Implement dynamic log level changes
- IMPALA-4899 - Fix parquet table writer dictionary leak
- IMPALA-4902 - Copy parameters map in HdfsPartition.toThrift().
- IMPALA-4998 - Fix missing table lock acquisition.
- IMPALA-5028 - Lock table in /catalog_objects endpoint.
- IMPALA-5055 - Fix DCHECK in parquet-column-readers.cc ReadPageHeader()
- IMPALA-5088 - Fix heap buffer overflow
- IMPALA-5115 - Handle status from HdfsTableSink::WriteClusteredRowBatch
- IMPALA-5145 - Do not constant fold null in CastExprs
- IMPALA-5156 - Drop VLOG level passed into Kudu client
- IMPALA-5186 - Handle failed CreateAndOpenScanner() in MT scan.
- IMPALA-5193 - Initialize decompressor before finding first tuple
- IMPALA-5251 - Fix propagation of input exprs' types in 2-phase agg
- IMPALA-5252 - Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded
- IMPALA-5253 - Use appropriate transport for StatestoreSubscriber
- IMPALA-5322 - Fix a potential crash in Frontend & Catalog JNI startup
- OOZIE-2739 - Remove property expansion pattern from ShellMain's log4j properties content
- OOZIE-2818 - Can't overwrite oozie.action.max.output.data on a per-workflow basis
- OOZIE-2819 - Make Oozie REST API accept multibyte characters for script Actions
- SENTRY-1508 - Revert "REVERT: MetastorePlugin.java does not handle properly initialization failure (Vadim Spector, Reviewed by: Sravya Tirukkovalur, Alexander Kolbasov and Hao Hao)"
- SENTRY-1605 - SENTRY-1508 need to be fixed because of Kerberos initialization issue
- SENTRY-1683 - MetastoreCacheInitializer has a race condition in handling results list
- SENTRY-1714 - MetastorePlugin.java should quietly return from renameAuthzObject() when both paths are null
- SOLR-9836 - Add ability to recover from leader when index corruption is detected on SolrCore creation.
- SOLR-9848 - Lower solr.cloud.wait-for-updates-with-stale-state-pause back down from 7 seconds.
- SOLR-10360 - Remove an extra space from Hadoop distcp cmd used by Solr backup/restore
- SOLR-10430 - Add ls command to ZkCLI for listing sub-dirs
- SPARK-14930 - [SPARK-13693] Fix race condition in CheckpointWriter.stop()
- SPARK-19178 - [SQL][Backport-to-1.6] convert string of large numbers to int should return null
- SPARK-19263 - DAGScheduler should avoid sending conflicting task set.
- SPARK-19537 - Move pendingPartitions to ShuffleMapStage.
- SPARK-20435 - [CORE] More thorough redaction of sensitive information
- SQOOP-3123 - Introduce escaping logic for column mapping parameters (the same logic Sqoop already uses for DB column names), so that special column names (for example, those containing the '#' character) and the mappings related to those columns can use the same format, which avoids confusing end users and eliminates the related Avro format clashes.
- SQOOP-3140 - Removing deprecated mapred.map.max.attempts, mapred.reduce.max.attempts entries and using the new constants directly from Hadoop instead
- SQOOP-3159 - Sqoop (export + --table) with Oracle table_name having '$' fails with error