- System Requirements
- What's New
- Supported Operating Systems
- Supported Databases
- Supported JDK Versions
- Supported Browsers
- Supported Internet Protocol
- Supported Transport Layer Security Versions
Supported Databases
Please see Cloudera Manager Supported Databases for a full list of supported databases for each version of Cloudera Manager.
Cloudera Manager and CDH come packaged with an embedded PostgreSQL database, but it is recommended that you configure your cluster with custom external databases, especially in production.
In most cases (but not all), Cloudera supports versions of MariaDB, MySQL and PostgreSQL that are native to each supported Linux distribution.
After installing a database, upgrade to the latest patch and apply appropriate updates. Available updates may be specific to the operating system on which it is installed.
- Use UTF8 encoding for all custom databases.
- Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
- Hue requires the default MySQL/MariaDB version (if used) of the operating system on which it is installed. See Hue Databases.
- Both the Community and Enterprise versions of MySQL are supported, as well as MySQL configured by the AWS RDS service.
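The encoding and replication requirements above can be checked before installation. The following is a minimal sketch, not a Cloudera tool: it assumes you have already collected the output of MySQL's `SHOW VARIABLES` into a dictionary, and the function name `check_mysql_settings` is hypothetical.

```python
def check_mysql_settings(variables):
    """Return a list of problems found in a {name: value} dict of MySQL variables.

    Assumption: `variables` was filled from `SHOW VARIABLES`; collecting it
    is outside the scope of this sketch.
    """
    problems = []

    # Requirement from the list above: UTF8 encoding for custom databases.
    # Assumption: any character set whose name starts with "utf8" is accepted.
    charset = variables.get("character_set_database", "").lower()
    if not charset.startswith("utf8"):
        problems.append(
            f"database character set is {charset or 'unset'}, expected utf8"
        )

    # Requirement from the list above: installation fails when GTID-based
    # replication is enabled, so flag any gtid_mode beginning with "ON".
    gtid_mode = variables.get("gtid_mode", "OFF").upper()
    if gtid_mode.startswith("ON"):
        problems.append(f"gtid_mode is {gtid_mode}, expected OFF")

    return problems


if __name__ == "__main__":
    sample = {"character_set_database": "latin1", "gtid_mode": "ON"}
    for problem in check_mysql_settings(sample):
        print(problem)
```

An empty list means neither of the two conditions was detected; it does not confirm that the database version itself is supported.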
Important: When you restart processes, the configuration for each of the services is redeployed using information saved in the Cloudera Manager database. If this information is not available, your cluster does not start or function correctly. You must schedule and maintain regular backups of the Cloudera Manager database to recover the cluster in the event of the loss of this database.
Supported JDK Versions
Unless specifically excluded, support for a minor JDK release begins from the Cloudera major release in which support for the major JDK release was added. For example, 8u102 was released in time for C5.9 but is actually supported from C5.3 because that is when support for JDK 1.8 was added. Cloudera excludes or removes support for select Java updates when security is jeopardized.
Running CDH nodes within the same cluster on different JDK releases is not supported. The JDK release must match across the cluster, down to the patch level.
- All nodes in your cluster must run the same Oracle JDK version.
- All services must be deployed on the same Oracle JDK version.
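One way to verify the same-version rule is to compare the JDK version string reported by every host. The sketch below assumes those strings have already been gathered (for example, from `java -version` on each node); the function name and the host names in the usage example are hypothetical.

```python
from collections import defaultdict


def find_jdk_mismatches(versions_by_host):
    """Group hosts by reported JDK version.

    Returns an empty dict when every host reports the same version,
    otherwise a {version: [hosts]} mapping that shows the disagreement.
    """
    hosts_by_version = defaultdict(list)
    for host, version in versions_by_host.items():
        # Compare the full string so that patch-level differences are caught.
        hosts_by_version[version.strip()].append(host)

    if len(hosts_by_version) <= 1:
        return {}
    return {version: sorted(hosts) for version, hosts in hosts_by_version.items()}


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    reported = {
        "node1.example.com": "1.8.0_121",
        "node2.example.com": "1.8.0_121",
        "node3.example.com": "1.8.0_102",
    }
    print(find_jdk_mismatches(reported))
```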
All JDK 7 updates, from the minimum required version, are supported in CM/CDH 5.0 and higher unless specifically excluded. Updates above the minimum that are not listed are supported but not tested.
The Cloudera Manager repository is packaged with an Oracle JDK (for example, 1.7.0_67), which can be installed automatically during a new installation or an upgrade.
JDK 7 updates that are supported and tested
| JDK 7 | Supported in all C5.x |
|---|---|
| 1.7u80 | Recommended / Latest version supported |
All JDK 8 updates, from the minimum required version, are supported in CM/CDH 5.3 and higher unless specifically excluded. Updates above the minimum that are not listed are supported but not tested.
Warning: JDK 8u40, 8u45, and 8u60 are excluded from support due to a security risk: HTTP authentication can fail for web-based UI components such as HDFS, YARN, Solr, and Oozie.
Important: JDK 8u75 is supported but has a Known Issue: Oozie Web Console returns 500 error when Oozie server runs on JDK 8u75 or higher.
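The exclusions and the known issue above can be encoded as a small lookup. This is an illustrative sketch only: the function names and status labels are hypothetical, and because the minimum required JDK 8 update is not stated in this section, the sketch does not check it.

```python
import re

# Updates excluded from support, as listed in the warning above.
EXCLUDED_JDK8_UPDATES = {40, 45, 60}
# The Oozie Web Console known issue applies from this update onward.
OOZIE_ISSUE_FROM_UPDATE = 75


def parse_jdk8_update(version):
    """Extract the update number from '8u121', '1.8u121', or '1.8.0_121'."""
    match = re.fullmatch(r"(?:1\.)?8(?:\.0)?[u_](\d+)", version.strip())
    if match is None:
        raise ValueError(f"not a recognized JDK 8 version string: {version!r}")
    return int(match.group(1))


def jdk8_support_status(version):
    """Classify a JDK 8 update using only the rules stated above."""
    update = parse_jdk8_update(version)
    if update in EXCLUDED_JDK8_UPDATES:
        return "excluded"
    if update >= OOZIE_ISSUE_FROM_UPDATE:
        return "known-oozie-issue"
    # The minimum required update is not checked here (see lead-in).
    return "not-excluded"


if __name__ == "__main__":
    for candidate in ("8u45", "1.8.0_121", "8u31"):
        print(candidate, jdk8_support_status(candidate))
```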
JDK 8 updates that are supported and tested
| JDK 8 | Supported in C5.3 and Higher |
|---|---|
| 1.8u121 | Recommended / Latest version supported |
Supported Browsers
- Chrome: Version history
- Firefox: Version history
- Internet Explorer: Version history
- Safari (Mac only): Version history
Hue can display in older, and other, browsers, but you might not have access to all of its features.
Important: To see all icons in the Hue Web UI, users with IE and HTTPS must add a Load Balancer.
Supported Internet Protocol
CDH requires IPv4. IPv6 is not supported.
See also Configuring Network Names.
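Because CDH requires IPv4, it can be useful to confirm that the addresses you plan to assign to cluster hosts are IPv4 literals. The following is a minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical, and the sketch validates address strings only, not name resolution.

```python
import ipaddress


def is_ipv4_address(address):
    """Return True only if `address` parses as an IPv4 address literal."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not a valid IP address literal at all (for example, a host name).
        return False


if __name__ == "__main__":
    for candidate in ("10.0.0.12", "fe80::1", "not-an-ip"):
        print(candidate, is_ipv4_address(candidate))
```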
Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.
Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.
Supported Transport Layer Security Versions
The following components support the indicated versions of Transport Layer Security (TLS):
Components Supported by TLS
| Component | Role | Name | Port | Version |
|---|---|---|---|---|
| Cloudera Manager | Cloudera Manager Server | | 7182 | TLS 1.2 |
| Cloudera Manager | Cloudera Manager Server | | 7183 | TLS 1.2 |
| Flume | | Avro Source/Sink | | TLS 1.2 |
| Flume | | Flume HTTP Source/Sink | | TLS 1.2 |
| HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2 |
| HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2 |
| HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2 |
| HDFS | HttpFS | REST Port | 14000 | TLS 1.1, TLS 1.2 |
| Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2 |
| Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2 |
| Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2 |
| Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2 |
| Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2 |
| Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2 |
| Spark | History Server | | 18080 | TLS 1.2 |
| YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2 |
| YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2 |
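To spot-check one of the ports listed above, a client can be restricted to TLS 1.2 and asked which protocol version it negotiated. The sketch below uses only Python's standard `ssl` and `socket` modules. The function names are hypothetical, and it assumes the server certificate is trusted by the system CA store; a cluster that uses a private CA would need extra configuration, such as `load_verify_locations`.

```python
import socket
import ssl


def make_tls12_context():
    """Build a client context that offers TLS 1.2 and nothing else."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name.

    Raises ssl.SSLError if the server cannot negotiate TLS 1.2, and
    OSError if the connection itself fails.
    """
    context = make_tls12_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

For example, calling `negotiated_tls_version` with the host name of your Cloudera Manager Server and port 7183 should return `"TLSv1.2"` if the handshake succeeds.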
Issues Fixed in CDH 5.12.1
The following upstream issues are fixed in CDH 5.12.1:
- FLUME-2752 - Fix AvroSource startup resource leaks
- FLUME-2905 - Fix NetcatSource file descriptor leak if startup fails
- HADOOP-13588 - ConfServlet should respect Accept request header
- HADOOP-13628 - Support to retrieve specific property from configuration via REST API
- HADOOP-14260 - Configuration.dumpConfiguration should redact sensitive information
- HADOOP-14511 - WritableRpcEngine.Invocation#toString NPE on null parameters
- HADOOP-14542 - Add IOUtils.cleanupWithLogger that accepts slf4j logger API
- HDFS-8856 - Make LeaseManager#countPath O(1).
- HDFS-10468 - HDFS read ends up ignoring an interrupt
- HDFS-10506 - OIV's ReverseXML processor cannot reconstruct some snapshot details
- HDFS-11303 - Hedged read might hang infinitely if read data from all DN failed
- HDFS-11708 - Positional read will fail if replicas moved to different DNs after stream is opened
- HDFS-11741 - Long running balancer may fail due to expired DataEncryptionKey
- HDFS-11861 - ipc.Client.Connection#sendRpcRequest should log request name
- HDFS-11881 - NameNode consumes a lot of memory for snapshot diff report generation
- HDFS-11960 - Successfully closed files can stay under-replicated
- HDFS-12042 - Lazy initialize AbstractINodeDiffList#diffs for snapshots to reduce memory consumption
- HDFS-12139 - HTTPFS liststatus returns incorrect pathSuffix for path of file
- HDFS-12278 - LeaseManager operations are inefficient in 2.8
- MAPREDUCE-6870 - Add configuration for MR job to finish when all reducers are complete.
- YARN-2780 - Log aggregated resource allocation in rm-appsummary.log
- HBASE-15720 - Print row locks at the debug dump page
- HBASE-16033 - Add more details in logging of responseTooSlow/TooLarge
- HBASE-17131 - Avoid livelock caused by HRegion#processRowsWithLocks
- HBASE-17587 - Do not Rethrow DoNotRetryIOException as UnknownScannerException
- HBASE-18247 - Hbck to fix the case that replica region shows as key in the meta table
- HBASE-18362 - hbck should not report split replica parent region from meta as errors
- HIVE-9567 - JSON SerDe not escaping special chars when writing char/varchar data
- HIVE-10209 - FetchTask with VC may fail because ExecMapper.done is true
- HIVE-11240 - Change value type from int to long for HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE
- HIVE-11462 - GenericUDFStruct should constant fold at compile time.
- HIVE-11592 - ORC metadata section can sometimes exceed protobuf message size limit
- HIVE-12274 - Increase width of columns used for general configuration in the metastore.
- HIVE-12551 - Fix several kryo exceptions in branch-1
- HIVE-12762 - Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true
- HIVE-13330 - ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
- HIVE-13588 - NPE is thrown from MapredLocalTask.executeInChildVM
- HIVE-13947 - HoS prints wrong number for hash table size in map join scenario
- HIVE-14178 - Hive::needsToCopy should reuse FileUtils::equalsFileSystem
- HIVE-15122 - Hive: Upcasting types should not obscure stats (min/max/ndv)
- HIVE-15792 - Hive should raise SemanticException when LPAD/RPAD pad character's length is 0
- HIVE-16183 - Fix potential thread safety issues with static variables
- HIVE-16291 - Hive fails when unions a parquet table with itself
- HIVE-16559 - Parquet schema evolution for partitioned tables may break if table and partition serdes differ
- HIVE-16729 - Improve location validator to check for blank paths
- HIVE-16845 - INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
- HIVE-16869 - Hive returns wrong result when predicates on non-existing columns are pushed down to Parquet reader
- HIVE-16875 - Query against view with partitioned child on HoS fails with privilege exception.
- HIVE-16930 - HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters
- HIVE-16935 - Hive should strip comments from input before choosing which CommandProcessor to run.
- HIVE-16974 - Change the sort key for the schema tool validator to be <ID>
- HIVE-17050 - Multiline queries that have comment in middle fail when executed via "beeline -e"
- HIVE-17052 - Remove logging of predicate filters
- HIVE-17149 - Hdfs directory is not cleared if partition creation failed on HMS
- HUE-5217 - [backend] Resolve LGPL copyleft issue - Paramiko in boto 2.46.1
- HUE-5504 - [oozie] Only use JDBC URL from hive2 action when hardcoded
- HUE-6684 - [editor] The horizontal scrollbar of the ace editor isn't updated when the right assist is resized.
- HUE-6701 - [editor] The right panel resize bar gets hidden in Hue 3 on window resize.
- HUE-6702 - [about] About page is accessible without authentication
- HUE-6704 - [fb] Handle empty page error on filebrowser filter requests
- HUE-6705 - [fb] Opening directory after opening editor does not load at first click
- HUE-6709 - [editor] Column list can jump back and forth depending on its size
- HUE-6710 - [notebook] Application reachable directly by users without granted access
- HUE-6711 - [frontend] Render 403 page properly on Hue 4
- HUE-6714 - [jb] No tasks found for job when refreshing MapReduce job page
- HUE-6724 - [jb] Job progress can have too many decimals
- HUE-6730 - [fb] Use SkipFile instead of StopUpload in case of exception during file upload
- HUE-6735 - [jb] A suspended workflow should also be killable
- HUE-6738 - [jb] Kill button of workflow enabled when clicking on re-run
- HUE-6742 - [editor] Incorrect Sorting With TIMESTAMP Column Type
- HUE-6745 - [frontend] The title of the browser is Filebrowser even if I’m in editor
- HUE-6748 - [editor] Select and copy the query from the query history
- HUE-6749 - [editor] From metastore to editor with past query ran and scrolled can garble grid
- HUE-6758 - [jb] Open a workflow workspace breaks the history
- HUE-6760 - [jb] Filtering running jobs and batch killing them do not update their status
- HUE-6767 - [jb] Submit a workflow sometimes generates a JS error
- HUE-6768 - [frontend] Remove shown tooltips on app change in Hue 4
- HUE-6772 - [metadata] Cache Navigator password reads from scripts
- HUE-6776 - [search] Js error when clicking on the first two layout of a new dashboard
- HUE-6777 - [fb] Drag & drop upload not enabled if editor was loaded before
- HUE-6779 - [jb] Disable click around checkbox in list of jobs
- HUE-6780 - [s3] Correctly infer and display region when connected to S3 by endpoint
- HUE-6781 - [jb] Workflow duration time is not humanized
- HUE-6786 - [metastore] Fix ko editable binding after applying toggle overflow to the db description
- HUE-6789 - [beeswax] Fix alignment of the assist headers
- HUE-6791 - [search] Protect against pivot facets conflicting with nested facets
- HUE-6792 - [core] Middle/right click is not working as a link on main blue action button
- HUE-6795 - [fb] Fix minor logic error in trash directory creation
- HUE-6797 - [dashboard] Grid widget operation icons are not grayed much
- HUE-6800 - [frontend] Re-word the welcome tour
- HUE-6802 - [editor] Don't check for column existence in the location handler when a column is prefixed with a table name or alias
- HUE-6810 - [assist] Improve positioning and visibility of side panel toggles
- HUE-6813 - [metastore] Fix database comment last character truncation
- HUE-6814 - [search] Only return distinct usernames in security impersonate dropdown
- HUE-6819 - [oozie] Set generic widget for generic actions in oozie graph
- HUE-6820 - [frontend] Catch all the JS errors and send them to the backend
- HUE-6825 - [autocomplete] Prevent exception when there's no identifierChain given for a table
- HUE-6833 - [editor] Disable query builder tab for non-sql editors
- HUE-6840 - [notebook] Workaround lack of status flag in Impala profile page
- HUE-6842 - [core] Correctly append the list of custom apps
- HUE-6843 - [metastore] Show partition icon on table browser sends to Hue 3
- HUE-6844 - [frontend] Make all custom apps embeddable
- HUE-6845 - [frontend] Fix custom app mako generation
- HUE-6846 - [frontend] Make all proxy apps embeddable
- HUE-6847 - [metastore] Do not display if table is compressed as Hive information is incorrect
- HUE-6856 - [search] Protect against reflected XSS in search query parameters
- HUE-6863 - [jb] Kill buttons are enabled even when the user is not allowed
- HUE-6864 - [jb] Check for permission before killing a job
- HUE-6870 - [jb] Improve end user UX with bubbling up of common actions and properties.
- HUE-6883 - [metastore] Allow deletion of table comments
- HUE-6885 - [frontend] Catch all http 502 and show an HTML stripped version of the message
- HUE-6886 - [oozie] SLA variable in workflow action are not getting retrieved
- HUE-6888 - [oozie] Fix link page on SLA with Hue 4
- HUE-6894 - [core] Remove cluster config call on login page
- HUE-6896 - [frontend] Errorcatcher should not log jQuery triggered events
- HUE-6897 - [oozie] Ensure that examples_dir exists before installing examples
- HUE-6898 - [oozie] Editor doesn't update job status to "finished" for finished Pig jobs
- HUE-6899 - [core] Allow 403 to be accessed by regular users
- HUE-6900 - [useradmin] Do not link app title to list of users if user not a super user
- HUE-6901 - [oozie] Editor is missing certain regular actions
- HUE-6903 - [dataeng] Fix for s3 context popover
- HUE-6905 - [frontend] Minor UI bugs in old Pig Editor and Job Designer.
- HUE-6939 - [editor] Improve interactions with the risk indicator
- HUE-6940 - [editor] Cancel risk requests if editing while risk check is running
- HUE-6941 - [editor] Don't show a pointer cursor for the risk check spinner
- HUE-6950 - [saml] Create home directory for new user login
- HUE-6955 - [metadata] Extend query suggest caching to 1 week
- HUE-6960 - [frontend] Only suggest facet values and no results when autocompleting facets in the top search
- HUE-6995 - [oozie] Set minimum width of a workflow node
- HUE-7000 - [assist] HBase assist does not load the table the second time
- HUE-7001 - [assist] Remember the last selected HBase cluster
- HUE-7002 - [core] Welcome to Hue 4 tour could be also skipped by clicking on the white background
- HUE-7004 - [notebook] Loading query with empty session param shows several JS errors
- HUE-7005 - [oozie] sharing coordinator or bundle immediately after creating fails
- HUE-7006 - [core] Ability in HUE to change Cookie name from default "sessionid"
- HUE-7040 - [oozie] Pressing Esc on share popup throws JS error in Hue4
- HUE-7045 - [hive] Avoid use database call for SET queries
- HUE-7048 - [oozie] User prompting doesn't work in share document popup in hue4
- HUE-7052 - [desktop] Retain last_modified when sharing sample docs in sync_documents
- HUE-7069 - [editor] Remove the spinner for risk checking
- HUE-7080 - [oozie] Oozie HiveServer2 action is prompting for an 'Impalad hostname'
- HUE-7082 - [aws] Create S3 bucket fails with error Bad request
- HUE-7084 - [useradmin] Remove @domain part of username when creating home directory for LDAP user
- HUE-7090 - [security] Protect from missing groupName in the list_sentry_roles_by_group API
- HUE-7104 - [core] Hue and Isilon issue - UnsupportedOperationException: Unknown op 'GETTRASHROOT'
- HUE-7107 - [core] Hue 4 group access doesn't reflect on Interface
- IMPALA-4276 - Profile displays non-default query options set by planner
- IMPALA-4866 - Hash join node does not apply limits correctly
- IMPALA-5354 - INSERT hints for Kudu tables
- IMPALA-5427 - Fix race between CRS::UpdateQueryStatus() and beeswax RPCs
- IMPALA-5500 - Reduce catalog update topic size
- IMPALA-5524 - Fixes NPE during planning with DISABLE_UNSAFE_SPILLS=1
- IMPALA-5539 - Fix Kudu timestamp with -use_local_tz_for_unix_ts
- IMPALA-5554 - sorter DCHECK on null column
- IMPALA-5567 - race in fragment instance teardown
- IMPALA-5579 - Fix IndexOutOfBoundsException in GetTables metadata request
- IMPALA-5580 - fix Java UDFs that return NULL strings
- IMPALA-5582 - Store sentry privileges in lower case
- IMPALA-5586 - Null-aware anti-join can take a long time to cancel
- IMPALA-5588 - Reduce the frequency of fault injection
- IMPALA-5611 - KuduPartitionExpr holds onto memory unnecessarily
- IMPALA-5615 - Fix compute incremental stats for general partition exprs
- IMPALA-5616 - Add --enable_minidumps startup flag
- IMPALA-5623 - Fix lag() on STRING cols to release UDF mem
- IMPALA-5627 - fix dropped statuses in HDFS writers
- IMPALA-5638 - Fix Kudu table set tblproperties inconsistencies
- IMPALA-5657 - Fix a couple of bugs with FunctionCallExpr and IGNORE NULLS
- IMPALA-5686 - Update a mini cluster Sentry property
- IMPALA-5691 - recalibrate mem limit for Q18
- KITE-1155 - Deleting an already deleted empty path should not fail the job
- OOZIE-2816 - Strip out the first command word from Sqoop action if it's "sqoop"
- OOZIE-2984 - Parse spark-defaults.conf values with spaces without needing the quotes
- PIG-3567 - LogicalPlanPrinter throws OOM for large scripts
- PIG-3655 - BinStorage and InterStorage approach to record markers is broken
- SENTRY-1646 - Unable to truncate table <database>.<tablename>; from "default" databases
- SENTRY-1811 - Optimize data structures used in HDFS sync
- SENTRY-1827 - Minimize TPathsDump thrift message used in HDFS sync
- SOLR-6673 - MDC based logging of collection, shard, etc.
- SOLR-10889 - Stale zookeeper information is used during failover check.
- SPARK-13278 - [CORE] Launcher fails to start with JDK 9 EA
- SPARK-13669 - [SPARK-20898] [CORE] Improve the blacklist mechanism to handle external shuffle service unavailable situation
- SPARK-15067 - [YARN] YARN executors are launched with fixed perm gen size
- SPARK-16845 - [SQL][BRANCH-1.6] `GeneratedClass$SpecificOrdering` grows beyond 64 KB
- SPARK-19019 - [PYTHON] Fix hijacked `collections.namedtuple` and port cloudpickle changes for PySpark to work with Python 3.6.0
- SPARK-19688 - [STREAMING] Not to read `spark.yarn.credentials.file` from checkpoint.
- SPARK-20393 - [WEB UI] Strengthen Spark to prevent XSS vulnerabilities
- SPARK-20904 - [CORE] Don't report task failures to driver during shutdown.
- ZOOKEEPER-1653 - zookeeper fails to start because of inconsistent epoch
- ZOOKEEPER-2040 - Server to log underlying cause of SASL connection problems.