Long-Term Component Architecture
As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.
PLEASE NOTE:
With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already running the latest 5.5.x release, you do not need to upgrade.
- System Requirements
- What's New
- Documentation
System Requirements
- Supported Operating Systems
- Supported Databases
- Supported JDK Versions
- Supported Browsers
- Supported Internet Protocol
- Supported Transport Layer Security Versions
Supported Operating Systems
CDH 5 provides packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as described below.
| Operating System | Version | Packages |
|---|---|---|
| Red Hat Enterprise Linux (RHEL)-compatible | | |
| RHEL | 5.7 | 64-bit |
| | 5.10 | 64-bit |
| | 6.4 | 64-bit |
| | 6.5 | 64-bit |
| | 6.5 in SELinux mode | 64-bit |
| | 6.6 | 64-bit |
| | 6.6 in SELinux mode | 64-bit |
| | 6.7 | 64-bit |
| | 7.1 | 64-bit |
| CentOS | 5.7 | 64-bit |
| | 5.10 | 64-bit |
| | 6.4 | 64-bit |
| | 6.5 | 64-bit |
| | 6.5 in SELinux mode | 64-bit |
| | 6.6 | 64-bit |
| | 6.6 in SELinux mode | 64-bit |
| | 6.7 | 64-bit |
| | 7.1 | 64-bit |
| Oracle Linux with default kernel and Unbreakable Enterprise Kernel | 5.6 (UEK R2) | 64-bit |
| | 6.4 (UEK R2) | 64-bit |
| | 6.5 (UEK R2, UEK R3) | 64-bit |
| | 6.6 (UEK R3) | 64-bit |
| | 7.1 | 64-bit |
| SLES | | |
| SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 2 | 64-bit |
| | 11 with Service Pack 3 | 64-bit |
| Ubuntu/Debian | | |
| Ubuntu | Precise (12.04) - Long-Term Support (LTS) | 64-bit |
| | Trusty (14.04) - Long-Term Support (LTS) | 64-bit |
| Debian | Wheezy (7.0, 7.1) | 64-bit |
Note:
- CDH 5 provides only 64-bit packages.
- Cloudera has received reports that RPMs work well on Fedora, but this has not been tested.
- If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
Important: Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. However, policies need to be provided by other parties or created by the administrator of the cluster deployment. Cloudera is not responsible for policy support nor policy enforcement, nor for any issues with such. If you experience issues with SELinux, contact your OS support provider.
Important: Cloudera supports RHEL 7 with the following limitations:
- Only RHEL 7.1 is supported. RHEL 7.0 is not supported.
- Only a new installation of RHEL 7.1 is supported. Upgrades from RHEL 6 to RHEL 7.1 are not supported. For more information, see Does Red Hat support upgrades between major versions of Red Hat Enterprise Linux?
- Navigator Encrypt is not supported on RHEL 7.1.
Supported Databases
| Component | MariaDB | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 6) |
|---|---|---|---|---|---|---|
| Oozie | 5.5 | 5.5, 5.6 | – | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default |
| Flume | – | – | – | – | – | Default (for the JDBC Channel only) |
| Hue | 5.5 | 5.1, 5.5, 5.6 (see Note 7) | Default | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | – |
| Hive/Impala | 5.5 | 5.5, 5.6 (see Note 1) | – | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default |
| Sentry | 5.5 | 5.5, 5.6 (see Note 1) | – | 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | – |
| Sqoop 1 | 5.5 | See Note 4 | – | See Note 4 | See Note 4 | – |
| Sqoop 2 | 5.5 | See Note 5 | – | See Note 5 | See Note 5 | Default |
Note:
1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
5. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
6. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
7. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
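The MySQL requirements called out above (InnoDB enabled, GTID-based replication disabled) can be pinned down in the server configuration. The following my.cnf fragment is an illustrative sketch only; the file path and exact option spellings depend on your MySQL version and are not taken from this document:

```ini
# Illustrative /etc/my.cnf fragment (assumed values, verify against your MySQL version)
[mysqld]
default-storage-engine = InnoDB   # CDH components require the InnoDB engine
gtid_mode = OFF                   # Cloudera Manager installation fails if GTID replication is on
```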
Supported JDK Versions
Important: There is one exception to the minimum supported and recommended JDK versions listed in the following table: if Oracle releases a security patch that affects server-side Java before the next minor release of Cloudera products, the Cloudera support policy covers customers using that patch.
CDH 5.5.x is supported with the versions shown in the following table:
| Minimum Supported Version | Recommended Version | Exceptions |
|---|---|---|
| 1.7.0_25 | 1.7.0_80 | None |
| 1.8.0_31 | 1.8.0_60 | Cloudera recommends that you not use JDK 1.8.0_40. |
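As a sketch of how the table's rules combine (minimum update level per major release, plus the 1.8.0_40 exclusion), the following Python snippet checks a JDK version string. The function names and the `major_minimums` mapping are our own illustration, not a Cloudera API:

```python
# Check a JDK version string (e.g. "1.7.0_80") against the CDH 5.5.x support table.

def parse_jdk(version):
    """Split '1.7.0_80' into ((1, 7, 0), 80)."""
    base, _, update = version.partition("_")
    return tuple(int(p) for p in base.split(".")), int(update)

major_minimums = {(1, 7, 0): 25, (1, 8, 0): 31}  # minimum supported updates from the table
excluded = {"1.8.0_40"}                          # build Cloudera recommends avoiding

def is_supported(version):
    if version in excluded:
        return False
    base, update = parse_jdk(version)
    minimum = major_minimums.get(base)
    return minimum is not None and update >= minimum

print(is_supported("1.7.0_80"))  # True
print(is_supported("1.8.0_40"))  # False (explicitly excluded)
```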
Supported Browsers
Hue
Hue works with the two most recent versions of the following browsers. Cookies and JavaScript must be enabled.
- Chrome
- Firefox
- Safari (not supported on Windows)
- Internet Explorer
Supported Internet Protocol
CDH requires IPv4. IPv6 is not supported.
See also Configuring Network Names.
Supported Transport Layer Security Versions
The following components are supported by the indicated versions of Transport Layer Security (TLS):
| Component | Role | Name | Port | Version |
|---|---|---|---|---|
| Flume | Avro Source/Sink | – | 9099 | TLS 1.2 |
| HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2 |
| HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2 |
| HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2 |
| HDFS | HttpFS | REST Port | 14000 | TLS 1.0 |
| Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2 |
| Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2 |
| Cloudera Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2 |
| Cloudera Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2 |
| Cloudera Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2 |
| Cloudera Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2 |
| Cloudera Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2 |
| Cloudera Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2 |
| Cloudera Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2 |
| Cloudera Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2 |
| Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2 |
| Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2 |
| YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2 |
| YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2 |
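One way to confirm that a service negotiates the TLS version shown above is to attempt a handshake that offers only TLS 1.2. The sketch below uses Python's stdlib `ssl` module; the helper name and the commented host/port are illustrative placeholders, not values mandated by this document:

```python
import ssl

def tls12_only_context():
    """Client-side SSL context that offers only TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx

# To probe a service, e.g. HiveServer2 on its port from the table (hypothetical host):
# import socket
# with socket.create_connection(("hs2.example.com", 10000)) as sock:
#     with tls12_only_context().wrap_socket(sock, server_hostname="hs2.example.com") as tls:
#         print(tls.version())  # "TLSv1.2" if the handshake succeeds
```

If the handshake fails with this context but succeeds with the defaults, the service is negotiating an older TLS version.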
What's New
Issues Fixed in CDH 5.5.6
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.5.6:
- FLUME-2797 - Use SourceCounter for SyslogTcpSource
- FLUME-2844 - SpillableMemoryChannel must start ChannelCounter
- HADOOP-10300 - Allow deferred sending of call responses
- HADOOP-11031 - Design Document for Credential Provider API
- HADOOP-12453 - Support decoding KMS Delegation Token with its own Identifier
- HADOOP-12483 - Maintain wrapped SASL ordering for postponed IPC responses
- HADOOP-12537 - S3A to support Amazon STS temporary credentials
- HADOOP-12548 - Read s3a credentials from a Credential Provider
- HADOOP-12609 - Fix intermittent failure of TestDecayRpcScheduler.
- HADOOP-12723 - S3A: Add ability to plug in any AWSCredentialsProvider
- HADOOP-12749 - Create a threadpoolexecutor that overrides afterExecute to log uncaught exceptions/errors
- HADOOP-13034 - Log message about input options in distcp lacks some items
- HADOOP-13317 - Add logs to KMS server-side to improve supportability
- HADOOP-13353 - LdapGroupsMapping getPassword should not return null when IOException throws
- HADOOP-13487 - Hadoop KMS should load old delegation tokens from ZooKeeper on startup
- HADOOP-13526 - Add detailed logging in KMS for the authentication failure of proxy user
- HADOOP-13558 - UserGroupInformation created from a Subject incorrectly tries to renew the Kerberos ticket
- HADOOP-13638 - KMS should set UGI's Configuration object properly
- HADOOP-13669 - KMS Server should log exceptions before throwing
- HADOOP-13693 - Remove the message about HTTP OPTIONS in SPNEGO initialization message from kms audit log.
- HDFS-4176 - EditLogTailer should call rollEdits with a timeout.
- HDFS-6962 - ACLs inheritance conflict with umaskmode
- HDFS-7210 - Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
- HDFS-7413 - Some unit tests should use NameNodeProtocols instead of FSNameSystem
- HDFS-7415 - Move FSNameSystem.resolvePath() to FSDirectory
- HDFS-7420 - Delegate permission checks to FSDirectory
- HDFS-7463 - Simplify FSNamesystem#getBlockLocationsUpdateTimes
- HDFS-7478 - Move org.apache.hadoop.hdfs.server.namenode.NNConf to FSNamesystem
- HDFS-7517 - Remove redundant non-null checks in FSNamesystem#getBlockLocations
- HDFS-7964 - Add support for async edit logging
- HDFS-8224 - Schedule a block for scanning if its metadata file is corrupt
- HDFS-8269 - getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
- HDFS-8709 - Clarify automatic sync in FSEditLog#logEdit
- HDFS-9106 - Transfer failure during pipeline recovery causes permanent write failures
- HDFS-9290 - DFSClient#callAppend() is not backward compatible for slightly older NameNodes
- HDFS-9428 - Fix intermittent failure of TestDNFencing.testQueueingWithAppend
- HDFS-9549 - TestCacheDirectives#testExceedsCapacity is unreliable
- HDFS-9630 - DistCp minor refactoring and clean up
- HDFS-9638 - Improve DistCp Help and documentation
- HDFS-9764 - DistCp does not print value for several arguments including -numListstatusThreads.
- HDFS-9781 - FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
- HDFS-9820 - Improve distcp to support efficient restore to an earlier snapshot
- HDFS-9906 - Remove excessive unnecessary log output when a DataNode is restarted
- HDFS-9958 - BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages
- HDFS-10178 - Permanent write failures can happen if pipeline recoveries occur for the first packet
- HDFS-10216 - Distcp -diff throws exception when handling relative path
- HDFS-10270 - TestJMXGet:testNameNode() fails
- HDFS-10271 - Extra bytes are getting released from reservedSpace for append
- HDFS-10298 - Document the usage of distcp -diff option
- HDFS-10313 - Distcp need to enforce the order of snapshot names passed to -diff
- HDFS-10397 - Distcp should ignore -delete option if -diff option is provided instead of exiting
- HDFS-10457 - DataNode should not auto-format block pool directory if VERSION is missing.
- HDFS-10525 - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
- HDFS-10556 - DistCpOptions should be validated automatically
- HDFS-10609 - Uncaught InvalidEncryptionKeyException during pipeline recovery can abort downstream applications
- HDFS-10722 - Fix race condition in TestEditLog#testBatchedSyncWithClosedLogs
- HDFS-10760 - DataXceiver#run() should not log InvalidToken exception as an error
- HDFS-10822 - Log DataNodes in the write pipeline. John Zhuge via Lei Xu
- HDFS-10879 - TestEncryptionZonesWithKMS#testReadWrite fails intermittently
- HDFS-10962 - TestRequestHedgingProxyProvider is unreliable
- HDFS-11012 - Unnecessary INFO logging on DFSClients for InvalidToken
- HDFS-11040 - Add documentation for HDFS-9820 distcp improvement
- HDFS-11056 - Concurrent append and read operations lead to checksum error
- MAPREDUCE-6359 - In RM HA setup, Cluster tab links populated with AM hostname instead of RM
- MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization
- MAPREDUCE-6635 - Unsafe long to int conversion in UncompressedSplitLineReader and IndexOutOfBoundsException
- MAPREDUCE-6680 - JHS UserLogDir scan algorithm sometime could skip directory with update in CloudFS (Azure FileSystem, S3, etc.)
- MAPREDUCE-6684 - High contention on scanning of user directory under immediate_done in Job History Server
- MAPREDUCE-6761 - Regression when handling providers - invalid configuration ServiceConfiguration causes Cluster initialization failure
- MAPREDUCE-6771 - RMContainerAllocator sends container diagnostics event after corresponding completion event
- YARN-3495 - Confusing log generated by FairScheduler
- YARN-4004 - container-executor should print output of docker logs if the docker container exits with non-0 exit status
- YARN-4017 - container-executor overuses PATH_MAX
- YARN-4245 - Generalize config file handling in container-executor
- YARN-4255 - container-executor does not clean up Docker operation command files
- YARN-5608 - TestAMRMClient.setup() fails with ArrayOutOfBoundsException
- YARN-5704 - Provide config knobs to control enabling/disabling new/work in progress features in container-executor
- HBASE-13330 - Region left unassigned due to AM & SSH each thinking the assignment would be done by the other
- HBASE-14241 - Fix deadlock during cluster shutdown due to concurrent connection close
- HBASE-14313 - After a Connection sees ConnectionClosingException on a connection it never recovers
- HBASE-14407 - NotServingRegion: hbase region closed forever
- HBASE-14449 - Rewrite deadlock prevention for concurrent connection close
- HBASE-14474 - Addendum closes connection in writeRequest() outside synchronized block
- HBASE-14474 - DeadLock in RpcClientImpl.Connection.close()
- HBASE-14578 - URISyntaxException during snapshot restore for table with user defined namespace
- HBASE-14968 - ConcurrentModificationException in region close resulting in the region staying in closing state
- HBASE-15430 - Failed taking snapshot - Manifest proto-message too large
- HBASE-15856 - Addendum Fix UnknownHostException import in MetaTableLocator
- HBASE-15856 - Do not cache unresolved addresses for connections
- HBASE-16350 - Undo server abort from HBASE-14968
- HBASE-16360 - TableMapReduceUtil addHBaseDependencyJars has the wrong class name for PrefixTreeCodec
- HBASE-16767 - Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions
- HIVE-10384 - Backport: RetryingMetaStoreClient does not retry wrapped TTransportExceptions
- HIVE-10728 - Deprecate unix_timestamp(void) and make it deterministic
- HIVE-11579 - Invoke the set command will close standard error output[beeline-cli]
- HIVE-11768 - java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
- HIVE-11901 - StorageBasedAuthorizationProvider requires write permission on table for SELECT statements
- HIVE-12475 - Parquet schema evolution within array<struct<>> does not work
- HIVE-12891 - Hive fails when java.io.tmpdir is set to a relative location
- HIVE-13058 - Add session and operation_log directory deletion messages
- HIVE-13090 - Hive metastore crashes on NPE with ZooKeeperTokenStore
- HIVE-13129 - CliService leaks HMS connection
- HIVE-13198 - Authorization issues with cascading views
- HIVE-13237 - Select parquet struct field with upper case throws NPE
- HIVE-13429 - Tool to remove dangling scratch directory
- HIVE-13997 - Insert overwrite directory does not overwrite existing files
- HIVE-14296 - Session count is not decremented when HS2 clients do not shutdown cleanly
- HIVE-14421 - FS.deleteOnExit holds references to _tmp_space.db files
- HIVE-14436 - Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine
- HIVE-14457 - Partitions in encryption zone are still trashed though an exception is returned
- HIVE-14743 - ArrayIndexOutOfBoundsException - HBASE-backed views' query with JOINs
- HIVE-14762 - Add logging while removing scratch space
- HIVE-14805 - Subquery inside a view will have the object in the subquery as the direct input
- HIVE-14817 - Shut down the SessionManager timeoutChecker thread properly upon shutdown
- HIVE-15090 - Temporary DB failure can stop ExpiredTokenRemover thread
- HUE-4804 - [search] Download function of HTML widget breaks the display
- HUE-4968 - [oozie] Remove access to /oozie/import_wokflow when v2 is enabled
- IMPALA-1928 - Fix Thrift client transport wrapping order
- IMPALA-3369 - Add ALTER TABLE SET COLUMN STATS statement.
- IMPALA-3378 - Fix a bug introduced with backport of IMPALA-3379
- IMPALA-3378 - Fix various JNI issues
- IMPALA-3441 - Check for malformed Avro data
- IMPALA-3499 - Split catalog update
- IMPALA-3575 - Add retry to backend connection request and rpc timeout
- IMPALA-3633 - Cancel fragment if coordinator is gone
- IMPALA-3682 - Do not retry unrecoverable socket creation errors
- IMPALA-3687 - Prefer Avro field name during schema reconciliation
- IMPALA-3698 - Fix Isilon permissions test
- IMPALA-3711 - Remove unnecessary privilege checks in getDbsMetadata()
- IMPALA-3732 - Handle string length overflow in Avro files
- IMPALA-3751 - Fix clang build errors and warnings
- IMPALA-3915 - Register privilege and audit requests when analyzing resolved table refs.
- IMPALA-4135 - Thrift threaded server times-out connections during high load
- IMPALA-4153 - Return valid non-NULL pointer for 0-byte allocations
- OOZIE-1814 - Oozie should mask any passwords in logs and REST interfaces
- OOZIE-2068 - Configuration as part of sharelib
- OOZIE-2347 - Remove unnecessary new Configuration()/new jobConf() calls from Oozie
- OOZIE-2555 - Oozie SSL enable setup does not return port for admin -servers
- OOZIE-2567 - HCat connection is not closed while getting hcat cred
- OOZIE-2589 - CompletedActionXCommand is hardcoded to wrong priority
- OOZIE-2649 - Cannot override sub-workflow configuration property if defined in parent workflow XML
- PIG-3569 - SUM function for BigDecimal and BigInteger
- PIG-3818 - PIG-2499 is accidentally reverted
- SENTRY-1095 - Insert into requires URI privilege on partition location under table.
- SOLR-7280 - Backport: Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts
- SOLR-8407 - Backport
- SOLR-8586 - Add index fingerprinting and use it in peersync
- SOLR-8690 - Add solr.disableFingerprint system property
- SOLR-8691 - Cache index fingerprints per searcher
- SOLR-9310 - SOLR-9524
- SPARK-17644 - [CORE] Do not add failedStages when abortStage for fetch failure
- SQOOP-2387 - Sqoop should support importing from table with column names containing some special character
- SQOOP-2864 - ClassWriter chokes on column names containing double quotes
- SQOOP-2880 - Provide argument for overriding temporary directory
- SQOOP-2884 - Document --temporary-rootdir
- SQOOP-2906 - Optimization of AvroUtil.toAvroIdentifier
- SQOOP-2915 - Fixing Oracle related unit tests
- SQOOP-2920 - Sqoop performance deteriorates significantly on wide datasets; Sqoop 100% on CPU
- SQOOP-2952 - Fixing bug row key not added into column family using --hbase-bulkload
- SQOOP-2971 - OraOop does not close connections properly
- SQOOP-2983 - OraOop export has degraded performance with wide tables
- SQOOP-2986 - Add validation check for --hive-import and --incremental lastmodified
- SQOOP-3021 - ClassWriter fails if a column name contains a backslash character
- SQOOP-3034 - HBase import should fail fast if using anything other than as-textfile
Documentation
Want to Get Involved or Learn More?
Check out our other resources:
Cloudera Community
Collaborate with your peers, industry experts, and Clouderans to make the most of your investment in Hadoop.
Cloudera University
Receive expert Hadoop training through Cloudera University, the industry's only truly dynamic Hadoop training curriculum that's updated regularly to reflect the state of the art in big data.