Long-term component architecture
As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.
With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.
Supported Operating Systems
CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.
|Red Hat Enterprise Linux (RHEL)-compatible|
|Red Hat Enterprise Linux||5.7||64-bit|
|6.5 in SE Linux mode||64-bit|
|6.5 in SE Linux mode||64-bit|
|Oracle Linux with default kernel and Unbreakable Enterprise Kernel||5.6 (UEK R2)||64-bit|
|6.4 (UEK R2)||64-bit|
|6.5 (UEK R2, UEK R3)||64-bit|
|6.6 (UEK R3)||64-bit|
|SUSE Linux Enterprise Server (SLES)||11 with Service Pack 2||64-bit|
|SUSE Linux Enterprise Server (SLES)||11 with Service Pack 3||64-bit|
|Ubuntu||Precise (12.04) - Long-Term Support (LTS)||64-bit|
|Trusty (14.04) - Long-Term Support (LTS)||64-bit|
- CDH 5 provides only 64-bit packages.
- Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
- If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
Supported Databases
|Component||MySQL||SQLite||PostgreSQL||Oracle||Derby - see Note 5|
|Oozie||5.5, 5.6||-||8.4, 9.2, 9.3
See Note 2
|Flume||-||-||-||-||Default (for the JDBC Channel only)|
See Note 1
|Default||8.4, 9.2, 9.3
See Note 2
See Note 1
|-||8.4, 9.2, 9.3
See Note 2
See Note 1
|-||8.4, 9.2, 9.3
See Note 2
|Sqoop 1||See Note 3||-||See Note 3||See Note 3||-|
|Sqoop 2||See Note 4||-||See Note 4||See Note 4||Default|
- Note 1: MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.
- Note 2: PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
- Note 3: For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
- Note 4: Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
- Note 5: Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
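The minimum versions in the Sqoop 1 note can be written down as a small lookup, which may help when checking a planned data source before a transfer. This is only a sketch: the function names `parse_version`, `sqoop1_can_transfer`, and `sqoop_metastore_supported` are hypothetical and not part of Sqoop, and the table holds only the lower bounds stated in the note above.

```python
# Lower bounds for Sqoop 1 data transfer, transcribed from the note above.
# The dictionary keys are illustrative labels, not Sqoop connector names.
SQOOP1_MINIMUMS = {
    "mysql": "5.0",
    "postgresql": "8.4",
    "oracle": "10.2",
    "teradata": "13.10",
    "netezza twinfin": "5.0",
}


def parse_version(text, width=3):
    """Turn '13.10' into (13, 10, 0) so versions compare numerically."""
    parts = [int(piece) for piece in text.strip().split(".")]
    parts += [0] * (width - len(parts))  # pad short versions with zeros
    return tuple(parts)


def sqoop1_can_transfer(database, version):
    """Return True if the version meets the stated minimum for that database."""
    minimum = SQOOP1_MINIMUMS.get(database.strip().lower())
    if minimum is None:
        raise ValueError("no minimum listed for database: %r" % database)
    return parse_version(version) >= parse_version(minimum)


def sqoop_metastore_supported(hsqldb_version):
    """The metastore note allows HSQLDB 1.8.0 and later 1.x, but no 2.x."""
    version = parse_version(hsqldb_version)
    return parse_version("1.8.0") <= version < parse_version("2.0.0")
```

For example, `sqoop1_can_transfer("PostgreSQL", "8.3")` is false because 8.3 is below the stated 8.4 minimum, and `sqoop_metastore_supported("2.3.2")` is false because 2.x releases are excluded.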
Supported JDK Versions
|Minimum Supported Version||Recommended Version||Notes|
|1.7.0_55||1.7.0_67 or 1.7.0_75||None|
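A host's JDK can be compared against the minimum in the table with a short script. The sketch below assumes the legacy `1.7.0_NN` version string format shown in the table; the function names are hypothetical. It checks only the lower bound, so a result of true for a later major version does not mean that version is supported.

```python
import re

MINIMUM_JDK = "1.7.0_55"  # minimum supported version from the table above


def parse_jdk_version(version):
    """Split a legacy JDK version string such as '1.7.0_67' into a tuple of ints."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:_(\d+))?", version.strip())
    if match is None:
        raise ValueError("unrecognized JDK version string: %r" % version)
    major, minor, patch, update = match.groups()
    # A missing update suffix (for example '1.7.0') is treated as update 0.
    return (int(major), int(minor), int(patch), int(update or 0))


def meets_minimum(version, minimum=MINIMUM_JDK):
    """Return True if the version is at or above the minimum (lower bound only)."""
    return parse_jdk_version(version) >= parse_jdk_version(minimum)
```

The version string itself can be read from the output of `java -version` on the host and passed to `meets_minimum`.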
Supported Internet Protocol
Issues Fixed in CDH 5.4.9
The following topics describe known issues fixed in CDH 5.4.9.
Apache Commons Collections Deserialization Vulnerability
Cloudera has learned of a potential security vulnerability in a third-party library called the Apache Commons Collections. This library is used in products distributed and supported by Cloudera (“Cloudera Products”), including core Apache Hadoop. The Apache Commons Collections library is also in widespread use beyond the Hadoop ecosystem. At this time, no specific attack vector for this vulnerability has been identified as present in Cloudera Products.
In an abundance of caution, we are currently in the process of incorporating a version of the Apache Commons Collections library with a fix into the Cloudera Products. In most cases, this will require coordination with the projects in the Apache community. One example of this is tracked by HADOOP-12577.
The Apache Commons Collections potential security vulnerability is titled “Arbitrary remote code execution with InvokerTransformer” and is tracked by COLLECTIONS-580. MITRE has not issued a CVE, but related CVE-2015-4852 has been filed for the vulnerability. CERT has issued Vulnerability Note #576313 for this issue.
Releases affected: CDH 5.5.0, CDH 5.4.8 and lower, Cloudera Manager 5.5.0, Cloudera Manager 5.4.8 and lower, Cloudera Navigator 2.4.0, Cloudera Navigator 2.3.8 and lower
Users affected: All
Severity (Low/Medium/High): High
Impact: This potential vulnerability may enable an attacker to execute arbitrary code from a remote machine without requiring authentication.
Immediate action required: Upgrade to Cloudera Manager 5.5.1 and CDH 5.5.1.
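The "Releases affected" line above can be turned into a simple check for a deployment inventory. This is a minimal sketch: the table is transcribed from that line only, `is_listed_as_affected` is a hypothetical helper name, and a result of false means only that the release is not listed above, not that it has been verified as fixed.

```python
# For each product: (highest affected release on the older line,
#                    the single affected release on the newer line).
AFFECTED = {
    "CDH": ((5, 4, 8), (5, 5, 0)),
    "Cloudera Manager": ((5, 4, 8), (5, 5, 0)),
    "Cloudera Navigator": ((2, 3, 8), (2, 4, 0)),
}


def parse_release(release):
    """Parse a three-part release string such as '5.4.8' into a tuple of ints."""
    parts = tuple(int(piece) for piece in release.strip().split("."))
    if len(parts) != 3:
        raise ValueError("expected a release of the form X.Y.Z: %r" % release)
    return parts


def is_listed_as_affected(product, release):
    """Return True if the release falls in the affected ranges listed above."""
    upper_bound, exact = AFFECTED[product]
    version = parse_release(release)
    return version <= upper_bound or version == exact
```

For example, `is_listed_as_affected("CDH", "5.4.8")` is true, while `is_listed_as_affected("CDH", "5.5.1")` is false.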
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.4.9:
- FLUME-2841 - Upgrade commons-collections to 3.2.2
- HADOOP-7713 - dfs -count -q should label output column
- HADOOP-11171 - Enable using a proxy server to connect to S3a
- HADOOP-12568 - Update core-default.xml to describe posixGroups support
- HADOOP-12577 - Bumped up commons-collections version to 3.2.2 to address a security flaw
- HDFS-7785 - Improve diagnostics information for HttpPutFailedException
- HDFS-7798 - Checkpointing failure caused by shared KerberosAuthenticator
- HDFS-7871 - NameNodeEditLogRoller can keep printing 'Swallowing exception' message
- HDFS-7990 - IBR delete ack should not be delayed
- HDFS-8646 - Prune cached replicas from DatanodeDescriptor state on replica invalidation
- HDFS-9123 - Copying from the root to a subdirectory should be forbidden
- HDFS-9250 - Add Precondition check to LocatedBlock#addCachedLoc
- HDFS-9273 - ACLs on root directory may be lost after NN restart
- HDFS-9332 - Fix Precondition failures from NameNodeEditLogRoller while saving namespace
- HDFS-9364 - Unnecessary DNS resolution attempts when creating NameNodeProxies
- HDFS-9470 - Encryption zone on root not loaded from fsimage after NN restart
- MAPREDUCE-6191 - Improve clearing stale state of Java serialization
- MAPREDUCE-6549 - Multibyte delimiters with LineRecordReader cause duplicate records
- YARN-4235 - FairScheduler PrimaryGroup does not handle empty groups returned for a user
- HBASE-6617 - ReplicationSourceManager should be able to track multiple WAL paths
- HBASE-12865 - WALs may be deleted before they are replicated to peers
- HBASE-13134 - mutateRow and checkAndMutate APIs don't throw region level exceptions
- HBASE-13618 - ReplicationSource is too eager to remove sinks.
- HBASE-13703 - ReplicateContext should not be a member of ReplicationSource.
- HBASE-14003 - Work around JDK-8044053
- HBASE-14283 - Reverse scan doesn’t work with HFile inline index/bloom blocks
- HBASE-14374 - Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1
- HBASE-14501 - NPE in replication with TDE
- HBASE-14533 - Connection Idle time 1 second is too short and the connection is closed too quickly by the ChoreService
- HBASE-14547 - Add more debug/trace to zk-procedure
- HBASE-14799 - Commons-collections object deserialization remote command execution vulnerability
- HBASE-14809 - Grant / revoke Namespace admin permission to group
- HIVE-7575 - Revert "GetTables thrift call is very slow"
- HIVE-7575 - GetTables thrift call is very slow
- HIVE-10265 - Hive CLI crashes on != inequality
- HIVE-11149 - Sometimes HashMap in PerfLogger.java hangs
- HIVE-11616 - DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues
- HIVE-12058 - Change hive script to record errors when calling hbase fails
- HIVE-12188 - DoAs does not work properly in non-Kerberos secured HS2
- HIVE-12189 - The list in pushdownPreds of ppd.ExprWalkerInfo should not be allowed to grow very large
- HIVE-12250 - ZooKeeper connection leaks in Hive's HBaseHandler
- HIVE-12365 - Added resource path is sent to cluster as an empty string when externally removed
- HIVE-12378 - Exception on HBaseSerDe.serialize binary field
- HIVE-12406 - HIVE-9500 introduced incompatible change to LazySimpleSerDe public interface
- HIVE-12418 - HiveHBaseTableInputFormat.getRecordReader() causes ZooKeeper connection leak
- HUE-2941 - [hadoop] Cache the active RM HA
- HUE-3035 - [beeswax] Optimize sample data query for partitioned tables
- IMPALA-1459 - Fix migration/assignment of On-clause predicates inside inline views.
- IMPALA-1675 - Avoid overflow when adding large intervals to TIMESTAMPs
- IMPALA-1746 - QueryExecState doesn't check for query cancellation or errors
- IMPALA-1949 - Analysis exception when a binary operator contains an IN operator with values
- IMPALA-2086/IMPALA-2090 - Avoid boost year/month interval logic
- IMPALA-2141 - UnionNode::GetNext() doesn't check for query errors
- IMPALA-2252 - Crash (likely race) tearing down BufferedBlockMgr on query failure
- IMPALA-2260 - Adding a large hour interval caused an interval overflow
- IMPALA-2265 - Sorter was not checking the returned Status of PrepareRead
- IMPALA-2273 - Make MAX_PAGE_HEADER_SIZE configurable
- IMPALA-2286 - Fix race between ~BufferedBlockMgr() and BufferedBlockMgr::Create()
- IMPALA-2344 - Work around IMPALA-2344: fail query with OOM in case block->Pin() fails
- IMPALA-2357 - Fix spilling sorts with var-len slots that are NULL or empty.
- IMPALA-2446 - Fix wrong predicate assignment in outer joins
- IMPALA-2533 - Impala throws IllegalStateException when inserting data into a partition
- IMPALA-2559 - Fix check failed: sorter_runs_.back()->is_pinned_
- IMPALA-2664 - Avoid sending large partition stats objects over thrift
- KITE-1089 - readAvroContainer morphline command should work even if the Avro writer schema of each input file is different
- PIG-3641 - Split "otherwise" producing incorrect output when combined with ColumnPruning
- SENTRY-565 - Improve performance of filtering Hive SHOW commands
- SENTRY-702 - Hive binding should support RELOAD command
- SENTRY-936 - getGroup and getUser should always return original hdfs values for paths in prefixes which are not Sentry managed
- SENTRY-960 - Blacklist reflect, java_method using hive.server2.builtin.udf.blacklist
- SOLR-6443 - Backport: disable test that fails on Jenkins with SolrCore.getOpenCount()==2
- SOLR-7049 - LIST Collections API call should be processed directly by the CollectionsHandler instead of the OverseerCollectionProcessor
- SOLR-7552 - Support using ZkCredentialsProvider/ZkACLProvider in custom filter
- SOLR-7989 - After a new leader is elected, it should ensure its state is ACTIVE if it has already registered with ZK
- SOLR-8075 - Leader Initiated Recovery should not stop a leader that participated in an election with all of its replicas from becoming a valid leader
- SOLR-8223 - Avoid accidentally swallowing OutOfMemoryError
- SOLR-8288 - DistributedUpdateProcessor#doFinish should explicitly check and ensure it does not try to put itself into LIR
- SPARK-11484 - [WEBUI] Using proxyBase set by spark AM
- SPARK-11652 - [CORE] Remote code execution with InvokerTransformer
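Several of the fixes above (FLUME-2841, HADOOP-12577) raise commons-collections to 3.2.2. One rough way to look for older copies on a host is to scan a directory tree for jar files by name. The sketch below assumes jars follow the conventional `commons-collections-X.Y.Z.jar` naming; it will not detect copies that are renamed, shaded, or bundled inside other archives, and the function name is hypothetical.

```python
import os
import re

FIXED_VERSION = (3, 2, 2)  # version named in the fixes listed above

# Matches 'commons-collections-3.2.1.jar' but not 'commons-collections4-4.0.jar'.
JAR_PATTERN = re.compile(r"^commons-collections-(\d+(?:\.\d+)*)\.jar$")


def find_old_commons_collections(root):
    """Yield (path, version) for matching jars under root older than 3.2.2."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            match = JAR_PATTERN.match(name)
            if match is None:
                continue
            version = tuple(int(piece) for piece in match.group(1).split("."))
            if version < FIXED_VERSION:
                yield os.path.join(dirpath, name), version
```

Calling `list(find_old_commons_collections(some_directory))` returns the flagged jar paths with their parsed versions; which directory to scan depends on how the cluster was installed.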