Long-Term Component Architecture
As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.
With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already running the latest 5.5.x release, you do not need to upgrade.
- System Requirements
- What's New
- Supported Operating Systems
- Supported Databases
- Supported JDK Versions
- Supported Internet Protocol
Supported Databases
| Component | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 5) |
|---|---|---|---|---|---|
| Oozie | 5.5, 5.6 | - | 8.4, 9.1, 9.2, 9.3 (see Note 2) | | |
| Flume | - | - | - | - | Default (for the JDBC Channel only) |
| | (see Note 1) | Default | 8.4, 9.1, 9.2, 9.3 (see Note 2) | | |
| | (see Note 1) | - | 8.4, 9.1, 9.2, 9.3 (see Note 2) | | |
| | | - | 8.4, 9.1, 9.2, 9.3 (see Note 2) | | |
| Sqoop 1 | See Note 3 | - | See Note 3 | See Note 3 | - |
| Sqoop 2 | See Note 4 | - | See Note 4 | See Note 4 | Default |
1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later.
2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby.
5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
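To make Note 3 concrete, the following is a minimal sketch of a Sqoop 1 import from one of the supported databases (MySQL). The host, database, table, and username below are placeholders for illustration, not values from this document.

```shell
# Sketch only: db.example.com, sales, orders, and loader are hypothetical.
# Imports the MySQL table "orders" into HDFS under /data/orders;
# -P prompts for the database password instead of putting it on the command line.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username loader -P \
  --table orders \
  --target-dir /data/orders
```

Note that the supported-database list in Note 3 applies only to data transfer; the Sqoop 1 metastore itself must remain on HSQLDB 1.8.x.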
Supported JDK Versions
CDH 5 is supported with the versions shown in the table that follows.
Table 1. Supported JDK Versions
| Latest Certified Version | Minimum Supported Version | Exceptions |
|---|---|---|
Supported Internet Protocol
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.3.2:
- AVRO-1630 - Creating Builder from instance loses data
- AVRO-1628 - Add Schema.createUnion(Schema... type)
- AVRO-1539 - Add FileSystem-based FsInput Constructor
- AVRO-1623 - GenericData#validate() of enum: IndexOutOfBoundsException
- AVRO-1614 - Always getting a value...
- AVRO-1592 - Java keyword as an enum constant in Avro schema file causes deserialization to fail.
- AVRO-1619 - Generate better JavaDoc
- AVRO-1622 - Add missing license headers
- AVRO-1604 - ReflectData.AllowNull fails to generate schemas when @Nullable is present.
- AVRO-1407 - NettyTransceiver can cause an infinite loop when slow to connect
- AVRO-834 - Data File corruption recovery tool
- AVRO-1596 - Cannot read past corrupted block in Avro data file
- HADOOP-11350 - The size of header buffer of HttpServer is too small when HTTPS is enabled
- HDFS-7707 - Edit log corruption due to delayed block removal again
- HDFS-7718 - Store KeyProvider in ClientContext to avoid leaking key provider threads when using FileContext
- HDFS-6425 - Large postponedMisreplicatedBlocks has impact on blockReport latency
- HDFS-7560 - ACLs removed by removeDefaultAcl() will be back after NameNode restart/failover
- HDFS-7513 - HDFS inotify: add defaultBlockSize to CreateEvent
- HDFS-7158 - Reduce the memory usage of WebImageViewer
- HDFS-7497 - Inconsistent report of decommissioning DataNodes between dfsadmin and NameNode webui
- HDFS-6917 - Add an hdfs debug command to validate blocks, call recoverlease, etc.
- HDFS-6779 - Add missing version subcommand for hdfs
- YARN-2697 - RMAuthenticationHandler is no longer useful
- YARN-2656 - RM web services authentication filter should add support for proxy user
- YARN-3082 - Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
- YARN-3079 - Scheduler should also update maximumAllocation when updateNodeResource.
- YARN-2992 - ZKRMStateStore crashes due to session expiry
- YARN-2675 - containersKilled metrics is not updated when the container is killed during localization
- YARN-2715 - Proxy user is problem for RPC interface if yarn.resourcemanager.webapp.proxyuser is not set.
- MAPREDUCE-6198 - NPE from JobTracker#resolveAndAddToTopology in MR1 cause initJob and heartbeat failure.
- MAPREDUCE-6196 - Fix BigDecimal ArithmeticException in PiEstimator
- HBASE-12540 - TestRegionServerMetrics#testMobMetrics test failure
- HBASE-12533 - staging directories are not deleted after secure bulk load
- HBASE-12077 - FilterLists create many ArrayList$Itr objects per row.
- HBASE-12386 - Replication gets stuck following a transient zookeeper error to remote peer cluster
- HBASE-11979 - Compaction progress reporting is wrong
- HBASE-12445 - hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
- HBASE-12837 - ReplicationAdmin leaks zk connections
- HIVE-7647 - Beeline does not honor --headerInterval and --color when executing with "-e"
- HIVE-7733 - Ambiguous column reference error on query
- HIVE-9303 - Parquet files are written with incorrect definition levels
- HIVE-8444 - update pom to junit 4.11
- HIVE-9474 - truncate table changes permissions on the target
- HIVE-9462 - HIVE-8577 - breaks type evolution
- HIVE-9482 - Hive parquet timestamp compatibility
- HIVE-6308 - COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
- HIVE-9502 - Parquet cannot read Map types from files written with Hive 0.12 or earlier
- HIVE-9445 - Revert HIVE-5700 - enforce single date format for partition column storage
- HIVE-9393 - reduce noisy log level of ColumnarSerDe.java:116 from INFO to DEBUG
- HIVE-7800 - Parquet Column Index Access Schema Size Checking
- HIVE-9330 - DummyTxnManager will throw NPE if WriteEntity writeType has not been set
- HIVE-9265 - Hive with encryption throws NPE to fs path without schema
- HIVE-9199 - Excessive exclusive lock used in some DDLs with DummyTxnManager
- HIVE-6978 - beeline always exits with 0 status, should exit with non-zero status on error
- HUE-2556 - [core] Cannot update project tags of a document
- HUE-2528 - Partitions limit gets capped to 1000 despite configuration
- HUE-2548 - [metastore] Create table then load data does redirect to the table page
- HUE-2525 - [core] Fix manual install of samples
- HUE-2501 - [metastore] Creating a table with header files bigger than 64MB truncates it
- HUE-2484 - [beeswax] Configure support for Hive Server2 LDAP authentication
- HUE-2532 - [search] Fix share URL on Internet Explorer
- HUE-2531 - [impala] Autogrow missing result list
- HUE-2524 - [impala] Sort numerically recent queries tab
- HUE-2495 - [oozie] Improve dashboards sorting mechanism
- HUE-2511 - [impala] Infinite scroll keeps fetching results even if finished
- HUE-2102 - [oozie] Workflow with credentials can't be used with Coordinator
- HUE-2152 - [pig] Credentials support in editor
- OOZIE-2131 - Add flag to sqoop action to skip hbase delegation token generation
- OOZIE-2047 - Oozie does not support Hive tables that use datatypes introduced since Hive 0.8
- OOZIE-2102 - Streaming actions are broken cause of incorrect method signature
- PARQUET-173 - StatisticsFilter doesn't handle And properly
- PARQUET-157 - Divide by zero in logging code
- PARQUET-142 - parquet-tools doesn't filter _SUCCESS file
- PARQUET-124 - parquet.hadoop.ParquetOutputCommitter.commitJob() throws parquet.io.ParquetEncodingException
- PARQUET-136 - NPE thrown in StatisticsFilter when all values in a string/binary column trunk are null
- PARQUET-168 - Wrong command line option description in parquet-tools
- PARQUET-145 - InternalParquetRecordReader.close() should not throw an exception if initialization has failed
- PARQUET-140 - Allow clients to control the GenericData object that is used to read Avro records
- SOLR-7033 - RecoveryStrategy should not publish any state when closed / cancelled.
- SOLR-5961 - Solr gets crazy on /overseer/queue state change
- SOLR-6640 - Replication can cause index corruption
- SOLR-5875 - QueryComponent.mergeIds() unmarshals all docs' sort field values once per doc instead of once per shard
- SOLR-6919 - Log REST info before executing
- SOLR-6969 - When opening an HDFSTransactionLog for append we must first attempt to recover its lease to prevent data loss.
- SOLR-5515 - NPE when getting stats on date field with empty result on solrcloud
- SPARK-3778 - newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn
- SPARK-4835 - Streaming saveAs*HadoopFiles() methods may throw FileAlreadyExistsException during checkpoint recovery
- SQOOP-2057 - Skip delegation token generation flag during hbase import
- SQOOP-1779 - Add support for --hive-database when importing Parquet files into Hive
- IMPALA-1622 - Fix overflow in StringParser::StringToFloatInternal()
- IMPALA-1614 - Compute stats fails if table name starts with number
- IMPALA-1623 - unix_timestamp() does not return correct time
- IMPALA-1535 - Partition pruning with NULL
- IMPALA-1606 - Impala does not always give short name to Llama
- IMPALA-1120 - Fetch column statistics using Hive 0.13 bulk API
In addition, CDH 5.3.2 reverts YARN-2713, which has caused problems since its inclusion in CDH 5.3.0.