Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components have become standards, you can build long-term architecture on them with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.

 

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                        Version                                      Packages

Red Hat Enterprise Linux (RHEL)-compatible
Red Hat Enterprise Linux                5.7                                          64-bit
                                        5.10                                         64-bit
                                        6.4                                          64-bit
                                        6.5                                          64-bit
                                        6.5 in SELinux mode                          64-bit
                                        6.6                                          64-bit
CentOS                                  5.7                                          64-bit
                                        5.10                                         64-bit
                                        6.4                                          64-bit
                                        6.5                                          64-bit
                                        6.5 in SELinux mode                          64-bit
                                        6.6                                          64-bit
Oracle Linux with default kernel and    5.6 (UEK R2)                                 64-bit
Unbreakable Enterprise Kernel           6.4 (UEK R2)                                 64-bit
                                        6.5 (UEK R2, UEK R3)                         64-bit
                                        6.6 (UEK R3)                                 64-bit

SLES
SUSE Linux Enterprise Server (SLES)     11 with Service Pack 2                       64-bit
SUSE Linux Enterprise Server (SLES)     11 with Service Pack 3                       64-bit

Ubuntu/Debian
Ubuntu                                  Precise (12.04) - Long-Term Support (LTS)    64-bit
                                        Trusty (14.04) - Long-Term Support (LTS)     64-bit
Debian                                  Wheezy (7.0)                                 64-bit

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
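
For a pre-installation check, the operating system matrix above can be encoded as a simple lookup. The following is a minimal sketch, not part of any Cloudera tool: the names SUPPORTED_OS and is_supported_os are hypothetical, and writing SLES 11 SP2/SP3 as "11.2"/"11.3" is an assumption made only for this example.

```python
# Hypothetical lookup built from the supported operating systems table.
# The keys and the "11.2"/"11.3" spelling for SLES service packs are
# assumptions for illustration only.
SUPPORTED_OS = {
    "rhel": {"5.7", "5.10", "6.4", "6.5", "6.6"},
    "centos": {"5.7", "5.10", "6.4", "6.5", "6.6"},
    "oracle": {"5.6", "6.4", "6.5", "6.6"},
    "sles": {"11.2", "11.3"},
    "ubuntu": {"12.04", "14.04"},
    "debian": {"7.0"},
}

# CDH 5 provides only 64-bit packages.
SUPPORTED_ARCHES = {"x86_64", "amd64"}


def is_supported_os(distro, version, arch="x86_64"):
    """Return True if the distro/version pair is listed and arch is 64-bit."""
    if arch not in SUPPORTED_ARCHES:
        return False
    return version in SUPPORTED_OS.get(distro.lower(), set())
```

A lookup like this only reflects the table; a distribution that is absent (for example Fedora, which the note above describes as untested) is reported as unsupported.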
Component      MySQL                    SQLite     PostgreSQL                    Oracle        Derby (see Note 5)
Oozie          5.5, 5.6                 -          8.4, 9.2, 9.3 (see Note 2)    11gR2         Default
Flume          -                        -          -                             -             Default (for the JDBC Channel only)
Hue            5.5, 5.6 (see Note 1)    Default    8.4, 9.2, 9.3 (see Note 2)    11gR2         -
Hive/Impala    5.5, 5.6 (see Note 1)    -          8.4, 9.2, 9.3 (see Note 2)    11gR2         Default
Sentry         5.5, 5.6 (see Note 1)    -          8.4, 9.2, 9.3 (see Note 2)    11gR2         -
Sqoop 1        See Note 3               -          See Note 3                    See Note 3    -
Sqoop 2        See Note 4               -          See Note 4                    See Note 4    Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.
  2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
  3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
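
Note 2 ties each PostgreSQL version to a minimum CDH release, which can be expressed as a small compatibility check. This is a rough sketch under stated assumptions: the names PG_MIN_CDH, parse_version, and postgres_supported are hypothetical, and because PostgreSQL 8.4 is listed in the table without a minimum release, the example assumes it applies from CDH 5.0.

```python
# Minimum CDH release (major, minor) for each PostgreSQL version, per Note 2.
# 8.4 has no stated minimum in the notes; (5, 0) is an assumption.
PG_MIN_CDH = {
    "8.4": (5, 0),
    "9.2": (5, 1),
    "9.3": (5, 2),
}


def parse_version(text):
    """Turn a dotted version string such as '5.4.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def postgres_supported(pg_version, cdh_version):
    """Return True if pg_version is listed and cdh_version meets its minimum."""
    minimum = PG_MIN_CDH.get(pg_version)
    if minimum is None:
        return False
    return parse_version(cdh_version)[:2] >= minimum
```

For example, postgres_supported("9.3", "5.1.0") is False because Note 2 lists PostgreSQL 9.3 only for CDH 5.2 and later.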
CDH 5.4.x is supported with the versions shown in the following table:
Minimum Supported Version    Recommended Version     Notes
1.7.0_55                     1.7.0_67 or 1.7.0_75    None
1.8.0_60                     1.8.0_60                None
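
JDK version strings such as 1.7.0_55 do not compare correctly as plain text (for example, "1.7.0_9" sorts after "1.7.0_55"), so a check against the minimums above needs to compare the numeric parts. The sketch below is illustrative only: MIN_JDK, parse_jdk, and jdk_meets_minimum are hypothetical names, and treating every later update within the same 1.7 or 1.8 line as acceptable is an assumption, not a statement from the table.

```python
import re

# Minimum supported update per JDK line, taken from the table above.
MIN_JDK = {
    (1, 7): (1, 7, 0, 55),
    (1, 8): (1, 8, 0, 60),
}

_JDK_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)_(\d+)")


def parse_jdk(version):
    """Parse a string like '1.7.0_67' into the tuple (1, 7, 0, 67)."""
    match = _JDK_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError("unrecognized JDK version: %r" % version)
    return tuple(int(group) for group in match.groups())


def jdk_meets_minimum(version):
    """Return True if version is in a listed JDK line and not below its minimum."""
    parsed = parse_jdk(version)
    minimum = MIN_JDK.get(parsed[:2])
    return minimum is not None and parsed >= minimum
```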

CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
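
Because CDH requires IPv4, it can be useful to confirm that a cluster hostname resolves to at least one IPv4 address before installation. The following sketch uses only the Python standard library; the helper names ipv4_addresses and is_ipv4 are hypothetical, and the check says nothing about the rest of the network configuration.

```python
import ipaddress
import socket


def is_ipv4(address):
    """Return True if address is a literal IPv4 address such as '10.0.0.1'."""
    try:
        return isinstance(ipaddress.ip_address(address), ipaddress.IPv4Address)
    except ValueError:
        return False


def ipv4_addresses(hostname):
    """Return the sorted IPv4 addresses the resolver reports for hostname.

    Raises socket.gaierror if the name has no IPv4 resolution.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

A host for which ipv4_addresses raises socket.gaierror resolves only over IPv6 (or not at all) and would need its name resolution fixed first.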


Issues Fixed in CDH 5.4.7

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.4.7:

  • CRUNCH-525 - Correct (more) accurate default scale factors for built-in MapFn implementations
  • CRUNCH-527 - Use hash smearing for partitioning
  • CRUNCH-528 - Improve Pair comparison
  • CRUNCH-531 - Fix split graph rendering typo.
  • CRUNCH-535 - call initCredentials on the job
  • CRUNCH-536 - Refactor CrunchControlledJob.Hook interface and make it client-accessible
  • CRUNCH-539 - Fix reading WritableComparables bimap
  • CRUNCH-540 - Make AvroReflectDeepCopier serializable
  • CRUNCH-543 - Have AvroPathPerKeyTarget handle child directories properly
  • CRUNCH-544 - Improve performance/serializability of materialized toMap.
  • CRUNCH-546 - Remove calls to CellUtil.cloneXXX
  • CRUNCH-547 - Properly handle nullability for Avro union types
  • CRUNCH-548 - Have the AvroReflectDeepCopier use the class of the source object when constructing new instances instead of the target class
  • CRUNCH-551 - Make the use of Configuration objects consistent in CrunchInputSplit and CrunchRecordReader
  • CRUNCH-553 - Fix record drop issue that can occur w/From.formattedFile TableSources
  • FLUME-1934 - Spooling Directory Source dies on encountering zero-byte files.
  • FLUME-2753 - Error when specifying empty replace string in Search and Replace Interceptor
  • HADOOP-12317 - Applications fail on NM restart on some linux distro because NM container recovery declares AM container as LOST
  • HDFS-8806 - Inconsistent metrics: number of missing blocks with replication factor 1 not properly cleared
  • HDFS-8850 - VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks.
  • MAPREDUCE-5817 - Mappers get rescheduled on node transition even after all reducers are completed.
  • MAPREDUCE-6277 - Job can post multiple history files if attempt loses connection to the RM
  • MAPREDUCE-6439 - AM may fail instead of retrying if RM shuts down during the allocate call.
  • YARN-2921 - Fix MockRM/MockAM#waitForState sleep too long.
  • YARN-3823 - Fix mismatch in default values for yarn.scheduler.maximum-allocation-vcores property
  • YARN-3990 - AsyncDispatcher may be overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
  • HBASE-13329 - ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray.
  • HBASE-13437 - ThriftServer leaks ZooKeeper connections
  • HBASE-13471 - Fix a possible infinite loop in doMiniBatchMutation
  • HBASE-13684 - Allow mlockagent to be used when not starting as root
  • HBASE-14162 - Fixing maven target for regenerating thrift classes fails against 0.9.2
  • HBASE-14354 - Minor improvements for usage of the mlock agent
  • HIVE-7476 - CTAS does not work properly for s3
  • HIVE-9327 - CBO (Calcite Return Path): Removing Row Resolvers from ParseContext
  • HIVE-9512 - HIVE-9327 causing regression in stats annotation
  • HIVE-9580 - Server returns incorrect result from JOIN ON VARCHAR columns
  • HIVE-9613 - Left join query plan outputs wrong column when using subquery
  • HIVE-10085 - Lateral view on top of a view throws RuntimeException
  • HIVE-10140 - Window boundary is not compared correctly
  • HIVE-10288 - Cannot call permanent UDFs
  • HIVE-10319 - Hive CLI startup takes a long time with a large number of databases
  • HIVE-10719 - Hive metastore failure when alter table rename is attempted.
  • HIVE-10875 - Select query with view in subquery adds underlying table as direct input
  • HIVE-10906 - Value based UDAF function without orderby expression throws NPE
  • HIVE-10911 - Add support for date datatype in the value based windowing function
  • HIVE-10972 - DummyTxnManager always locks the current database in shared mode, which is incorrect
  • HIVE-10985 - Value based windowing on timestamp and double can't handle NULL value
  • HIVE-10996 - Aggregation / Projection over Multi-Join Inner Query producing incorrect results
  • HIVE-11139 - PROPOSEDQTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to
  • HIVE-11172 - Vectorization wrong results for aggregate query with where clause without group by
  • HIVE-11203 - Beeline force option doesn't force execution when errors occurred in a script.
  • HIVE-11250 - Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2
  • HIVE-11255 - get_table_objects_by_name() in HiveMetaStore.java needs to retrieve table objects in multiple batches
  • HIVE-11258 - The function drop_database_core() of HiveMetaStore.java may not drop all the tables
  • HIVE-11271 - java.lang.IndexOutOfBoundsException when union all with if function
  • HIVE-11288 - Avro SerDe InstanceCache returns incorrect schema
  • HIVE-11333 - ColumnPruner prunes columns of UnionOperator that should be kept
  • HIVE-11502 - Map side aggregation is extremely slow
  • HIVE-11604 - HIVE return wrong results in some queries with PTF function
  • HIVE-11620 - Fix several qtest output order
  • HUE-2664 - [jobbrowser] Fix fetching logs from job history server
  • HUE-2873 - [oozie] Handle TransactionManagementError on workflow dashboard
  • HUE-2877 - [desktop] Add pyasn1 and ndg_httpsclient to support SSL Server Name Indication
  • HUE-2880 - [hadoop] Fix uploading large files to a kerberized HTTPFS
  • HUE-2882 - [oozie] Fix parsing error when workflow job uses Australian timezone
  • HUE-2883 - [impala] Canceling a query shows an error message
  • HUE-2885 - [oozie] Java options java-opts not generated correctly in XML
  • HUE-2893 - [desktop] Backport CherryPy SSL file upload fix
  • HUE-2903 - [oozie] Fix error with Workflow parameter on rerun
  • IMPALA-1737 - Substitute an InsertStmt's partition key exprs with the root node's smap.
  • IMPALA-1756 - Constant filter expressions are not checked for errors and state cleanup is not done before throwing exception.
  • IMPALA-1898 - Explicit aliases + ordinals analysis bug
  • IMPALA-1983 - Warn if table stats are potentially corrupt.
  • IMPALA-1987 - Fix TupleIsNullPredicate to return false if no tuples are nullable.
  • IMPALA-2088 - Fix planning of empty union operands with analytics.
  • IMPALA-2089 - Retain eq predicates bound by grouping slots with complex grouping exprs.
  • IMPALA-2178 - fix Expr::ComputeResultsLayout() logic.
  • IMPALA-2199 - Row count not set for empty partition when spec is used with compute incremental stats
  • IMPALA-2201 - Unconditionally update the partition stats and row count.
  • IMPALA-2203 - Set an InsertStmt's result exprs from the source statement's result exprs.
  • IMPALA-2216 - Set the output smap of an EmptySetNode produced from an empty inline view.
  • IMPALA-2239 - update misc.test to match the new .test file format.
  • IMPALA-2266 - Pass correct child node in 2nd phase merge aggregation.
  • KITE-1053 - Fix int overflow bug in FS writer.
  • SENTRY-810 - CTAS without location is not verified properly
  • SOLR-7135 - Allow the server build.xml 'sync-hack' target to be skipped by specifying a system property.
  • SOLR-7999 - SolrRequestParserTest#testStreamURL started failing.
  • SPARK-8606 - Prevent exceptions in RDD.getPreferredLocations() from crashing DAGScheduler
  • ZOOKEEPER-442 - need a way to remove watches that are no longer of interest

