
Long-term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already using the latest 5.5.x release, you do not need to upgrade.

 

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System                        Version                                     Packages
Red Hat Enterprise Linux (RHEL)-compatible
Red Hat Enterprise Linux                5.7                                         64-bit
                                        5.10                                        64-bit
                                        6.4                                         64-bit
                                        6.5                                         64-bit
                                        6.5 in SELinux mode                         64-bit
                                        6.6                                         64-bit
CentOS                                  5.7                                         64-bit
                                        5.10                                        64-bit
                                        6.4                                         64-bit
                                        6.5                                         64-bit
                                        6.5 in SELinux mode                         64-bit
                                        6.6                                         64-bit
Oracle Linux with default kernel and    5.6 (UEK R2)                                64-bit
Unbreakable Enterprise Kernel           6.4 (UEK R2)                                64-bit
                                        6.5 (UEK R2, UEK R3)                        64-bit
                                        6.6 (UEK R3)                                64-bit
SLES
SUSE Linux Enterprise Server (SLES)     11 with Service Pack 2                      64-bit
SUSE Linux Enterprise Server (SLES)     11 with Service Pack 3                      64-bit
Ubuntu/Debian
Ubuntu                                  Precise (12.04) - Long-Term Support (LTS)   64-bit
                                        Trusty (14.04) - Long-Term Support (LTS)    64-bit
Debian                                  Wheezy (7.0)                                64-bit

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
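
A pre-install script could check a host against the support table with a simple lookup. The following is a minimal sketch, not Cloudera tooling: the short distribution keys ("rhel", "centos", and so on), the "11 SP2"/"11 SP3" version labels, and the function name are hypothetical choices made for this example, and the version strings are transcribed from the table as published.

```python
# Supported (distribution, version) pairs transcribed from the CDH 5 table.
# The dictionary keys and the "11 SP2"/"11 SP3" labels are hypothetical
# shorthand chosen for this sketch, not identifiers used by Cloudera.
SUPPORTED_OS = {
    "rhel": {"5.7", "5.10", "6.4", "6.5", "6.6"},
    "centos": {"5.7", "5.10", "6.4", "6.5", "6.6"},
    "oracle": {"5.6", "6.4", "6.5", "6.6"},
    "sles": {"11 SP2", "11 SP3"},
    "ubuntu": {"12.04", "14.04"},
    "debian": {"7.0"},
}


def is_supported_os(distribution, version):
    """Return True if the distribution/version pair appears in the table."""
    return version in SUPPORTED_OS.get(distribution.lower(), set())


if __name__ == "__main__":
    print(is_supported_os("CentOS", "6.5"))
    print(is_supported_os("Ubuntu", "16.04"))
```

The lookup only covers packaged releases; as the note says, Fedora and other unlisted systems are untested, so the sketch reports them as unsupported.
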
Component     MySQL                   SQLite    PostgreSQL                    Oracle      Derby (see Note 5)
Oozie         5.5, 5.6                -         8.4, 9.2, 9.3 (see Note 2)    11gR2       Default
Flume         -                       -         -                             -           Default (for the JDBC Channel only)
Hue           5.5, 5.6 (see Note 1)   Default   8.4, 9.2, 9.3 (see Note 2)    11gR2       -
Hive/Impala   5.5, 5.6 (see Note 1)   -         8.4, 9.2, 9.3 (see Note 2)    11gR2       Default
Sentry        5.5, 5.6 (see Note 1)   -         8.4, 9.2, 9.3 (see Note 2)    11gR2       -
Sqoop 1       See Note 3              -         See Note 3                    See Note 3  -
Sqoop 2       See Note 4              -         See Note 4                    See Note 4  Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.
  2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
  3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
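
Note 2 ties each PostgreSQL version to a minimum CDH release, which can be expressed as a small lookup. This is an illustrative sketch only: the function and dictionary names are hypothetical, and the entry for PostgreSQL 8.4 assumes it is supported from CDH 5.0 onward because the notes state no minimum for it.

```python
# Earliest CDH release supporting each PostgreSQL version, per Note 2.
# The (5, 0) entry for 8.4 is an assumption: the notes give no minimum for it.
POSTGRESQL_MIN_CDH = {
    "8.4": (5, 0),
    "9.2": (5, 1),
    "9.3": (5, 2),
}


def postgresql_supported(pg_version, cdh_version):
    """Return True if the PostgreSQL version is listed for the CDH release."""
    minimum = POSTGRESQL_MIN_CDH.get(pg_version)
    if minimum is None:
        # Versions absent from the table are treated as unsupported.
        return False
    # Compare only major.minor, e.g. "5.4.5" becomes (5, 4).
    cdh = tuple(int(part) for part in cdh_version.split(".")[:2])
    return cdh >= minimum


if __name__ == "__main__":
    print(postgresql_supported("9.3", "5.4.5"))
    print(postgresql_supported("9.3", "5.1.0"))
```

The check applies only to the components whose table rows list these PostgreSQL versions (Oozie, Hue, Hive/Impala, and Sentry); Sqoop 1 and Sqoop 2 follow Notes 3 and 4 instead.
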
CDH 5.4.x is supported with the JDK versions shown in the following table:
Minimum Supported Version    Recommended Version     Notes
1.7.0_55                     1.7.0_67 or 1.7.0_75    None
1.8.0_60                     1.8.0_60                None
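
JDK version strings such as 1.7.0_55 do not compare correctly as plain text (for example, "1.7.0_9" sorts after "1.7.0_55"), so a check against the minimums above needs numeric comparison. The sketch below shows one way to do that; the function names are hypothetical, and it assumes the legacy `1.x.y_zz` version format used in the table.

```python
import re

# Minimum supported JDK builds for CDH 5.4.x, copied from the table above.
MINIMUM_JDK = {"1.7": "1.7.0_55", "1.8": "1.8.0_60"}


def parse_jdk_version(version):
    """Split a string such as '1.7.0_67' into the tuple (1, 7, 0, 67)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)_(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version):
    """Return True if the JDK build is at or above the table's minimum."""
    parsed = parse_jdk_version(version)
    minimum = MINIMUM_JDK.get(f"{parsed[0]}.{parsed[1]}")
    if minimum is None:
        # JDK families missing from the table (for example 1.6) fail the check.
        return False
    return parsed >= parse_jdk_version(minimum)


if __name__ == "__main__":
    print(meets_minimum("1.7.0_67"))
    print(meets_minimum("1.7.0_9"))
```

Comparing integer tuples keeps update numbers in numeric order, which is the reason for parsing rather than comparing the raw strings.
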

CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
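
Because CDH requires IPv4, it can help to confirm that the addresses planned for cluster hosts are IPv4 before installation. The following sketch uses Python's standard `ipaddress` module; the function name is hypothetical, and the sample addresses are placeholders rather than values from this page.

```python
import ipaddress


def is_ipv4(address):
    """Return True only for a syntactically valid IPv4 address."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not a valid IPv4 or IPv6 address literal.
        return False


if __name__ == "__main__":
    for candidate in ("10.0.0.12", "fe80::1"):
        print(candidate, is_ipv4(candidate))
```

This checks address syntax only; it does not verify name resolution, which is covered in Configuring Network Names.
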


Issues Fixed in CDH 5.4.5

 

 

Upstream Issues Fixed

 

The following upstream issues are fixed in CDH 5.4.5:

  • CRUNCH-508 - Improve performance of Scala Enumeration counters in Scrunch
  • CRUNCH-511 - Scrunch product type support should use derived() instead of derivedImmutable()
  • CRUNCH-514 - AvroDerivedDeepCopier should initialize delegate MapFns
  • CRUNCH-516 - Scrunch needs some additional null checks
  • CRUNCH-530 - Fix object reuse bug in GenericRecordToTuple
  • CRUNCH-542 - Wider tolerance for flaky scrunch PCollectionTest
  • FLUME-2215 - ResettableFileInputStream can't support ucs-4 character
  • FLUME-2732 - Make maximum tolerated failures before shutting down and recreating client in AsyncHbaseSink configurable
  • FLUME-2738 - Async HBase sink FD leak on client shutdown
  • FLUME-2749 - Kerberos configuration error when using short names in multiple HDFS Sinks
  • HADOOP-12017 - Hadoop archives command should use configurable replication factor when closing
  • HADOOP-12103 - Small refactoring of DelegationTokenAuthenticationFilter to allow code sharing
  • HADOOP-8151 - Error handling in snappy decompressor throws invalid exceptions
  • HDFS-7501 - TransactionsSinceLastCheckpoint can be negative on SBNs
  • HDFS-7546 - Document, and set an accepting default for dfs.namenode.kerberos.principal.pattern
  • HDFS-7890 - Improve information on Top users for metrics in RollingWindowsManager and lower log level
  • HDFS-7894 - Rolling upgrade readiness is not updated in jmx until query command is issued.
  • HDFS-8072 - Reserved RBW space is not released if client terminates while writing block
  • HDFS-8337 - Accessing httpfs via webhdfs doesn't work from a jar with kerberos
  • HDFS-8656 - Preserve compatibility of ClientProtocol#rollingUpgrade after finalization
  • HDFS-8681 - BlockScanner is incorrectly disabled by default
  • MAPREDUCE-5965 - Hadoop streaming throws error if list of input files is high.
  • YARN-3143 - RM Apps REST API can return NPE or entries missing id and other fields
  • YARN-3453 - Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
  • YARN-3535 - Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
  • YARN-3793 - Several NPEs when deleting local files on NM recovery
  • YARN-3842 - NMProxy should retry on NMNotYetReadyException
  • HBASE-13342 - Fix incorrect interface annotations
  • HBASE-13419 - Thrift gateway should propagate text from exception causes.
  • HBASE-13491 - Fix bug in FuzzyRowFilter#getNextForFuzzyRule
  • HBASE-13851 - RpcClientImpl.close() can hang with cancelled replica RPCs
  • HBASE-13885 - ZK watches leaks during snapshots
  • HBASE-13958 - RESTApiClusterManager calls kill() instead of suspend() and resume()
  • HBASE-13995 - ServerName is not fully case insensitive
  • HBASE-14027 - Clean up netty dependencies
  • HBASE-14045 - Bumping thrift version to 0.9.2.
  • HBASE-14076 - ResultSerialization and MutationSerialization can throw InvalidProtocolBufferException when serializing a cell larger than 64MB
  • HIVE-10252 - Make PPD work for Parquet in row group level
  • HIVE-10270 - Cannot use Decimal constants less than 0.1BD
  • HIVE-10553 - Remove hardcoded Parquet references from SearchArgumentImpl
  • HIVE-10706 - Make vectorized_timestamp_funcs test more stable
  • HIVE-10801 - 'drop view' fails throwing java.lang.NullPointerException
  • HIVE-10808 - Inner join on Null throwing Cast Exception
  • HIVE-11150 - Remove wrong warning message related to chgrp
  • HIVE-11174 - Hive does not treat floating point signed zeros as equal
  • HIVE-11216 - UDF GenericUDFMapKeys throws NPE when a null map value is passed in
  • HIVE-11401 - Predicate push down does not work with Parquet when partitions are in the expression
  • HIVE-6099 - Multi insert does not work properly with distinct count
  • HIVE-9500 - Support nested structs over 24 levels
  • HIVE-9665 - Parallel move task optimization causes race condition
  • HIVE-10427 - collect_list() and collect_set() should accept struct types as argument
  • HIVE-10437 - NullPointerException on queries where map/reduce is not involved on tables with partitions
  • HIVE-10895 - ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
  • HIVE-10976 - Redundant HiveMetaStore connect check in HS2 CLIService start
  • HIVE-10977 - No need to instantiate MetaStoreDirectSql when HMS DirectSql is disabled
  • HIVE-11095 - Fix SerDeUtils bug when Text is reused
  • HIVE-11100 - Beeline should escape semi-colon in queries
  • HIVE-11112 - ISO-8859-1 text output has fragments of previous longer rows appended
  • HIVE-11157 - Hive.get(HiveConf) returns same Hive object to different user sessions
  • HIVE-11194 - Exchange partition on external tables should fail with error message when target folder already exists
  • HIVE-11433 - NPE for a multiple inner join query
  • HIVE-9767 - Fixes in Hive UDF to be usable in Pig
  • HIVE-10629 - Dropping table in an encrypted zone does not drop warehouse directory
  • HIVE-10630 - Renaming tables across encryption zones renames table even though the operation throws error
  • HIVE-10659 - Beeline command which contains semi-colon as a non-command terminator will fail
  • HIVE-10788 - Change sort_array to support non-primitive types
  • HIVE-11109 - Replication factor is not properly set in SparkHashTableSinkOperator [Spark Branch]
  • HIVE-10594 - Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
  • HUE-2618 - [hive] Recent query results show character encoding in view
  • HUE-2767 - [impala] Issue showing sample data for a table
  • HUE-2796 - sync_groups_on_login doesn't work with posixGroups
  • HUE-2807 - [useradmin] Support deleting numeric groups
  • HUE-2808 - [dbquery] Add row numbers to support default order by
  • HUE-2813 - [hive] Report when Hue server is down when trying to execute a query
  • HUE-2814 - Revert pyopenssl 0.13.1
  • HUE-2835 - Fixed issue with DN's that have weird comma location
  • HUE-2840 - [useradmin] Fix create home directories for Add/Sync LDAP group
  • HUE-2849 - [useradmin] Fix exception in Add/Sync LDAP group for undefined group name
  • IMPALA-1929 - Avoiding a DCHECK of NULL hash table in spilled right joins
  • IMPALA-2136 - Bug in PrintTColumnValue caused wrong stats for TINYINT partition cols
  • IMPALA-2133 - Properly unescape string value for HBase filters
  • IMPALA-2018 - Where clause does not propagate to joins inside nested views
  • IMPALA-2064 - Add effective_user() builtin
  • IMPALA-2125 - Make UTC to local TimestampValue conversion faster.
  • IMPALA-2048 - Set the correct input format when updating partition metadata.
  • KITE-1014 - Fix support for Hive datasets on Kerberos enabled clusters.
  • KITE-1015 - Add "replaceValues" morphline command that replaces all matching record field values with a given replacement string
  • KITE-462 - Oozie jobs do not pass credentials
  • KITE-976 - DatasetKeyInputFormat/DatasetKeyOutputFormat not setting job configuration before loading dataset
  • KITE-1030 - readCSV WARN log msg on overly long lines where quoteChar is non-empty should print the whole record seen so far
  • OOZIE-2268 - Update ActiveMQ version for security and other fixes
  • OOZIE-2286 - Update Log4j and Log4j-extras to latest 1.2.x release
  • PIG-4053 - TestMRCompiler succeeded with sun jdk 1.6 while failed with sun jdk 1.7
  • PIG-4338 - Fix test failures with JDK8
  • PIG-4326 - AvroStorageSchemaConversionUtilities does not properly convert schema for maps of arrays of records
  • SENTRY-695 - Sentry service should read the hadoop group mapping properties from core-site
  • SENTRY-721 - HDFS Cascading permissions not applied to child file ACLs if a direct grant exists
  • SENTRY-752 - Sentry service audit log file name format should be consistent
  • SOLR-7457 - Make DirectoryFactory publishing MBeanInfo extensible
  • SOLR-7458 - Expose HDFS Block Locality Metrics
  • SPARK-6480 - histogram() bucket function is wrong in some simple edge cases
  • SPARK-6954 - ExecutorAllocationManager can end up requesting a negative number of executors
  • SPARK-7503 - Resources in .sparkStaging directory can't be cleaned up on error
  • SPARK-7705 - Cleanup of .sparkStaging directory fails if application is killed
  • SQOOP-2103 - Not able define Decimal(n,p) data type in map-column-hive option
  • SQOOP-2149 - Update Kite dependency to 1.0.0
  • SQOOP-2252 - Add default to Avro Schema
  • SQOOP-2294 - Change to Avro schema name breaks some use cases
  • SQOOP-2295 - Hive import with Parquet should append automatically
  • SQOOP-2327 - Sqoop2: Change package name from Authorization to authorization
  • SQOOP-2339 - Move sub-directory might fail in append mode
  • SQOOP-2362 - Add oracle direct mode in list of supported databases
  • SQOOP-2400 - hive.metastore.sasl.enabled should be set to true for Oozie integration
  • SQOOP-2406 - Add support for secure mode when importing Parquet files into Hive
  • SQOOP-2437 - Use hive configuration to connect to secure metastore

 

 

