Your browser is out of date!

Update your browser to view this website correctly. Update my browser now

×

Please Read and Accept our Terms


Long term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. As standards, you can build longterm architecture on these components with confidence.

 

PLEASE NOTE:

With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3  If you do not need DSSD support, you do not need to upgrade if you are already using the latest 5.5.x release.

 

CDH 5 provides packages for Red-Hat-compatible, SLES, Ubuntu, and Debian systems as described below.

Operating System Version Packages
Red Hat Enterprise Linux (RHEL)-compatible
Red Hat Enterprise Linux 5.7 64-bit
  5.10 64-bit
  6.4 64-bit
  6.5 64-bit
  6.5 in SE Linux mode 64-bit
  6.6 64-bit
CentOS 5.7 64-bit
  5.10 64-bit
  6.4 64-bit
  6.5 64-bit
  6.5 in SE Linux mode 64-bit
  6.6 64-bit
Oracle Linux with default kernel and Unbreakable Enterprise Kernel 5.6 (UEK R2) 64-bit
  6.4 (UEK R2) 64-bit
  6.5 (UEK R2, UEK R3) 64-bit
  6.6 (UEK R3) 64-bit
SLES
SUSE Linux Enterprise Server (SLES) 11 with Service Pack 2 64-bit
SUSE Linux Enterprise Server (SLES) 11 with Service Pack 3 64-bit
Ubuntu/Debian
Ubuntu Precise (12.04) - Long-Term Support (LTS) 64-bit
  Trusty (14.04) - Long-Term Support (LTS) 64-bit
Debian Wheezy (7.0) 64-bit

Note:

  • CDH 5 provides only 64-bit packages.
  • Cloudera has received reports that our RPMs work well on Fedora, but we have not tested this.
  • If you are using an operating system that is not supported by Cloudera packages, you can also download source tarballs from Downloads.
Selected tab: SupportedOperatingSystems
Component MySQL SQLite PostgreSQL Oracle Derby - see Note 4
Oozie 5.5, 5.6 8.4, 9.2, 9.3

See Note 2

11gR2 Default
Flume Default (for the JDBC Channel only)
Hue 5.5, 5.6

See Note 1

Default 8.4, 9.2, 9.3

See Note 2

11gR2
Hive/Impala 5.5, 5.6

See Note 1

8.4, 9.2, 9.3

See Note 2

11gR2 Default
Sentry 5.5, 5.6

See Note 1

8.4, 9.2, 9.3

See Note 2

11gR2
Sqoop 1 See Note 3 See Note 3 See Note 3
Sqoop 2 See Note 4 See Note 4 See Note 4 Default

Note:

  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.
  2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.
  3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
Selected tab: SupportedDatabases
CDH 5.4.x is supported with the versions shown in the following table:
Minimum Supported Version Recommended Version Notes
1.7.0_55 1.7.0_67 or 1.7.0_75 None
1.8.0_60 1.8.0_60 None
Selected tab: SupportedJDKVersions

CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.

Selected tab: SupportedInternetProtocol
Selected tab: SystemRequirements

Issues Fixed in CDH 5.4.11

  • FLUME-2891 - Revert FLUME-2712 and FLUME-2886
  • FLUME-2908 - NetcatSource - SocketChannel not closed when session is broken
  • HADOOP-8436 - NPE In getLocalPathForWrite ( path, conf ) when the required context item is not configured
  • HADOOP-8437 - getLocalPathForWrite should throw IOException for invalid paths
  • HADOOP-8751 - NPE in Token.toString() when Token is constructed using null identifier
  • HADOOP-8934 - Shell command ls should include sort options
  • HADOOP-10048 - LocalDirAllocator should avoid holding locks while accessing the filesystem
  • HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
  • HADOOP-11901 - BytesWritable fails to support 2G chunks due to integer overflow
  • HADOOP-12252 - LocalDirAllocator should not throw NPE with empty string configuration
  • HADOOP-12269 - Update aws-sdk dependency to 1.10.6
  • HADOOP-12787 - KMS SPNEGO sequence does not work with WEBHDFS
  • HADOOP-12841 - Update s3-related properties in core-default.xml.
  • HADOOP-12901 - Add warning log when KMSClientProvider cannot create a connection to the KMS server.
  • HADOOP-12972 - Lz4Compressor#getLibraryName returns the wrong version number
  • HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
  • HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
  • HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
  • HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
  • HADOOP-13255 - KMSClientProvider should check and renew tgt when doing delegation token operations
  • HADOOP-13263 - Reload cached groups in background after expiry.
  • HADOOP-13457 - Remove hardcoded absolute path for shell executable.
  • HDFS-4660 - Block corruption can happen during pipeline recovery
  • HDFS-8211 - DataNode UUID is always null in the JMX counter.
  • HDFS-8451 - DFSClient probe for encryption testing interprets empty URI property for enabled
  • HDFS-8496 - Calling stopWriter() with FSDatasetImpl lock held may block other threads
  • HDFS-8576 - Lease recovery should return true if the lease can be released and the file can be closed
  • HDFS-8722 - Optimize datanode writes for small writes and flushes
  • HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
  • HDFS-9220 - Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum
  • HDFS-9276 - Failed to Update HDFS Delegation Token for long running application in HA mode
  • HDFS-9466 - TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
  • HDFS-9589 - Block files which have been hardlinked should be duplicated before the DataNode appends to the them
  • HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
  • HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
  • HDFS-9939 - Increase DecompressorStream skip buffer size
  • HDFS-9949 - Add a test case to ensure that the DataNode does not regenerate its UUID when a storage directory is cleared
  • HDFS-10267 - Extra "synchronized" on FsDatasetImpl#recoverAppend and FsDatasetImpl#recoverClose
  • HDFS-10360 - DataNode might format directory and lose blocks if current/VERSION is missing.
  • HDFS-10381 - , DataStreamer DataNode exclusion log message should be warning.
  • MAPREDUCE-4785 - TestMRApp occasionally fails
  • MAPREDUCE-6580 - Test failure: TestMRJobsWithProfiler
  • YARN-2871 - TestRMRestart#testRMRestartGetApplicationList sometimes fails in trunk
  • YARN-3727 - Check if the directory exists before using it for localization
  • YARN-4168 - Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull
  • YARN-4354 - Public resource localization fails with NPE
  • YARN-4717 - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup
  • YARN-5048 - DelegationTokenRenewer#skipTokenRenewal might throw NPE
  • HBASE-6617 - ReplicationSourceManager should be able to track multiple WAL paths (ADDENDUM)
  • HBASE-11625 - Verifies data before building HFileBlock. - Adds HFileBlock.Header class which contains information about location of fields. Testing: Adds CorruptedFSReaderImpl to TestChecksum.
  • HBASE-11927 - Use Native Hadoop Library for HFile checksum.
  • HBASE-14155 - StackOverflowError in reverse scan
  • HBASE-14359 - HTable#close will hang forever if unchecked error/exception thrown in AsyncProcess#sendMultiAction
  • HBASE-14730 - region server needs to log warnings when there are attributes configured for cells with hfile v2
  • HBASE-14759 - Avoid using Math.abs when selecting SyncRunner in FSHLog
  • HBASE-15234 - Don't abort ReplicationLogCleaner on ZooKeeper errors
  • HBASE-15456 - CreateTableProcedure/ModifyTableProcedure needs to fail when there is no family in table descriptor
  • HBASE-15479 - No more garbage or beware of autoboxing
  • HBASE-15582 - SnapshotManifestV1 too verbose when there are no regions
  • HBASE-15707 - ImportTSV bulk output does not support tags with hfile.format.version=3
  • HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
  • HBASE-15811 - Batch Get after batch Put does not fetch all Cells We were not waiting on all executors in a batch to complete. The test for no-more-executors was damaged by the 0.99/0.98.4 fix "HBASE-11403 Fix race conditions around Object#notify"
  • HBASE-15925 - provide default values for hadoop compat module related properties that match default hadoop profile.
  • HBASE-16207 - can't restore snapshot without "Admin" permission
  • HIVE-9499 - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
  • HIVE-10048 - JDBC - Support SSL encryption regardless of Authentication mechanism
  • HIVE-10303 - HIVE-9471 broke forward compatibility of ORC files
  • HIVE-10685 - Alter table concatenate oparetor will cause duplicate data
  • HIVE-10925 - Non-static threadlocals in metastore code can potentially cause memory leak
  • HIVE-11031 - ORC concatenation of old files can fail while merging column statistics
  • HIVE-11054 - Handle varchar/char partition columns in vectorization
  • HIVE-11243 - Changing log level in Utilities.getBaseWork
  • HIVE-11408 - HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used due to constructor caching in Hadoop ReflectionUtils
  • HIVE-11427 - Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079.
  • HIVE-11488 - Combine the following jiras for "Support sessionId and queryId logging"Add sessionId and queryId info to HS2 log (Aihua Xu, reviewed by Szehon Ho) HIVE-12456: QueryId can't be stored in the configuration of the SessionState since multiple queries can run in a single session
  • HIVE-11583 - When PTF is used over a large partitions result could be corrupted
  • HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
  • HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
  • HIVE-11919 - Hive Union Type Mismatch
  • HIVE-12354 - MapJoin with double keys is slow on MR
  • HIVE-12431 - Support timeout for compile lock
  • HIVE-12481 - Occasionally "Request is a replay" will be thrown from HS2
  • HIVE-12635 - Hive should return the latest hbase cell timestamp as the row timestamp value
  • HIVE-12958 - Make embedded Jetty server more configurable
  • HIVE-13200 - Aggregation functions returning empty rows on partitioned columns
  • HIVE-13251 - hive can't read the decimal in AVRO file generated from previous version
  • HIVE-13285 - Orc concatenation may drop old files from moving to final path
  • HIVE-13286 - Query ID is being reused across queries
  • HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
  • HIVE-13527 - Using deprecated APIs in HBase client causes zookeeper connection leaks
  • HIVE-13570 - Some queries with Union all fail when CBO is off
  • HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
  • HIVE-13953 - Issues in HiveLockObject equals method
  • HIVE-13991 - Union All on view fail with no valid permission on underneath table
  • HIVE-14118 - Make the alter partition exception more meaningful
  • HUE-3185 - [oozie] Avoid extra API calls for parent information in workflow dashboard
  • HUE-3185 - Revert "[oozie] Avoid extra API calls for parent information in workflow dashboard"
  • HUE-3185 - [oozie] Avoid extra API calls for parent information in workflow dashboard
  • HUE-3437 - [core] PamBackend does not honor ignore_username_case
  • IMPALA-2378 - check proc mem limit before preparing fragment
  • IMPALA-2612 - Free local allocations once for every row batch when building hash tables.
  • IMPALA-2711 - Fix memory leak in Rand().
  • IMPALA-2722 - Free local allocations per row batch in non-partitioned AGG and HJ
  • OOZIE-2429 - TestEventGeneration test is unreliable
  • OOZIE-2466 - Repeated failure of TestMetricsInstrumentation.testSamplers
  • OOZIE-2486 - TestSLAEventsGetForFilterJPAExecutor is unreliable
  • SENTRY-780 - HDFS Plugin should not execute path callbacks for views
  • SENTRY-1184 - Clean up HMSPaths.renameAuthzObject
  • SENTRY-1292 - Reorder DBModelAction EnumSet
  • SENTRY-1293 - Avoid converting string permission to Privilege object
  • SOLR-6631 - DistributedQueue spinning on calling zookeeper getChildren()
  • SOLR-6820 - Make the number of version buckets used by the UpdateLog configurable as increasing beyond the default 256 has been shown to help with high volume indexing performance in SolrCloudIncrease the default number of buckets to 65536 instead of 256fix numVersionBuckets name attribute in configsets
  • SOLR-7332 - Initialize the highest value for all version buckets with the max value from the index or recent updates to avoid unnecessary lookups to the index to check for reordered updates when processing new documents.
  • SOLR-7587 - TestSpellCheckResponse stalled and never timed out
  • SOLR-7625 - Version bucket seed not updated after new index is installed on a replica
  • SOLR-8152 - Overseer Task Processor/Queue can miss responses, leading to timeouts
  • SOLR-8451 - Fix backport
  • SOLR-8451 - We should not call method.abort in HttpSolrClient or HttpSolrCall#remoteQuery and HttpSolrCall#remoteQuery should not close streams.
  • SOLR-8453 - Solr should attempt to consume the request inputstream on errors as we cannot count on the container to do it.
  • SOLR-8578 - Successful or not, requests are not always fully consumed by Solrj clients and we count on HttpClient or the JVM.
  • SOLR-8633 - DistributedUpdateProcess processCommit/deleteByQuery calls finish on DUP and SolrCmdDistributor, which violates the lifecycle and can cause bugs.
  • SOLR-8683 - Tune down stream closed logging
  • SOLR-8683 - Always consume the full request on the server, not just in the case of an error.
  • SOLR-8855 - The HDFS BlockDirectory should not clean up its cache on shutdown.
  • SOLR-8856 - Do not cache merge or 'read once' contexts in the hdfs block cache.
  • SOLR-8857 - HdfsUpdateLog does not use configured or new default number of version buckets and is hard coded to 256.
  • SOLR-8869 - Optionally disable printing field cache entries in SolrFieldCacheMBean
  • SPARK-12087 - Create new JobConf for every batch in saveAsHadoopFiles

Selected tab: WhatsNew

Want to Get Involved or Learn More?

Check out our other resources

Cloudera Community

Collaborate with your peers, industry experts, and Clouderans to make the most of your investment in Hadoop.

Cloudera University

Receive expert Hadoop training through Cloudera University, the industry's only truly dynamic Hadoop training curriculum that’s updated regularly to reflect the state of the art in big data.