Download CDH 5.8.5

Long term component architecture

As the main curator of open standards in Hadoop, Cloudera has a track record of bringing new open source solutions into its platform (such as Apache Spark, Apache HBase, and Apache Parquet) that are eventually adopted by the community at large. Because these components are standards, you can build long-term architecture on them with confidence.



With the exception of DSSD support, Cloudera Enterprise 5.6.0 is identical to CDH 5.5.2/Cloudera Manager 5.5.3. If you do not need DSSD support and are already running the latest 5.5.x release, you do not need to upgrade.


Note: All CDH and Cloudera Manager hosts that make up a logical cluster need to run on the same major OS release to be covered by Cloudera Support.

CDH 5 provides 64-bit packages for RHEL-compatible, SLES, Ubuntu, and Debian systems as listed below.


Operating System | Version | Packages
Red Hat Enterprise Linux (RHEL)-compatible
RHEL (+ SELinux mode in available versions) | 5.7; 5.10; 6.4; 6.5; 6.6; 6.7; 7.1; 7.2 | 64-bit
CentOS (+ SELinux mode in available versions) | 5.7; 5.10; 6.4; 6.5; 6.6; 6.7; 7.1; 7.2 | 64-bit
Oracle Enterprise Linux (OEL) with Unbreakable Enterprise Kernel (UEK) | 5.7 (UEK R2); 5.10; 5.11; 6.4 (UEK R2); 6.5 (UEK R2, UEK R3); 6.6 (UEK R3); 6.7 (UEK R3); 7.1; 7.2 | 64-bit
SUSE Linux Enterprise Server (SLES) | 11 with Service Pack 2; 11 with Service Pack 3; 11 with Service Pack 4 | 64-bit
Ubuntu | Precise 12.04 - Long-Term Support (LTS); Trusty 14.04 - Long-Term Support (LTS) | 64-bit
Debian | Wheezy 7.0, 7.1, and 7.8 | 64-bit


Important: Cloudera supports RHEL 7 with the following limitations:



  • Cloudera Enterprise is supported on platforms with Security-Enhanced Linux (SELinux) enabled. Cloudera is not responsible for policy support nor policy enforcement. If you experience issues with SELinux, contact your OS provider.
  • CDH 5.8 DataNode hosts with EMC® DSSD™ D5™ are supported on RHEL 6.6, 7.1, and 7.2.
Component | MariaDB | MySQL | SQLite | PostgreSQL | Oracle | Derby (see Note 5)
Oozie | 5.5 | 5.1, 5.5, 5.6, 5.7 | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Flume | - | - | - | - | - | Default (for the JDBC Channel only)
Hue | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 6) | Default | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Hive/Impala | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 1) | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | Default
Sentry | 5.5 | 5.1, 5.5, 5.6, 5.7 (see Note 1) | - | 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4 (see Note 3) | 11gR2, 12c | -
Sqoop 1 | 5.5 | See Note 4 | - | See Note 4 | See Note 4 | -
Sqoop 2 | 5.5 | - | - | - | - | Default




  1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and higher. The InnoDB storage engine must be enabled in the MySQL server.
  2. Cloudera Manager installation fails if GTID-based replication is enabled in MySQL.
  3. PostgreSQL 9.2 is supported on CDH 5.1 and higher. PostgreSQL 9.3 is supported on CDH 5.2 and higher. PostgreSQL 9.4 is supported on CDH 5.5 and higher.
  4. For purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).
  5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.
  6. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed, which is usually MySQL 5.1, 5.5, or 5.6.
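
The PostgreSQL rule in Note 3 can be written as a small lookup, which may help when checking a planned deployment by script. The following is only a sketch: the names POSTGRES_MIN_CDH, parse_version, and postgres_supported are illustrative and not part of any Cloudera tool, and the table covers only the three PostgreSQL versions that Note 3 names.

```python
# Minimum CDH 5 release for each PostgreSQL version, transcribed from Note 3.
# Versions the note does not mention (8.1, 8.3, 8.4, 9.1) are deliberately absent.
POSTGRES_MIN_CDH = {
    "9.2": (5, 1),
    "9.3": (5, 2),
    "9.4": (5, 5),
}


def parse_version(text):
    """Turn '5.8.5' into (5, 8, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def postgres_supported(postgres_version, cdh_version):
    """Return True or False per Note 3, or None if the note does not cover the version."""
    minimum = POSTGRES_MIN_CDH.get(postgres_version)
    if minimum is None:
        return None
    return parse_version(cdh_version)[:2] >= minimum
```

For example, postgres_supported("9.4", "5.8.5") evaluates to True, while postgres_supported("9.4", "5.4.0") evaluates to False. Comparing tuples of integers rather than strings keeps a release such as 5.10 ordered after 5.5.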

CDH and Cloudera Manager Supported JDK Versions

Only 64-bit JDKs from Oracle are supported. Oracle JDK 7 is supported across all versions of Cloudera Manager 5 and CDH 5. Oracle JDK 8 is supported in C5.3.x and higher.


A supported minor JDK release will remain supported throughout a Cloudera major release lifecycle, from the time of its addition forward, unless specifically excluded.


Warning: JDK 1.8u40 and JDK 1.8u60 are excluded from support. Also, the Oozie Web Console returns a 500 error when the Oozie server runs on JDK 8u75 or higher.


Running CDH nodes within the same cluster on different JDK releases is not supported. The JDK release must match across the cluster, down to the patch level.

  • All nodes in your cluster must run the same Oracle JDK version.
  • All services must be deployed on the same Oracle JDK version.
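
The JDK rules above (one version everywhere, with 1.8u40 and 1.8u60 excluded) lend themselves to an automated check. The sketch below assumes you have already collected a version string per host in the form printed by java -version (for example 1.7.0_67); the host names and the function names parse_jdk and check_cluster_jdks are hypothetical.

```python
import re

# Excluded releases named in the warning above, as (major, update) pairs.
EXCLUDED_JDKS = {(8, 40), (8, 60)}


def parse_jdk(version):
    """Parse strings like '1.7.0_67' or '1.8.0_60' into (major, update)."""
    match = re.fullmatch(r"1\.(\d+)\.0_(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized JDK version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_cluster_jdks(host_versions):
    """Return a list of problems for a {hostname: jdk_version} mapping."""
    problems = []
    distinct = set(host_versions.values())
    if len(distinct) > 1:
        # Any difference, even at the patch level, counts as a mismatch.
        problems.append(f"mixed JDK versions: {sorted(distinct)}")
    for host, version in sorted(host_versions.items()):
        if parse_jdk(version) in EXCLUDED_JDKS:
            problems.append(f"{host}: JDK {version} is excluded from support")
    return problems
```

An empty result means every host reports the same, non-excluded version. The check does not verify that the JDK is a 64-bit Oracle build; that would need additional information from each host.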


The Cloudera Manager repository is packaged with an Oracle JDK (for example, 1.7.0_67), which can be installed automatically during a new installation or an upgrade.


For a full list of supported JDK versions, see CDH and Cloudera Manager Supported JDK Versions.



Hue works with the two most recent versions of the following browsers. Cookies and JavaScript must be on.

  • Chrome
  • Firefox
  • Safari (not supported on Windows)
  • Internet Explorer

Hue might display in older versions of these browsers, and even in other browsers, but you might not have access to all of its features.


CDH requires IPv4. IPv6 is not supported.

See also Configuring Network Names.
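
Because only IPv4 is supported, it can be useful to verify that the addresses you plan to assign to cluster hosts are IPv4 literals. A minimal sketch using Python's standard ipaddress module follows; the function name is_ipv4 is illustrative, and the check covers address literals only, not what a hostname resolves to.

```python
import ipaddress


def is_ipv4(address):
    """Return True only for a literal IPv4 address such as '10.0.0.12'."""
    try:
        return ipaddress.ip_address(address).version == 4
    except ValueError:
        # Not an IP literal at all (for example, a hostname).
        return False
```

With this helper, is_ipv4("10.0.0.12") is True, while is_ipv4("::1") and is_ipv4("host.example.com") are both False.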


Multihoming CDH or Cloudera Manager is not supported outside specifically certified Cloudera partner appliances. Cloudera finds that current Hadoop architectures combined with modern network infrastructures and security practices remove the need for multihoming. Multihoming, however, is beneficial internally in appliance form factors to take advantage of high-bandwidth InfiniBand interconnects.


Although some subareas of the product may work with unsupported custom multihoming configurations, there are known issues with multihoming. In addition, unknown issues may arise because multihoming is not covered by our test matrix outside the Cloudera-certified partner appliances.


The following components are supported by the indicated versions of Transport Layer Security (TLS):


Table 1. Components Supported by TLS


Component | Role | Name | Port | Version
Flume | - | Avro Source/Sink | - | TLS 1.2
Flume | - | Flume HTTP Source/Sink | - | TLS 1.2
HBase | Master | HBase Master Web UI Port | 60010 | TLS 1.2
HDFS | NameNode | Secure NameNode Web UI Port | 50470 | TLS 1.2
HDFS | Secondary NameNode | Secure Secondary NameNode Web UI Port | 50495 | TLS 1.2
HDFS | HttpFS | REST Port | 14000 | TLS 1.1, TLS 1.2
Hive | HiveServer2 | HiveServer2 Port | 10000 | TLS 1.2
Hue | Hue Server | Hue HTTP Port | 8888 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon Beeswax Port | 21000 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon HiveServer2 Port | 21050 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon Backend Port | 22000 | TLS 1.2
Cloudera Impala | Impala Daemon | Impala Daemon HTTP Server Port | 25000 | TLS 1.2
Cloudera Impala | Impala StateStore | StateStore Service Port | 24000 | TLS 1.2
Cloudera Impala | Impala StateStore | StateStore HTTP Server Port | 25010 | TLS 1.2
Cloudera Impala | Impala Catalog Server | Catalog Server HTTP Server Port | 25020 | TLS 1.2
Cloudera Impala | Impala Catalog Server | Catalog Server Service Port | 26000 | TLS 1.2
Oozie | Oozie Server | Oozie HTTPS Port | 11443 | TLS 1.1, TLS 1.2
Solr | Solr Server | Solr HTTP Port | 8983 | TLS 1.1, TLS 1.2
Solr | Solr Server | Solr HTTPS Port | 8985 | TLS 1.1, TLS 1.2
YARN | ResourceManager | ResourceManager Web Application HTTP Port | 8090 | TLS 1.2
YARN | JobHistory Server | MRv1 JobHistory Web Application HTTP Port | 19890 | TLS 1.2
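
One way to confirm which TLS version a listed port actually negotiates is to connect with a client that refuses anything older than TLS 1.2. The sketch below uses Python's standard ssl module (Python 3.7 or later). The function names are illustrative, the host and port are supplied by you, and the default context verifies certificates, so a cluster that uses self-signed certificates would need its CA loaded into the context first.

```python
import socket
import ssl


def tls12_context():
    """Build a client context that refuses anything below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host, port, timeout=5.0):
    """Connect to host:port and return the negotiated protocol name."""
    context = tls12_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling negotiated_tls_version with a Hue server's host name and port 8888, for instance, returns the protocol string reported by the ssl module if the handshake succeeds, and raises an ssl.SSLError if the server cannot offer TLS 1.2 or higher.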


Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.8.5:

  • CRUNCH-592 - Job fails for null ByteBuffer value in Avro tables.
  • FLUME-1899 - Make SpoolDir work with subdirectories
  • FLUME-2171 - Add Interceptor to remove headers from event
  • FLUME-2652 - Documented transaction handling semantics incorrect in developer guide.
  • FLUME-2797 - Use SourceCounter for SyslogTcpSource
  • FLUME-2798 - Malformed Syslog messages can lead to OutOfMemoryException
  • FLUME-2812 - Fix semaphore leak causing java.lang.Error: Maximum permit count exceeded in MemoryChannel
  • FLUME-2844 - SpillableMemoryChannel must start ChannelCounter
  • FLUME-2889 - Fixes to DateTime computations
  • FLUME-2901 - Document Kerberos setup for Kafka channel
  • FLUME-2910 - AsyncHBaseSink: Failure callbacks should log the exception that caused them
  • FLUME-2913 - Don't strip SLF4J from imported classpaths
  • FLUME-2918 - Speed up TaildirSource on directories with many files
  • FLUME-2922 - Sync SequenceFile.Writer before calling hflush
  • FLUME-2923 - Bump asynchbase version to 1.7.0
  • FLUME-2934 - Document new cachePatternMatching option for TaildirSource
  • FLUME-2935 - Bump java target version to 1.7
  • FLUME-2948 - docs: Fix parameters on Replicating Channel Selector example
  • FLUME-2954 - Make raw data appearing in log messages explicit
  • FLUME-2963 - FlumeUserGuide: Fix error in Kafka Source properties table
  • FLUME-2972 - Handle offset migration in the new Kafka Channel
  • FLUME-2975 - docs: Fix NetcatSource example
  • FLUME-2982 - Add localhost escape sequence to HDFS sink
  • FLUME-2983 - Handle offset migration in the new Kafka Source
  • FLUME-2999 - Kafka channel and sink should enable statically assigned partition per event via header
  • FLUME-3020 - Improve HDFS Sink escape sequence substitution
  • FLUME-3027 - Change Kafka Channel to clear offsets map after commit
  • FLUME-3031 - Change sequence source to reset its counter for event body on channel exception
  • FLUME-3049 - Make HDFS sink rotate more reliably in secure mode
  • HADOOP-7930 - Kerberos relogin interval in UserGroupInformation should be configurable
  • HADOOP-8436 - NPE In getLocalPathForWrite ( path, conf ) when the required context item is not configured
  • HADOOP-8437 - getLocalPathForWrite should throw IOException for invalid paths
  • HADOOP-8934 - Shell command ls should include sort options
  • HADOOP-10048 - LocalDirAllocator should avoid holding locks while accessing the filesystem
  • HADOOP-10300 - Allowed deferred sending of call responses.
  • HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
  • HADOOP-11031 - Design Document for Credential Provider API
  • HADOOP-11361 - Fix a race condition in MetricsSourceAdapter.updateJmxCache
  • HADOOP-11400 - GraphiteSink does not reconnect to Graphite after 'broken pipe'
  • HADOOP-11469 - KMS should skip default.key.acl and whitelist.key.acl when loading key acl.
  • HADOOP-11599 - Client#getTimeout should use IPC_CLIENT_PING_DEFAULT when IPC_CLIENT_PING_KEY is not configured
  • HADOOP-11619 - FTPFileSystem should override getDefaultPort.
  • HADOOP-11901 - BytesWritable fails to support 2G chunks due to integer overflow
  • HADOOP-12252 - LocalDirAllocator should not throw NPE with empty string configuration
  • HADOOP-12453 - Support decoding KMS Delegation Token with its own Identifier
  • HADOOP-12483 - Maintain wrapped SASL ordering for postponed IPC responses.
  • HADOOP-12537 - S3A to support Amazon STS temporary credentials
  • HADOOP-12548 - Read s3a creds from a Credential Provider
  • HADOOP-12609 - Fix intermittent failure of TestDecayRpcScheduler.
  • HADOOP-12655 - TestHttpServer.testBindAddress bind port range is wider than expected.
  • HADOOP-12659 - Incorrect usage of config parameters in token manager of KMS
  • HADOOP-12672 - RPC timeout should not override IPC ping interval
  • HADOOP-12723 - S3A: Add ability to plug in any AWSCredentialsProvider
  • HADOOP-12963 - Allow using path style addressing for accessing the s3 endpoint.
  • HADOOP-12973 - Make DU pluggable.
  • HADOOP-12974 - Create a CachingGetSpaceUsed implementation that uses df
  • HADOOP-12975 - Add jitter to CachingGetSpaceUsed's thread
  • HADOOP-13034 - Log message about input options in distcp lacks some items
  • HADOOP-13072 - WindowsGetSpaceUsed constructor should be public
  • HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
  • HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
  • HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
  • HADOOP-13189 - FairCallQueue makes callQueue larger than the configured capacity
  • HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
  • HADOOP-13255 - KMSClientProvider should check and renew tgt when doing delegation token operations
  • HADOOP-13263 - Reload cached groups in background after expiry.
  • HADOOP-13270 - BZip2CompressionInputStream finds the same compression marker twice in corner case, causing duplicate data blocks
  • HADOOP-13317 - Add logs to KMS server-side to improve supportability
  • HADOOP-13353 - LdapGroupsMapping getPassward shouldn't return null when IOException throws
  • HADOOP-13381 - KMS clients should use KMS Delegation Tokens from current UGI
  • HADOOP-13433 - Race in UGI.reloginFromKeytab
  • HADOOP-13434 - Add bash quoting to Shell class.
  • HADOOP-13437 - KMS should reload whitelist and default key ACLs when hot-reloading
  • HADOOP-13457 - Remove hardcoded absolute path for shell executable.
  • HADOOP-13487 - Hadoop KMS should load old delegation tokens from Zookeeper on startup
  • HADOOP-13503 - Improve SaslRpcClient failure logging
  • HADOOP-13526 - Add detailed logging in KMS for the authentication failure of proxy user
  • HADOOP-13558 - UserGroupInformation created from a Subject incorrectly tries to renew the Kerberos ticket
  • HADOOP-13579 - Fix source-level compatibility after HADOOP-11252
  • HADOOP-13590 - Retry until TGT expires even if the UGI renewal thread encountered exception.
  • HADOOP-13627 - Have an explicit KerberosAuthException for UGI to throw, text from public constants
  • HADOOP-13638 - KMS should set UGI's Configuration object properly
  • HADOOP-13641 - Update UGI#spawnAutoRenewalThreadForUserCreds to reduce indentation
  • HADOOP-13669 - Addendum patch 2 for KMS Server should log exceptions before throwing.
  • HADOOP-13693 - Remove the message about HTTP OPTIONS in SPNEGO initialization message from kms audit log.
  • HADOOP-13749 - KMSClientProvider combined with KeyProviderCache can result in wrong UGI being used
  • HADOOP-13805 - UGI.getCurrentUser() fails if user does not have a keytab associated
  • HADOOP-13838 - KMSTokenRenewer should close providers
  • HADOOP-13953 - Make FTPFileSystem's data connection mode and transfer mode configurable
  • HADOOP-14003 - Make additional KMS tomcat settings configurable
  • HADOOP-14104 - Client should always ask namenode for kms provider path
  • HADOOP-14195 - CredentialProviderFactory$getProviders is not thread-safe
  • HDFS-4176 - EditLogTailer should call rollEdits with a timeout.
  • HDFS-4210 - Throw helpful exception when DNS entry for JournalNode cannot be resolved
  • HDFS-6434 - Default permission for creating file should be 644 for WebHdfs/HttpFS
  • HDFS-6962 - ACLs inheritance conflict with umaskmode
  • HDFS-7413 - Some unit tests should use NameNodeProtocols instead of FSNameSystem
  • HDFS-7415 - Move FSNameSystem.resolvePath() to FSDirectory
  • HDFS-7420 - Delegate permission checks to FSDirectory
  • HDFS-7463 - Simplify FSNamesystem#getBlockLocationsUpdateTimes
  • HDFS-7478 - Move org.apache.hadoop.hdfs.server.namenode.NNConf to FSNamesystem
  • HDFS-7517 - Remove redundant non-null checks in FSNamesystem#getBlockLocations
  • HDFS-7597 - DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping
  • HDFS-7964 - Add support for async edit logging
  • HDFS-8224 - Schedule a block for scanning if its metadata file is corrupt
  • HDFS-8269 - getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
  • HDFS-8581 - ContentSummary on / skips further counts on yielding lock
  • HDFS-8709 - Clarify automatic sync in FSEditLog#logEdit.
  • HDFS-8829 - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
  • HDFS-8897 - Balancer should handle fs.defaultFS trailing slash in HA
  • HDFS-9038 - DFS reserved space is erroneously counted towards non-DFS used.
  • HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
  • HDFS-9137 - DeadLock between DataNode#refreshVolumes and BPOfferService#registrationSucceeded.
  • HDFS-9141 - Thread leak in Datanode#refreshVolumes.
  • HDFS-9259 - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.
  • HDFS-9276 - Failed to Update HDFS Delegation Token for long running application in HA mode
  • HDFS-9365 - Balancer does not work with the HDFS-6376 HA setup.
  • HDFS-9461 - DiskBalancer: Add Report Command
  • HDFS-9530 - ReservedSpace is not cleared for abandoned Blocks
  • HDFS-9601 - NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block.
  • HDFS-9630 - DistCp minor refactoring and clean up
  • HDFS-9638 - Improve DistCp Help and documentation.
  • HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
  • HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
  • HDFS-9781 - FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
  • HDFS-9805 - Add server-side configuration for enabling TCP_NODELAY for DataTransferProtocol and default it to true
  • HDFS-9820 - Improve distcp to support efficient restore to an earlier snapshot
  • HDFS-9906 - Remove spammy log spew when a datanode is restarted.
  • HDFS-9939 - Increase DecompressorStream skip buffer size
  • HDFS-9958 - BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages
  • HDFS-10216 - Distcp -diff throws exception when handling relative path
  • HDFS-10270 - TestJMXGet:testNameNode() fails
  • HDFS-10298 - Document the usage of distcp -diff option
  • HDFS-10312 - Large block reports may fail to decode at NameNode due to 64 MB protobuf maximum length restriction
  • HDFS-10313 - Distcp need to enforce the order of snapshot names passed to -diff.
  • HDFS-10336 - TestBalancer failing intermittently because of not resetting UserGroupInformation completely
  • HDFS-10381 - DataStreamer DataNode exclusion log message should be warning.
  • HDFS-10397 - Distcp should ignore -delete option if -diff option is provided instead of exiting
  • HDFS-10403 - DiskBalancer: Add cancel command
  • HDFS-10457 - DataNode should not auto-format block pool directory if VERSION is missing.
  • HDFS-10481 - HTTPFS server should correctly impersonate as end user to open file
  • HDFS-10500 - Diskbalancer: Print out information when a plan is not generated
  • HDFS-10501 - DiskBalancer: Use the default datanode port if port is not provided
  • HDFS-10512 - VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
  • HDFS-10516 - Fix bug when warming up EDEK cache of more than one encryption zone
  • HDFS-10517 - DiskBalancer: Support help command
  • HDFS-10525 - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
  • HDFS-10541 - Diskbalancer: When no actions in plan, error message says "Plan was generated more than 24 hours ago"
  • HDFS-10544 - Balancer doesn't work with IPFailoverProxyProvider.
  • HDFS-10552 - DiskBalancer "-query" results in NPE if no plan for the node
  • HDFS-10556 - DistCpOptions should be validated automatically
  • HDFS-10559 - DiskBalancer: Use SHA1 for Plan ID
  • HDFS-10567 - Improve plan command help message
  • HDFS-10588 - False alarm in datanode log - ERROR - Disk Balancer is not enabled
  • HDFS-10598 - DiskBalancer does not execute multi-steps plan
  • HDFS-10600 - PlanCommand#getThrsholdPercentage should not use throughput value.
  • HDFS-10609 - Uncaught InvalidEncryptionKeyException during pipeline recovery may abort downstream applications
  • HDFS-10641 - TestBlockManager#testBlockReportQueueing fails intermittently.
  • HDFS-10643 - Namenode should use loginUser(hdfs) to generateEncryptedKey
  • HDFS-10681 - DiskBalancer: query command should report Plan file path apart from PlanID.
  • HDFS-10715 - NPE when applying AvailableSpaceBlockPlacementPolicy
  • HDFS-10722 - Fix race condition in TestEditLog#testBatchedSyncWithClosedLogs
  • HDFS-10760 - DataXceiver#run() should not log InvalidToken exception as an error
  • HDFS-10763 - Open files can leak permanently due to inconsistent lease update
  • HDFS-10822 - Log DataNodes in the write pipeline.
  • HDFS-10879 - TestEncryptionZonesWithKMS#testReadWrite fails intermittently
  • HDFS-10963 - Reduce log level when network topology cannot find enough datanodes
  • HDFS-11012 - Unnecessary INFO logging on DFSClients for InvalidToken
  • HDFS-11040 - Add documentation for HDFS-9820 distcp improvement
  • HDFS-11056 - Concurrent append and read operations lead to checksum error
  • HDFS-11160 - VolumeScanner reports write-in-progress replicas as corrupt incorrectly
  • HDFS-11229 - HDFS-11056 failed to close meta file
  • HDFS-11275 - Check groupEntryIndex and throw a helpful exception on failures when removing ACL.
  • HDFS-11292 - log lastWrittenTxId etc info in logSyncAll
  • HDFS-11306 - Print remaining edit logs from buffer if edit log can't be rolled
  • HDFS-11363 - Need more diagnosis info when seeing Slow waitForAckedSeqno.
  • HDFS-11379 - DFSInputStream may infinite loop requesting block locations
  • HDFS-11689 - New exception thrown by DFSClient#isHDFSEncryptionEnabled broke hacky hive code
  • MAPREDUCE-4784 - TestRecovery occasionally fails
  • MAPREDUCE-6172 - TestDbClasses timeouts are too aggressive
  • MAPREDUCE-6359 - In RM HA setup, Cluster tab links populated with AM hostname instead of RM
  • MAPREDUCE-6442 - Stack trace is missing when error occurs in client protocol provider's constructor
  • MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization
  • MAPREDUCE-6571 - JobEndNotification info logs are missing in AM container syslog
  • MAPREDUCE-6628 - Potential memory leak in CryptoOutputStream
  • MAPREDUCE-6633 - AM should retry map attempts if the reduce task encounters compression related errors
  • MAPREDUCE-6641 - TestTaskAttempt fails in trunk
  • MAPREDUCE-6670 - TestJobListCache#testEviction sometimes fails on Windows with timeout
  • MAPREDUCE-6680 - JHS UserLogDir scan algorithm sometimes could skip directory with update in CloudFS (Azure FileSystem, S3, etc.)
  • MAPREDUCE-6718 - add progress log to JHS during startup
  • MAPREDUCE-6728 - Give fetchers hint when ShuffleHandler rejects a shuffling connection
  • MAPREDUCE-6738 - TestJobListCache.testAddExisting failed intermittently in slow VM testbed
  • MAPREDUCE-6761 - Regression when handling providers - invalid configuration ServiceConfiguration causes Cluster initialization failure
  • MAPREDUCE-6763 - Shuffle server listen queue is too small
  • MAPREDUCE-6771 - RMContainerAllocator sends container diagnostics event after corresponding completion event
  • MAPREDUCE-6798 - Fix intermittent failure of TestJobHistoryParsing.testJobHistoryMethods
  • MAPREDUCE-6817 - The format of job start time in JHS is different from those of submit and finish time.
  • MAPREDUCE-6839 - TestRecovery.testCrashed failed
  • YARN-2306 - Add test for leakage of reservation metrics in fair scheduler.
  • YARN-2336 - Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree
  • YARN-2605 - [RM HA] Rest api endpoints doing redirect incorrectly.
  • YARN-2977 - Fixed intermittent TestNMClient failure.
  • YARN-3251 - Fixed a deadlock in CapacityScheduler when computing absoluteMaxAvailableCapacity in LeafQueue
  • YARN-3601 - Fix UT TestRMFailover.testRMWebAppRedirect
  • YARN-3654 - ContainerLogsPage web UI should not have meta-refresh
  • YARN-3722 - Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils
  • YARN-3933 - FairScheduler: Multiple calls to completedContainer are not safe.
  • YARN-3957 - FairScheduler NPE In FairSchedulerQueueInfo causing scheduler page to return 500.
  • YARN-4004 - container-executor should print output of docker logs if the docker container exits with non-0 exit status
  • YARN-4017 - container-executor overuses PATH_MAX
  • YARN-4092 - Addendum. Fixed UI redirection to print useful messages when both RMs are in standby mode
  • YARN-4245 - Generalize config file handling in container-executor
  • YARN-4255 - container-executor does not clean up docker operation command files
  • YARN-4363 - In TestFairScheduler, testcase should not create FairScheduler redundantly.
  • YARN-4411 - RMAppAttemptImpl#createApplicationAttemptReport throws IllegalArgumentException
  • YARN-4459 - container-executor should only kill process groups
  • YARN-4544 - All the log messages about rolling monitoring interval are shown with WARN level
  • YARN-4555 - TestDefaultContainerExecutor#testContainerLaunchError fails on non-english locale environment
  • YARN-4556 - TestFifoScheduler.testResourceOverCommit fails
  • YARN-4820 - ResourceManager web redirects in HA mode drops query parameters
  • YARN-4866 - FairScheduler: AMs can consume all vcores leading to a livelock when using FAIR policy.
  • YARN-4878 - Expose scheduling policy and max running apps over JMX for Yarn queues.
  • YARN-4940 - yarn node -list -all failed if RM start with decommissioned node
  • YARN-4989 - TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails intermittently
  • YARN-5001 - Aggregated Logs root directory is created with wrong group if nonexistent
  • YARN-5048 - DelegationTokenRenewer#skipTokenRenewal may throw NPE
  • YARN-5077 - Fix FSLeafQueue#getFairShare() for queues with zero fairshare.
  • YARN-5107 - TestContainerMetrics fails.
  • YARN-5136 - Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
  • YARN-5246 - NMWebAppFilter web redirects drop query parameters
  • YARN-5272 - Handle queue names consistently in FairScheduler.
  • YARN-5608 - TestAMRMClient.setup() fails with ArrayOutOfBoundsException
  • YARN-5704 - Provide config knobs to control enabling/disabling new/work in progress features in container-executor
  • YARN-5752 - TestLocalResourcesTrackerImpl#testLocalResourceCache times out
  • YARN-5837 - NPE when getting node status of a decommissioned node after an RM restart
  • YARN-5859 - TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails
  • YARN-5862 - TestDiskFailures.testLocalDirsFailures failed
  • YARN-5890 - FairScheduler should log information about AM-resource-usage and max-AM-share for queues
  • YARN-5920 - Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang
  • YARN-6042 - Dump scheduler and queue state information into FairScheduler DEBUG log.
  • YARN-6151 - FS preemption does not consider child queues over fairshare if the parent is under.
  • YARN-6175 - FairScheduler: Negative vcore for resource needed to preempt.
  • YARN-6264 - AM not launched when a single vcore is available on the cluster.
  • YARN-6359 - TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition
  • YARN-6360 - Prevent FS state dump logger from cramming other log files
  • YARN-6453 - fairscheduler-statedump.log gets generated regardless of service
  • HBASE-12949 - Scanner can be stuck in infinite loop if the HFile is corrupted
  • HBASE-14644 - Region in transition metric is broken
  • HBASE-14818 - user_permission does not list namespace permissions
  • HBASE-14963 - Remove use of Guava Stopwatch from HBase client code
  • HBASE-15125 - HBaseFsck's adoptHdfsOrphan function creates region with wrong end key boundary
  • HBASE-15324 - Jitter may cause desiredMaxFileSize overflow in ConstantSizeRegionSplitPolicy and trigger unexpected split
  • HBASE-15328 - sanity check the redirect used to send master info requests to the embedded regionserver.
  • HBASE-15378 - Scanner cannot handle heartbeat message with no results
  • HBASE-15430 - Failed taking snapshot - Manifest proto-message too large
  • HBASE-15465 - userPermission returned by getUserPermission() for the selected namespace does not have namespace set
  • HBASE-15496 - Throw RowTooBigException only for user scan/get
  • HBASE-15587 - FSTableDescriptors.getDescriptor() logs stack trace erroneously
  • HBASE-15613 - TestNamespaceCommand times out
  • HBASE-15621 - Suppress HBase SnapshotHFile cleaner error messages when a snapshot is going on
  • HBASE-15683 - Min latency in latency histograms are emitted as Long.MAX_VALUE
  • HBASE-15698 - Increment TimeRange not serialized to server
  • HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
  • HBASE-15808 - Reduce potential bulk load intermediate space usage and waste
  • HBASE-15856 - Don't cache unresolved addresses for connections
  • HBASE-15873 - ACL for snapshot restore / clone is not enforced
  • HBASE-15925 - provide default values for hadoop compat module related properties that match default hadoop profile.
  • HBASE-15931 - Add log for long-running tasks in AsyncProcess
  • HBASE-15955 - Disable action in CatalogJanitor#setEnabled should wait for active cleanup scan to finish
  • HBASE-16032 - Possible memory leak in StoreScanner
  • HBASE-16056 - Procedure v2 - fix master crash for FileNotFound
  • HBASE-16062 - Improper error handling in WAL Reader/Writer creation
  • HBASE-16093 - Fix splits failed before creating daughter regions leave meta inconsistent
  • HBASE-16135 - PeerClusterZnode under rs of removed peer may never be deleted
  • HBASE-16146 - Counters are expensive...
  • HBASE-16172 - Unify the retry logic in ScannerCallableWithReplicas and RpcRetryingCallerWithReadReplicas
  • HBASE-16194 - Should count in MSLAB chunk allocation into heap size change when adding duplicate cells
  • HBASE-16195 - Should not add chunk into chunkQueue if not using chunk pool in HeapMemStoreLAB
  • HBASE-16207 - can't restore snapshot without "Admin" permission
  • HBASE-16227 - [Shell] Column value formatter not working in scans.
  • HBASE-16238 - It's useless to catch SESSIONEXPIRED exception and retry in RecoverableZooKeeper
  • HBASE-16270 - Handle duplicate clearing of snapshot in region replicas
  • HBASE-16284 - Unauthorized client can shutdown the cluster
  • HBASE-16288 - HFile intermediate block level indexes might recurse forever creating multi TB files
  • HBASE-16294 - hbck reporting "No HDFS region dir found" for replicas
  • HBASE-16304 - HRegion#RegionScannerImpl#handleFileNotFoundException may lead to deadlock when trying to obtain write lock on updatesLock
  • HBASE-16317 - revert all ESAPI changes
  • HBASE-16319 - Fix TestCacheOnWrite after HBASE-16288
  • HBASE-16321 - ensure no findbugs-jsr305
  • HBASE-16340 - exclude Xerces implementation jars from coming in transitively.
  • HBASE-16345 - RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
  • HBASE-16350 - Undo server abort from HBASE-14968
  • HBASE-16360 - TableMapReduceUtil addHBaseDependencyJars has the wrong class name for PrefixTreeCodec
  • HBASE-16429 - FSHLog: deadlock if rollWriter called when ring buffer filled with appends
  • HBASE-16460 - Can't rebuild the BucketAllocator's data structures when BucketCache uses FileIOEngine
  • HBASE-16604 - Scanner retries on IOException can cause the scans to miss data
  • HBASE-16662 - Fix open POODLE vulnerabilities
  • HBASE-16699 - Overflows in AverageIntervalRateLimiter's refill() and getWaitInterval()
  • HBASE-16721 - Concurrency issue in WAL unflushed seqId tracking
  • HBASE-16767 - Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions
  • HBASE-16807 - RegionServer will fail to report new active HMaster until HMaster/RegionServer failover.
  • HBASE-16824 - Writer.flush() can be called on already closed streams in WAL roll
  • HBASE-16841 - Data loss in MOB files after cloning a snapshot and deleting that snapshot
  • HBASE-16931 - Setting cell's seqId to zero in compaction flow might cause RS down.
  • HBASE-16960 - RegionServer hang when aborting
  • HBASE-17020 - keylen in midkey() is not computed correctly
  • HBASE-17023 - Region left unassigned due to AM and SSH each thinking others would do the assignment work
  • HBASE-17044 - Fix merge failed before creating merged region leaves meta inconsistent
  • HBASE-17058 - Lower epsilon used for jitter verification from HBASE-15324
  • HBASE-17069 - RegionServer writes invalid META entries for split daughters in some circumstances
  • HBASE-17072 - CPU usage starts to climb up to 90-100% when using G1GC; purge ThreadLocal usage
  • HBASE-17206 - FSHLog may roll a new writer successfully with unflushed entries
  • HBASE-17241 - Avoid compacting already compacted mob files with _del files
  • HBASE-17265 - Region left unassigned in master failover when region failed to open
  • HBASE-17275 - Assign timeout may cause region to be unassigned forever
  • HBASE-17328 - Properly dispose of looped replication peers
  • HBASE-17381 - ReplicationSourceWorkerThread can die due to unhandled exceptions
  • HBASE-17409 - Limit jsonp callback name to prevent xss
  • HBASE-17452 - Failed taking snapshot - region Manifest proto-message too large
  • HBASE-17522 - Handle JVM throwing runtime exceptions when we ask for details on heap usage the same as a correctly returned 'undefined'.
  • HBASE-17558 - ZK dumping jsp should escape HTML.
  • HBASE-17561 - table status page should escape values that may contain arbitrary characters.
  • HBASE-17675 - ReplicationEndpoint should choose new sinks if a SaslException occurs
  • HBASE-17717 - Explicitly use "sasl" ACL scheme for hbase superuser
  • HIVE-6758 - Beeline doesn't work with -e option when started in background
  • HIVE-7443 - Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
  • HIVE-7723 - Explain plan for complex query with lots of partitions is slow due to inefficient collection used to find a matching ReadEntity
  • HIVE-10007 - Support qualified table name in analyze table compute statistics for columns
  • HIVE-10384 - Backport: RetryingMetaStoreClient does not retry wrapped TTransportExceptions
  • HIVE-10728 - deprecate unix_timestamp(void) and make it deterministic
  • HIVE-10965 - direct SQL for stats fails in 0-column case
  • HIVE-11028 - Tez: table self join and join with another table fails with IndexOutOfBoundsException
  • HIVE-11141 - Improve RuleRegExp when the Expression node stack gets huge
  • HIVE-11243 - Changing log level in Utilities.getBaseWork
  • HIVE-11375 - Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
  • HIVE-11428 - Performance: Struct IN() clauses are extremely slow
  • HIVE-11432 - Hive macro gives same result for different arguments
  • HIVE-11487 - Add getNumPartitionsByFilter api in metastore api
  • HIVE-11594 - Analyze Table for column names with embedded spaces
  • HIVE-11671 - Optimize RuleRegExp in DPP codepath
  • HIVE-11717 - nohup mode is not supported for new hive
  • HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
  • HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
  • HIVE-11842 - Improve RuleRegExp by caching some internal data structures
  • HIVE-11849 - NPE in HiveHBaseTableSnapshotInputFormat in query with just count(*)
  • HIVE-11901 - StorageBasedAuthorizationProvider requires write permission on table for SELECT statements
  • HIVE-11980 - Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
  • HIVE-12077 - MSCK Repair table should fix partitions in batches
  • HIVE-12083 - HIVE-10965 introduces thrift error if partNames or colNames are empty
  • HIVE-12179 - Add option to not add spark-assembly.jar to Hive classpath
  • HIVE-12277 - Hive macro results on macro_duplicate.q different after adding ORDER BY
  • HIVE-12349 - NPE in ORC SARG for IS NULL queries on Timestamp and Date columns
  • HIVE-12465 - Hive might produce wrong results when (outer) joins are merged
  • HIVE-12475 - Parquet schema evolution within array<struct<>> doesn't work
  • HIVE-12556 - Ctrl-C in beeline doesn't kill Tez query on HS2
  • HIVE-12619 - Switching the field order within an array of structs causes the query to fail
  • HIVE-12635 - Hive should return the latest hbase cell timestamp as the row timestamp value
  • HIVE-12768 - Thread safety: binary sortable serde decimal deserialization
  • HIVE-12780 - Fix the output of the history command in Beeline
  • HIVE-12785 - View with union type and UDF to the struct is broken
  • HIVE-12834 - Fix to accept the arrow keys in BeeLine CLI
  • HIVE-12891 - Hive fails when is set to a relative location
  • HIVE-12976 - MetaStoreDirectSql doesn't batch IN lists in all cases
  • HIVE-13043 - Reload function has no impact to function registry
  • HIVE-13058 - Add session and operation_log directory deletion messages
  • HIVE-13090 - Hive metastore crashes on NPE with ZooKeeperTokenStore
  • HIVE-13129 - CliService leaks HMS connection
  • HIVE-13149 - Remove some unnecessary HMS connections from HS2
  • HIVE-13198 - Authorization issues with cascading views
  • HIVE-13237 - Select parquet struct field with upper case throws NPE
  • HIVE-13240 - GroupByOperator: Drop the hash aggregates when closing operator
  • HIVE-13372 - Hive Macro overwritten when multiple macros are used in one column
  • HIVE-13381 - Timestamp & date should have higher precedence in the type hierarchy than the string group
  • HIVE-13429 - Tool to remove dangling scratch dir
  • HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
  • HIVE-13539 - HiveHFileOutputFormat searching the wrong directory for HFiles
  • HIVE-13590 - Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
  • HIVE-13620 - Merge llap branch work to master
  • HIVE-13625 - Hive Prepared Statement when executed with escape characters in parameter fails
  • HIVE-13645 - Beeline needs null-guard around hiveVars and hiveConfVars read
  • HIVE-13704 - Don't call DistCp.execute() instead of
  • HIVE-13736 - View's input/output formats are TEXT by default.
  • HIVE-13749 - Memory leak in Hive Metastore
  • HIVE-13864 - Beeline ignores the command that follows a semicolon and comment
  • HIVE-13866 - flatten callstack for directSQL errors
  • HIVE-13884 - Disallow queries in HMS fetching more than a configured number of partitions
  • HIVE-13895 - HoS start-up overhead in yarn-client mode
  • HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
  • HIVE-13936 - Add streaming support for row_number
  • HIVE-13953 - Issues in HiveLockObject equals method
  • HIVE-13991 - Union All on view fails with no valid permission on underneath table
  • HIVE-13997 - Insert overwrite directory doesn't overwrite existing files
  • HIVE-14006 - Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.
  • HIVE-14015 - SMB MapJoin failed for Hive on Spark when kerberized
  • HIVE-14037 - java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path in mapreduce
  • HIVE-14055 - directSql - getting the number of partitions is broken
  • HIVE-14098 - Logging task properties, and environment variables might contain passwords
  • HIVE-14118 - Make the alter partition exception more meaningful
  • HIVE-14137 - Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty tables
  • HIVE-14142 - java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path for Hive on Spark
  • HIVE-14173 - NPE was thrown after enabling directsql in the middle of session
  • HIVE-14187 - JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called
  • HIVE-14198 - Refactor aux jar related code to make them more consistent
  • HIVE-14209 - Add some logging info for session and operation management
  • HIVE-14210 - ExecDriver should call jobclient.close() to trigger cleanup
  • HIVE-14296 - Session count is not decremented when HS2 clients do not shutdown cleanly.
  • HIVE-14342 - Beeline output is garbled when executed from a remote shell
  • HIVE-14383 - SparkClientImpl should pass principal and keytab to spark-submit instead of calling kinit explicitly
  • HIVE-14421 - FS.deleteOnExit holds references to _tmp_space.db files.
  • HIVE-14229 - The jars in hive.aux.jar.paths are not added to session classpath
  • HIVE-14436 - Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine
  • HIVE-14457 - Partitions in encryption zone are still trashed though an exception is returned
  • HIVE-14519 - Multi insert query bug
  • HIVE-14538 - beeline throws exceptions with parsing hive config when using !sh statement
  • HIVE-14693 - Some partitions will be left out when partition number is the multiple of the option
  • HIVE-14697 - Can not access kerberized HS2 Web UI
  • HIVE-14715 - Hive throws NumberFormatException with query with Null value
  • HIVE-14743 - ArrayIndexOutOfBoundsException - HBASE-backed views' query with JOINs
  • HIVE-14762 - Add logging while removing scratch space
  • HIVE-14764 - Enabling "hive.metastore.metrics.enabled" throws OOM in HiveMetastore
  • HIVE-14784 - Operation logs are disabled automatically if the parent directory does not exist.
  • HIVE-14799 - Query operations are not thread safe during cancellation
  • HIVE-14805 - Subquery inside a view will have the object in the subquery as the direct input
  • HIVE-14817 - Shutdown the SessionManager timeoutChecker thread properly upon shutdown.
  • HIVE-14819 - FunctionInfo for permanent functions shows TEMPORARY FunctionType
  • HIVE-14820 - RPC server for spark inside HS2 is not getting server address properly
  • HIVE-15054 - Hive insertion query execution fails on Hive on Spark
  • HIVE-15061 - Metastore types are sometimes case sensitive
  • HIVE-15090 - Temporary DB failure can stop ExpiredTokenRemover thread
  • HIVE-15231 - query on view with CTE and alias fails with table not found error
  • HIVE-15291 - Comparison of timestamp fails if only date part is provided
  • HIVE-15338 - Wrong result from non-vectorized DATEDIFF with scalar parameter of type DATE/TIMESTAMP
  • HIVE-15346 - "values temp table" should not be an input
  • HIVE-15410 - WebHCat supports get/set table property with its name containing period and hyphen
  • HIVE-15517 - NOT (x <=> y) returns NULL if x or y is NULL
  • HIVE-15551 - memory leak in directsql for mysql+bonecp specific initialization
  • HIVE-15572 - Improve the response time for query canceling when it happens during acquiring locks
  • HIVE-15735 - In some cases, view objects inside a view do not have parents.
  • HIVE-15782 - query on parquet table returns incorrect result when hive.optimize.index.filter is set to true
  • HIVE-15872 - The PERCENTILE_APPROX UDAF does not work with empty set
  • HIVE-15997 - Resource leaks when query is cancelled
  • HIVE-16047 - Shouldn't try to get KeyProvider unless encryption is enabled
  • HIVE-16156 - FileSinkOperator should delete existing output target when renaming
  • HIVE-16175 - Possible race condition in InstanceCache
  • HIVE-16394 - HoS does not support queue name change in middle of session
  • HUE-3065 - [oozie] Sub-workflow submitted from coordinator gets parent workflow graph
  • HUE-3079 - [oozie] Some links of a Fork can point to deleted nodes
  • HUE-4147 - [useradmin] Ignore (objectclass=*) filter when searching for LDAP users
  • HUE-4386 - [oozie] Remove oozie.coord.application.path from properties when rerunning workflow
  • HUE-4462 - [oozie] Fix deployment_dir for the bundle in oozie example fixtures
  • HUE-4706 - [useradmin] update AuthenticationForm to allow activated users to login
  • HUE-4921 - [core] Skip idle session timeout relogin popup on running jb jobs call when idle session timeout is disabled
  • HUE-4941 - [jobbrowser] Unable to kill jobs with Resource Manager HA enabled
  • HUE-4969 - [security] Can't type any / in the HDFS ACLs path input
  • HUE-5158 - [editor] Older queries after upgrade do not provide direct save
  • HUE-5390 - [editor] Improve import testing of beeswax queries to notebook format
  • HUE-5659 - [editor] Add max limit of rows before truncation in the export / download query result
  • HUE-5679 - [yarn] Reset API_CACHE on logout
  • HUE-5714 - [yarn] Fix unittest for MR API Cache
  • HUE-6090 - [search] Typing in the search bar always redirects to the end of the input
  • HUE-6131 - [metastore] No information surfaced when LOAD data from Create table from file fails
  • HUE-6133 - [editor] Horizontal scrollbar can be hidden under the first fixed column
  • HUE-6144 - [editor] Make it possible to turn autocomplete on or off
  • HUE-6197 - [editor] Enable scrolling past the end of the editor
  • HUE-6228 - [editor] API for progress status and truncating warning when direct downloading results as Excel
  • IMPALA-1346, IMPALA-1590, IMPALA-2344 - Fix sorter buffer mgmt when spilling
  • IMPALA-1619, IMPALA-3018 - Address various small memory allocation related bugs
  • IMPALA-1619 - Support 64-bit allocations.
  • IMPALA-1657 - Rework detection and reporting of corrupt table stats.
  • IMPALA-2864 - Ensure that client connections are closed after a failed Open()
  • IMPALA-3018 - Don't return NULL on zero length allocations.
  • IMPALA-3159 - impala-shell does not accept wildcard or SAN certificates
  • IMPALA-3167 - Fix assignment of WHERE conjunct through grouping agg + OJ.
  • IMPALA-3314 - Fix Avro schema loading for partitioned tables.
  • IMPALA-3344 - Simplify sorter and document/enforce invariants.
  • IMPALA-3441, IMPALA-3659 - Check for malformed Avro data
  • IMPALA-3499 - Split catalog update
  • IMPALA-3552 - Make incremental stats max serialized size configurable
  • IMPALA-3575 - Add retry to backend connection request and rpc timeout
  • IMPALA-3628 - Fix cancellation from shell when security is enabled
  • IMPALA-3633 - cancel fragment if coordinator is gone
  • IMPALA-3646 - Handle corrupt RLE literal or repeat counts of 0.
  • IMPALA-3670 - fix sorter buffer mgmt bugs
  • IMPALA-3678 - Fix migration of predicates into union operands with an order by + limit.
  • IMPALA-3680 - Cleanup the scan range state after failed hdfs cache reads
  • IMPALA-3682 - Don't retry unrecoverable socket creation errors
  • IMPALA-3687 - Prefer Avro field name during schema reconciliation
  • IMPALA-3711 - Remove unnecessary privilege checks in getDbsMetadata()
  • IMPALA-3732 - handle string length overflow in avro files
  • IMPALA-3745 - parquet invalid data handling
  • IMPALA-3751 - fix clang build errors and warnings
  • IMPALA-3754 - fix TestParquet.test_corrupt_rle_counts flakiness
  • IMPALA-3776 - fix 'describe formatted' for Avro tables
  • IMPALA-3820 - Handle linkage errors while loading Java UDFs in Catalog
  • IMPALA-3861 - Replace BetweenPredicates with their equivalent CompoundPredicate.
  • IMPALA-3875 - Thrift threaded server hang in some cases
  • IMPALA-3884 - Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()
  • IMPALA-3915 - Register privilege and audit requests when analyzing resolved table refs.
  • IMPALA-3930, IMPALA-2570 - Fix shuffle insert hint with constant partition exprs.
  • IMPALA-3940 - Fix getting column stats through views.
  • IMPALA-3949 - Log the error message in FileSystemUtil.copyToLocal()
  • IMPALA-3964 - Fix crash when a count(*) is performed on a nested collection.
  • IMPALA-3965 - not exported as part of impala-shell build lib
  • IMPALA-3983, IMPALA-3974 - Delete function jar resources after load
  • IMPALA-4019 - initialize member variables in HdfsTableSink
  • IMPALA-4020 - Handle external conflicting changes to HMS gracefully
  • IMPALA-4037, IMPALA-4038 - Fix locking during query cancellation
  • IMPALA-4049 - fix empty batch handling NLJ build side
  • IMPALA-4076 - Fix runtime filter sort compare method
  • IMPALA-4099 - Fix the error message while loading UDFs with no JARs
  • IMPALA-4120 - Incorrect results with LEAD() analytic function
  • IMPALA-4135 - Thrift threaded server times-out connections during high load
  • IMPALA-4153 - Fix count(*) on all blank('') columns - test
  • IMPALA-4170 - Fix identifier quoting in COMPUTE INCREMENTAL STATS.
  • IMPALA-4180 - Synchronize accesses to RuntimeState::reader_contexts_
  • IMPALA-4196 - Cross compile bit-byte-functions
  • IMPALA-4223 - Handle truncated file read from HDFS cache
  • IMPALA-4237 - Fix materialization of 4 byte decimals in data source scan node.
  • IMPALA-4246 - SleepForMs() utility function has undefined behavior for > 1s
  • IMPALA-4260 - Alter table add column drops all the column stats
  • IMPALA-4263 - Fix wrong omission of agg/analytic hash exchanges.
  • IMPALA-4266 - Java udf returning string can give incorrect results
  • IMPALA-4282 - Remove max length check for type strings.
  • IMPALA-4293 - query profile should include error log
  • IMPALA-4295 - XFAIL wildcard SSL test
  • IMPALA-4336 - Cast exprs after unnesting union operands.
  • IMPALA-4363 - Add Parquet timestamp validation
  • IMPALA-4383 - Ensure plan fragment report thread is always started
  • IMPALA-4391 - fix dropped statuses in scanners
  • IMPALA-4423 - Correct but conservative implementation of Subquery.equals().
  • IMPALA-4433 - Always generate testdata using the same time zone setting
  • IMPALA-4449 - Revisit table locking pattern in the catalog. This commit fixes an issue where multiple long-running operations on the same catalog object (e.g. table) can block other catalog operations from making progress.
  • IMPALA-4488 - HS2 GetOperationStatus() should keep session alive
  • IMPALA-4518 - CopyStringVal() doesn't copy null string
  • IMPALA-4539 - fix bug when scratch batch references I/O buffers
  • IMPALA-4550 - Fix CastExpr analysis for substituted slots
  • IMPALA-4579 - SHOW CREATE VIEW fails for view containing a subquery
  • IMPALA-4765 - Avoid using several loading threads on one table.
  • IMPALA-4767 - Workaround for HIVE-15653 to preserve table stats.
  • IMPALA-4779, IMPALA-4780 - Fix conditional functions built-in and Timestamp bounds
  • IMPALA-4787 - Optimize APPX_MEDIAN() memory usage
  • IMPALA-4916 - Fix maintenance of set of item sets in DisjointSet.
  • IMPALA-4995 - Fix integer overflow in TopNNode::PrepareForOutput
  • IMPALA-4997 - Fix overflows in Sorter::TupleIterator
  • IMPALA-5005 - Don't allow server to send SASL COMPLETE msg out of order
  • IMPALA-5088 - Fix heap buffer overflow
  • IMPALA-5253 - Use appropriate transport for StatestoreSubscriber
  • OOZIE-1814 - Oozie should mask any passwords in logs and REST interfaces
  • OOZIE-2068 - Configuration as part of sharelib
  • OOZIE-2194 - oozie job -kill doesn't work with spark action
  • OOZIE-2243 - Kill Command does not kill the child job for java action
  • OOZIE-2314 - Unable to kill old instance child job by workflow or coord rerun by Launcher
  • OOZIE-2329 - Make handling yarn restarts configurable
  • OOZIE-2345 - Parallel job submission for forked actions
  • OOZIE-2347 - Remove unnecessary new Configuration()/new jobConf() calls from oozie
  • OOZIE-2436 - Fork/join workflow fails with oozie.action.yarn.tag must not be null
  • OOZIE-2504 - Create a under HADOOP_CONF_DIR in Shell Action
  • OOZIE-2533 - Patch-1550 - workaround for
  • OOZIE-2555 - Oozie SSL enable setup does not return port for admin -servers
  • OOZIE-2567 - HCat connection is not closed while getting hcat cred
  • OOZIE-2584 - Eliminate Thread.sleep() calls in TestMemoryLocks
  • OOZIE-2589 - CompletedActionXCommand is hardcoded to wrong priority
  • OOZIE-2649 - Can't override sub-workflow configuration property if defined in parent workflow XML
  • OOZIE-2656 - OozieShareLibCLI uses op system username instead of Kerberos to upload jars
  • OOZIE-2678 - Oozie job -kill doesn't work with tez jobs
  • OOZIE-2739 - Remove property expansion pattern from ShellMain's log4j properties content
  • OOZIE-2742 - Unable to kill applications based on tag
  • OOZIE-2777 - Config-default.xml longer than 64k results in
  • OOZIE-2818 - Can't overwrite on a per-workflow basis
  • PIG-3807 - Pig creates wrong schema after dereferencing nested tuple fields with sorts
  • PIG-3818 - PIG-2499 is accidentally reverted
  • PIG-3970 - Increase PermGen size, tests ran out of memory
  • PIG-4052 - TestJobControlSleep, TestInvokerSpeed are unreliable
  • SENTRY-1201 - Sentry ignores database prefix for MSCK statement
  • SENTRY-1265 - Sentry service should not require a TGT as it is not talking to other kerberos services as a client
  • SENTRY-1311 - Improve usability of URI privileges by supporting mixed use of URIs with and without scheme
  • SENTRY-1313 - Database prefix is not honoured when executing grant statement
  • SENTRY-1345 - ACLS on table folder disappear after insert for unpartitioned tables
  • SENTRY-1520 - Provide mechanism for triggering HMS full snapshot
  • SOLR-5776 - Backport: Enabled SSL tests can easily exhaust random generator entropy and block. Set the server side to SHA1PRNG as in Steve's original patch. Use less SSL in a test run. Refactor SSLConfig so that SSLTestConfig can provide SSLContexts using a NullSecureRandom to prevent SSL tests from blocking on entropy-starved machines. Alternate (pseudo-random) NullSecureRandom for Constants.SUN_OS. Replace NullSecureRandom with NotSecurePsuedoRandom.
  • SOLR-6295 - Fix child filter query creation to never match parent docs in SolrExampleTests
  • SOLR-7280 - /Missing test resources
  • SOLR-7280 - Backport: Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts
  • SOLR-7866 - Harden code to prevent an unhandled NPE when trying to determine the max value of the version field.
  • SOLR-9091 - ZkController#publishAndWaitForDownStates logic is inefficient
  • SOLR-9236 - AutoAddReplicas will append an extra /tlog to the update log location on replica failover.
  • SOLR-9284 - The HDFS BlockDirectoryCache should not let its keysToRelease or names maps grow indefinitely.
  • SOLR-9310, SOLR-9524
  • SOLR-9330 - Fix AlreadyClosedException on admin/mbeans?stats=true
  • SOLR-9699, SOLR-4668 - Fix exception from core status in parallel with core reload
  • SOLR-9819 - Upgrade Apache commons-fileupload to 1.3.2, fixing a security vulnerability
  • SOLR-9848 - Lower back down from 7 seconds.
  • SOLR-9859 - backport cannot be updated after being written and neither or are durable in the face of a crash. Don't log error on NoSuchFileException
  • SOLR-9901 - backport of SOLR-9899 Implement move in HdfsDirectoryFactory. SOLR-9899: StandardDirectoryFactory should use optimizations for all FilterDirectorys not just NRTCachingDirectory.
  • SOLR-10031 - Validation of filename params in ReplicationHandler
  • SOLR-10114 - backport of SOLR-9941 - Reordered delete-by-query can delete or omit child documents
  • SOLR-10119 - TestReplicationHandler assertion fixes part of
  • SOLR-10121, SOLR-10116 - BlockCache corruption with high concurrency
  • SOLR-10338 - Backport: Configure SecureRandom non-blocking for tests.
  • SPARK-8428 - [SPARK-13850] Fix integer overflows in TimSort
  • SPARK-12009 - [YARN] Avoid re-allocating YARN containers while the driver wants to stop all executors
  • SPARK-12241 - [YARN] Improve failure reporting in Yarn client obtainTokenForHBase()
  • SPARK-12339 - [SPARK-11206][WEBUI] Added a null check that was removed in
  • SPARK-12392 - [CORE] Optimize a location order of broadcast blocks by considering preferred local hosts
  • SPARK-12523 - [YARN] Support long-running of the Spark On HBase and hive meta store.
  • SPARK-12941 - [SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype
  • SPARK-12966 - [SQL] ArrayType(DecimalType) support in Postgres JDBC
  • SPARK-13112 - [CORE] Make sure RegisterExecutorResponse arrive before LaunchTask
  • SPARK-13242 - [SQL] codegen fallback in case-when if there are many branches
  • SPARK-13328 - [CORE] Poor read performance for broadcast variables with dynamic resource allocation
  • SPARK-13566 - [CORE] Avoid deadlock between BlockManager and Executor Thread
  • SPARK-13958 - Executor OOM due to unbounded growth of pointer array
  • SPARK-14204 - [SQL] register driverClass rather than user-specified class
  • SPARK-14391 - [LAUNCHER] Fix launcher communication test, take 2.
  • SPARK-14963 - [MINOR][YARN] Fix typo in YarnShuffleService recovery file name. Using recoveryPath if NM recovery is enabled
  • SPARK-15165 - [SPARK-15205] [SQL] Introduce place holder for comments in generated code
  • SPARK-16044 - [SQL] Backport input_file_name() for data source based on NewHadoopRDD to branch 1.6
  • SPARK-16106 - [CORE] TaskSchedulerImpl should properly track executors added to existing hosts
  • SPARK-16230 - [CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor
  • SPARK-16505 - [YARN] Optionally propagate error during shuffle service startup.
  • SPARK-16625 - [SQL] General data types to be mapped to Oracle
  • SPARK-16711 - YarnShuffleService doesn't re-init properly on YARN rolling upgrade
  • SPARK-16873 - [CORE] Fix SpillReader NPE when spillFile has no data
  • SPARK-17171 - [WEB UI] DAG will list all partitions in the graph
  • SPARK-17245 - [SQL][BRANCH-1.6] Do not rely on Hive's session state to retrieve HiveConf
  • SPARK-17433 - YarnShuffleService doesn't handle moving credentials levelDb
  • SPARK-17465 - [SPARK CORE] Inappropriate memory management in `` may lead to memory leak
  • SPARK-17611 - [YARN][TEST] Make shuffle service test really test auth.
  • SPARK-17644 - [CORE] Do not add failedStages when abortStage for fetch failure
  • SPARK-17696 - [SPARK-12330][CORE] Partial backport of to branch-1.6.
  • SPARK-18750 - [YARN] Avoid using "mapValues" when allocating containers.
  • SPARK-19178 - [SQL][Backport-to-1.6] convert string of large numbers to int should return null
  • SPARK-19263 - DAGScheduler should avoid sending conflicting task set.
  • SPARK-19537 - Move pendingPartitions to ShuffleMapStage.
  • SQOOP-2349 - Add command line option for setting transaction isolation levels for metadata queries
  • SQOOP-2561 - Special Character removal from Column name as avro data results in duplicate column and fails the import
  • SQOOP-2846 - Sqoop Export with update-key failing for avro data file
  • SQOOP-2884 - Document --temporary-rootdir
  • SQOOP-2896 - Sqoop exec job fails with SQLException Access denied for user
  • SQOOP-2906 - Optimization of AvroUtil.toAvroIdentifier
  • SQOOP-2909 - Oracle related ImportTest fails after SQOOP-2737
  • SQOOP-2911 - Fix failing HCatalogExportTest caused by SQOOP-2863
  • SQOOP-2915 - Fixing Oracle related unit tests
  • SQOOP-2920 - sqoop performance deteriorates significantly on wide datasets; sqoop 100% on cpu
  • SQOOP-2950 - Sqoop trunk has consistent UT failures - need fixing
  • SQOOP-2952 - Fixing bug
  • SQOOP-2971 - OraOop does not close connections properly
  • SQOOP-2983 - OraOop export has degraded performance with wide tables
  • SQOOP-2986 - Add validation check for --hive-import and --incremental lastmodified
  • SQOOP-2990 - Sqoop(oracle) export [updateTableToOracle] with "--update-mode allowinsert" : app fails with java.sql.SQLException: Missing IN or OUT parameter at index
  • SQOOP-2995 - backward incompatibility introduced by Custom Tool options
  • SQOOP-2999 - Sqoop ClassNotFoundException (org.apache.commons.lang3.StringUtils) is thrown when executing Oracle direct import map task
  • SQOOP-3013 - Configuration "tmpjars" is not checked for empty strings before passing to MR
  • SQOOP-3021 - ClassWriter fails if a column name contains a backslash character
  • SQOOP-3028 - Include stack trace in the logging of exceptions in ExportTool
  • SQOOP-3034 - HBase import should fail fast if using anything other than as-textfile
  • SQOOP-3053 - Create a cmd line argument for sqoop.throwOnError and use it through SqoopOptions
  • SQOOP-3055 - Fixing MySQL tests failing due to ignored test inputs/configuration
  • SQOOP-3057 - Fixing 3rd party Oracle tests failing due to invalid case of column names
  • SQOOP-3066 - Introduce an option + env variable to enable/disable SQOOP-2737 feature
  • SQOOP-3068 - Enhance error (tool.ImportTool: Encountered IOException running import job: Expected schema) to suggest workaround
  • SQOOP-3069 - Get OracleExportTest#testUpsertTestExport in line with SQOOP-3066
  • SQOOP-3071 - Fix OracleManager to apply localTimeZone correctly in case of Date objects too
  • SQOOP-3072 - Reenable escaping in ImportTest#testProductWithWhiteSpaceImport for proper execution
  • SQOOP-3081 - use OracleEscapeUtils.escapeIdentifier in OracleUpsertOutputFormat instead of inline appending quotes
  • SQOOP-3123 - Introduce escaping logic for column mapping parameters (the same logic Sqoop already uses for DB column names), so that special column names (e.g. containing the '#' character) and the mappings related to those columns use the same format. This avoids confusing end users and eliminates the related Avro format clashes.
  • SQOOP-3124 - Fix ordering in column list query of PostgreSQL connector to reflect the logical order instead of adhoc ordering
  • SQOOP-3159 - Sqoop (export + --table) with Oracle table_name having '$' fails with error
