Issues Fixed in CDH 5.3.x

Issues Fixed in CDH 5.3.10

CDH 5.3.10 fixes the following issues.

Apache Hadoop

FSImage may get corrupted after deleting snapshot

Bug: HDFS-9406

When deleting a snapshot that contains the last record of a given INode, the fsimage may become corrupt because the create list of the snapshot diff in the previous snapshot and the child list of the parent INodeDirectory are not cleaned.

Apache HBase

The ReplicationCleaner process can abort if its connection to ZooKeeper is inconsistent

Bug: HBASE-15234

If the connection with ZooKeeper is inconsistent, the ReplicationCleaner may abort, and the following event is logged by the HMaster:

WARN org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Aborting ReplicationLogCleaner
because Failed to get list of replicators

Unprocessed WALs accumulate.

The seekBefore() method calculates the size of the previous data block by assuming that data blocks are contiguous, and HFile v2 and higher store Bloom blocks and leaf-level INode blocks with the data. As a result, reverse scans do not work when Bloom blocks or leaf-level INode blocks are present when HFile v2 or higher is used.

Workaround: Restart the HMaster occasionally. The ReplicationCleaner restarts if necessary and processes the unprocessed WALs.

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.10:

  • HADOOP-7713 - dfs -count -q should label output column
  • HADOOP-8944 - Shell command fs -count should include human readable option
  • HADOOP-10406 - TestIPC.testIpcWithReaderQueuing may fail
  • HADOOP-12200 - TestCryptoStreamsWithOpensslAesCtrCryptoCodec should be skipped in non-native profile
  • HADOOP-12240 - Fix tests requiring native library to be skipped in non-native profile
  • HADOOP-12280 - Skip unit tests based on maven profile rather than NativeCodeLoader.isNativeCodeLoaded
  • HADOOP-12418 - TestRPC.testRPCInterruptedSimple fails intermittently
  • HADOOP-12464 - Interrupted client may try to fail over and retry
  • HADOOP-12468 - Partial group resolution failure should not result in user lockout
  • HADOOP-12559 - KMS connection failures should trigger TGT renewal
  • HADOOP-12604 - Exception may be swallowed in KMSClientProvider
  • HADOOP-12605 - Fix intermittent failure of TestIPC.testIpcWithReaderQueuing
  • HADOOP-12682 - Fix TestKMS#testKMSRestart* failure
  • HADOOP-12699 - TestKMS#testKMSProvider intermittently fails during 'test rollover draining'
  • HADOOP-12715 - TestValueQueue#testgetAtMostPolicyALL fails intermittently
  • HADOOP-12736 - TestTimedOutTestsListener#testThreadDumpAndDeadlocks sometimes times out
  • HADOOP-12788 - OpensslAesCtrCryptoCodec should log which random number generator is used
  • HDFS-6533 - TestBPOfferService#testBasicFunctionalitytest fails intermittently
  • HDFS-6673 - Add delimited format support to PB OIV tool
  • HDFS-6799 - The invalidate method in SimulatedFSDataset failed to remove (invalidate) blocks from the file system
  • HDFS-7423 - Various typos and message formatting fixes in nfs daemon and doc
  • HDFS-7553 - Fix the TestDFSUpgradeWithHA due to BindException
  • HDFS-7990 - IBR delete ack should not be delayed
  • HDFS-8211 - DataNode UUID is always null in the JMX counter
  • HDFS-8646 - Prune cached replicas from DatanodeDescriptor state on replica invalidation
  • HDFS-9092 - NFS silently drops overlapping write requests and causes data copying to fail
  • HDFS-9250 - Add Precondition check to LocatedBlock#addCachedLoc
  • HDFS-9347 - Invariant assumption in TestQuorumJournalManager.shutdown() is wrong
  • HDFS-9358 - TestNodeCount#testNodeCount timed out
  • HDFS-9364 - Unnecessary DNS resolution attempts when creating NameNodeProxies
  • HDFS-9406 - FSImage may get corrupted after deleting snapshot
  • HDFS-9949 - Add a test case to ensure that the DataNode does not regenerate its UUID when a storage directory is cleared
  • MAPREDUCE-6302 - Incorrect headroom can lead to a deadlock between map and reduce allocations
  • MAPREDUCE-6387 - Serialize the recently added Task#encryptedSpillKey field at the end
  • MAPREDUCE-6460 - TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
  • YARN-2377 - Localization exception stack traces are not passed as diagnostic info
  • YARN-2785 - Fixed intermittent TestContainerResourceUsage failure
  • YARN-3024 - LocalizerRunner should give DIE action when all resources are localized
  • YARN-3074 - Nodemanager dies when localizer runner tries to write to a full disk
  • YARN-3464 - Race condition in LocalizerRunner kills localizer before localizing all resources.
  • YARN-3516 - Killing ContainerLocalizer action does not take effect when private localizer receives FETCH_FAILURE status
  • YARN-3727 - For better error recovery, check if the directory exists before using it for localization
  • YARN-3762 - FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
  • YARN-4204 - ConcurrentModificationException in FairSchedulerQueueInfo
  • YARN-4235 - FairScheduler PrimaryGroup does not handle empty groups returned for a user
  • YARN-4354 - Public resource localization fails with NPE
  • YARN-4380 - TestResourceLocalizationService.testDownloadingResourcesOnContainerKill fails intermittently
  • YARN-4393 - Fix intermittent test failure for TestResourceLocalizationService#testFailedDirsResourceRelease
  • YARN-4613 - Fix test failure in TestClientRMService#testGetClusterNodes
  • YARN-4717 - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails intermittently due to IllegalArgumentException from cleanup
  • HBASE-10153 - Improve VerifyReplication to compute BADROWS more accurately
  • HBASE-11394 - Amend: Replication can have data loss if peer id contains hyphen
  • HBASE-11394 - Replication can have data loss if peer id contains hyphen "-"
  • HBASE-11992 - Backport HBASE-11367 (Pluggable replication endpoint) to 0.98
  • HBASE-12136 - Race condition between client adding tableCF replication znode and server triggering TableCFsTracker
  • HBASE-12150 - Backport replication changes from HBASE-12145
  • HBASE-12336 - RegionServer failed to shutdown for NodeFailoverWorker thread
  • HBASE-12631 - Backport HBASE-12576 (Add metrics for rolling the HLog if there are too few DNs in the write pipeline) to 0.98
  • HBASE-12658 - Backport HBASE-12574 (Update replication metrics to not do so many map look ups) to 0.98
  • HBASE-12865 - WALs may be deleted before they are replicated to peers
  • HBASE-13035 - Backport HBASE-12867 Shell does not support custom replication endpoint specification
  • HBASE-13084 - Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey
  • HBASE-13437 - ThriftServer leaks ZooKeeper connections
  • HBASE-13703 - ReplicateContext should not be a member of ReplicationSource
  • HBASE-13746 - list_replicated_tables command is not listing table in hbase shell
  • HBASE-14146 - Once replication sees an error it slows down forever
  • HBASE-14501 - NPE in replication with TDE
  • HBASE-14621 - ReplicationLogCleaner gets stuck when a regionserver crashes
  • HBASE-14923 - VerifyReplication should not mask the exception during result comparison
  • HBASE-15019 - Replication stuck when HDFS is restarted
  • HBASE-15032 - hbase shell scan filter string assumes UTF-8 encoding
  • HBASE-15035 - bulkloading hfiles with tags that require splits do not preserve tags
  • HBASE-15052 - Use EnvironmentEdgeManager in ReplicationSource
  • HIVE-7524 - Enable auto conversion of SMBjoin in presence of constant propagate optimization
  • HIVE-7575 - GetTables thrift call is very slow
  • HIVE-8115 - Fixing test failures caused in CDH
  • HIVE-8115 - Hive select query hang when fields contain map
  • HIVE-8184 - Inconsistency between colList and columnExprMap when ConstantPropagate is applied to subquery
  • HIVE-9112 - Query may generate different results depending on the number of reducers
  • HIVE-9500 - Support nested structs over 24 levels
  • HIVE-9860 - MapredLocalTask/SecureCmdDoAs leaks local files
  • HIVE-10956 - Fallout fix from backport to CDH 5.3.x
  • HIVE-11977 - Hive should handle an external avro table with zero length files present
  • HIVE-12388 - GetTables cannot get external tables when TABLE type argument is given
  • HIVE-12406 - HIVE-9500 introduced incompatible change to LazySimpleSerDe public interface
  • HIVE-12713 - Miscellaneous improvements in driver compile and execute logging
  • HIVE-12790 - Metastore connection leaks in HiveServer2
  • HIVE-12946 - alter table should also add default scheme and authority for the location similar to create table
  • HUE-2767 - [impala] Issue showing sample data for a table
  • HUE-2941 - [hadoop] Cache the active RM HA
  • IMPALA-1702 - "invalidate metadata" can cause duplicate TableIds (issue not entirely fixed, but now fails gracefully)
  • IMPALA-2125 - Improve perf when reading timestamps from parquet files written by hive
  • IMPALA-2565 - Planner tests are flaky due to file size mismatches
  • IMPALA-3095 - Allow additional Kerberos users to be authorized to access internal APIs
  • OOZIE-2432 - TestPurgeXCommand fails
  • SENTRY-565 - Improve performance of filtering Hive SHOW commands
  • SENTRY-780 - HDFS Plugin should not execute path callbacks for views
  • SENTRY-835 - Drop table leaves a connection open when using metastorelistener
  • SENTRY-885 - DB name should be case insensitive in HDFS sync plugin.
  • SENTRY-936 - getGroup and getUser should always return original hdfs values for paths in prefix which are not sentry managed
  • SENTRY-944 - Setting HDFS rules on Sentry managed hdfs paths should not affect original hdfs rules
  • SENTRY-957 - Exceptions in MetastoreCacheInitializer should probably not prevent HMS from starting up
  • SENTRY-988 - It is better to let SentryAuthorization setter path always fall through and update HDFS
  • SENTRY-994 - SentryAuthorizationInfoX should override isSentryManaged
  • SENTRY-1002 - PathsUpdate.parsePath(path) will throw an NPE when parsing relative paths
  • SENTRY-1044 - Tables with non-hdfs locations break HMS startup
  • SPARK-12617 - [PYSPARK] Move Py4jCallbackConnectionCleaner to Streaming

Issues Fixed in CDH 5.3.9

Apache Commons Collections deserialization vulnerability

Cloudera has learned of a potential security vulnerability in a third-party library called the Apache Commons Collections. This library is used in products distributed and supported by Cloudera (“Cloudera Products”), including core Apache Hadoop. The Apache Commons Collections library is also in widespread use beyond the Hadoop ecosystem. At this time, no specific attack vector for this vulnerability has been identified as present in Cloudera Products.

In an abundance of caution, we are currently in the process of incorporating a version of the Apache Commons Collections library with a fix into the Cloudera Products. In most cases, this will require coordination with the projects in the Apache community. One example of this is tracked by HADOOP-12577.

The Apache Commons Collections potential security vulnerability is titled “Arbitrary remote code execution with InvokerTransformer” and is tracked by COLLECTIONS-580. MITRE has not issued a CVE, but related CVE-2015-4852 has been filed for the vulnerability. CERT has issued Vulnerability Note #576313 for this issue.

Releases affected: CDH 5.5.0, CDH 5.4.8 and lower, CDH 5.3.8 and lower, CDH 5.2.8 and lower, CDH 5.1.7 and lower, Cloudera Manager 5.5.0, Cloudera Manager 5.4.8 and lower, Cloudera Manager 5.3.8 and lower, Cloudera Manager 5.2.8 and lower, Cloudera Manager 5.1.6 and lower, Cloudera Navigator 2.4.0, and Cloudera Navigator 2.3.8 and lower.

Users affected: All

Impact: This potential vulnerability may enable an attacker to execute arbitrary code from a remote machine without requiring authentication.

Immediate action required: Upgrade to Cloudera Manager 5.5.1 and CDH 5.5.1; Cloudera Manager 5.4.9 and CDH 5.4.9; Cloudera Manager 5.3.9 and CDH 5.3.9; Cloudera Manager 5.2.9 and CDH 5.2.9; or Cloudera Manager 5.1.7 and CDH 5.1.7.

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.9:

  • FLUME-2841 - Upgrade commons-collections to 3.2.2
  • HADOOP-12577 - Bumped up commons-collections version to 3.2.2 to address a security flaw
  • HDFS-7785 - Improve diagnostics information for HttpPutFailedException
  • HDFS-7798 - Checkpointing failure caused by shared KerberosAuthenticator
  • HDFS-7871 - NameNodeEditLogRoller can keep printing 'Swallowing exception' message
  • HDFS-9123 - Copying from the root to a subdirectory should be forbidden
  • HDFS-9273 - ACLs on root directory may be lost after NN restart
  • HDFS-9332 - Fix Precondition failures from NameNodeEditLogRoller while saving namespace
  • HDFS-9470 - Encryption zone on root not loaded from fsimage after NN restart
  • MAPREDUCE-6191 - Improve clearing stale state of Java serialization testcase
  • MAPREDUCE-6233 - org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
  • MAPREDUCE-6549 - Multibyte delimiters with LineRecordReader cause duplicate records
  • YARN-3564 - Fix TestContainerAllocation.testAMContainerAllocationWhenDNSUnavailable fails randomly
  • YARN-3602 - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails intermittently due to IOException from cleanup
  • YARN-3675 - FairScheduler: RM quits when node removal races with continuous-scheduling on the same node
  • HBASE-13134 - mutateRow and checkAndMutate APIs do not throw region level exceptions
  • HBASE-14196 - Thrift server idle connection timeout issue
  • HBASE-14283 - Reverse scan does not work with HFile inline index/bloom blocks
  • HBASE-14533 - Thrift client gets "AsyncProcess: Failed to get region location .... closed"
  • HBASE-14799 - Commons-collections object deserialization remote command execution vulnerability
  • HIVE-6099 - Multi insert does not work properly with distinct count
  • HIVE-7146 - posexplode() UDTF fails with a NullPointerException on NULL columns
  • HIVE-8612 - Support metadata result filter hooks
  • HIVE-9475 - HiveMetastoreClient.tableExists does not work
  • HIVE-10895 - ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
  • HIVE-11255 - get_table_objects_by_name() in HiveMetaStore.java needs to retrieve table objects in multiple batches
  • HIVE-12378 - Exception on HBaseSerDe.serialize binary field
  • HUE-3035 - [beeswax] Optimize sample data query for partitioned tables
  • IMPALA-1746 - QueryExecState does not check for query cancellation or errors
  • IMPALA-1756 - Constant filter expressions are not checked for errors and state cleanup not done before throwing exception
  • IMPALA-1917 - DCHECK on destroying an ExprContext
  • IMPALA-2141 - UnionNode::GetNext() does not check for query errors
  • IMPALA-2264 - Fix edge cases for decimal/integer cast
  • IMPALA-2514 - DCHECK on destroying an ExprContext
  • OOZIE-2413 - Kerberos credentials can expire if the KDC is slow to respond
  • PIG-3641 - Split "otherwise" producing incorrect output when combined with ColumnPruning
  • SPARK-11484 - [WEBUI] Using proxyBase set by spark instead of env
  • SPARK-11652 - [CORE] Remote code execution with InvokerTransformer

Issues Fixed in CDH 5.3.8

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.8:

  • CRUNCH-525 - Correct (more) accurate default scale factors for built-in MapFn implementations
  • CRUNCH-527 - Use hash smearing for partitioning
  • CRUNCH-528 - Improve Pair comparison
  • CRUNCH-535 - call initCredentials on the job
  • CRUNCH-536 - Refactor CrunchControlledJob.Hook interface and make it client-accessible
  • CRUNCH-539 - Fix reading WritableComparables bimap
  • CRUNCH-540 - Make AvroReflectDeepCopier serializable
  • CRUNCH-542 - Eliminate flaky Scrunch sampling test.
  • CRUNCH-543 - Have AvroPathPerKeyTarget handle child directories properly
  • CRUNCH-544 - Improve performance/serializability of materialized toMap.
  • CRUNCH-547 - Properly handle nullability for Avro union types
  • CRUNCH-548 - Have the AvroReflectDeepCopier use the class of the source object when constructing new instances instead of the target class
  • CRUNCH-551 - Make the use of Configuration objects consistent in CrunchInputSplit and CrunchRecordReader
  • CRUNCH-553 - Fix record drop issue that can occur w/From.formattedFile TableSources
  • FLUME-1934 - Spooling Directory Source dies on encountering zero-byte files.
  • FLUME-2095 - JMS source with TIBCO
  • FLUME-2385 - Remove incorrect log message at INFO level in Spool Directory Source.
  • FLUME-2753 - Error when specifying empty replace string in Search and Replace Interceptor
  • HADOOP-11105 - MetricsSystemImpl could leak memory in registered callbacks
  • HADOOP-11446 - S3AOutputStream should use shared thread pool to avoid OutOfMemoryError
  • HADOOP-11463 - Replace method-local TransferManager object with S3AFileSystem#transfers.
  • HADOOP-11584 - s3a file block size set to 0 in getFileStatus.
  • HADOOP-11607 - Reduce log spew in S3AFileSystem.
  • HADOOP-12317 - Applications fail on NM restart on some linux distro because NM container recovery declares AM container as LOST
  • HADOOP-12404 - Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class
  • HADOOP-12413 - AccessControlList should avoid calling getGroupNames in isUserInList with empty groups
  • HDFS-7978 - Add LOG.isDebugEnabled() guard for some LOG.debug(..)
  • HDFS-8384 - Allow NN to startup if there are files having a lease but are not under construction
  • HDFS-8964 - When validating the edit log, do not read at or beyond the file offset that is being written
  • HDFS-8965 - Harden edit log reading code against out of memory errors
  • MAPREDUCE-5918 - LineRecordReader can return the same decompressor to CodecPool multiple times
  • MAPREDUCE-5948 - org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well
  • MAPREDUCE-6277 - Job can post multiple history files if attempt loses connection to the RM
  • MAPREDUCE-6439 - AM may fail instead of retrying if RM shuts down during the allocate call.
  • MAPREDUCE-6481 - LineRecordReader may give incomplete record and wrong position/key information for uncompressed input sometimes
  • MAPREDUCE-6484 - Yarn Client uses local address instead of RM address as token renewer in a secure cluster when RM HA is enabled
  • YARN-3385 - Fixed a race-condition in ResourceManager's ZooKeeper based state-store to avoid crashing on duplicate deletes
  • YARN-3469 - ZKRMStateStore: Avoid setting watches that are not required.
  • YARN-3990 - AsyncDispatcher may be overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
  • HBASE-12639 - Backport HBASE-12565 Race condition in HRegion.batchMutate() causes partial data to be written when region closes
  • HBASE-13217 - Procedure fails due to ZK issue
  • HBASE-13388 - Handling NullPointer in ZKProcedureMemberRpcs while getting ZNode data
  • HBASE-13437 - ThriftServer leaks ZooKeeper connections
  • HBASE-13471 - Fix a possible infinite loop in doMiniBatchMutation
  • HBASE-13684 - Allow mlockagent to be used when not starting as root
  • HBASE-13885 - ZK watches leaks during snapshots.
  • HBASE-14045 - Bumping thrift version to 0.9.2.
  • HBASE-14302 - TableSnapshotInputFormat should not create back references when restoring snapshot
  • HBASE-14354 - Minor improvements for usage of the mlock agent
  • HIVE-4867 - Deduplicate columns appearing in both the key list and value list of ReduceSinkOperator
  • HIVE-7012 - Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
  • HIVE-8162 - Dynamic sort optimization propagates additional columns even in the absence of order by
  • HIVE-8398 - ExprNodeColumnDesc cannot be cast to ExprNodeConstantDesc
  • HIVE-8404 - ColumnPruner doesn't prune columns from limit operator
  • HIVE-8560 - SerDes that do not inherit AbstractSerDe do not get table properties during initialize()
  • HIVE-9195 - CBO changes constant to column type
  • HIVE-9450 - [Parquet] Check all data types work for Parquet in Group
  • HIVE-9613 - Left join query plan outputs wrong column when using subquery
  • HIVE-9984 - JoinReorder's getOutputSize is exponential
  • HIVE-10319 - Hive CLI startup takes a long time with a large number of databases
  • HIVE-10572 - Improve Hive service test to check empty string
  • HIVE-11077 - Exchange partition does not properly populate fields for post/pre execute hooks.
  • HIVE-11172 - Retrofit Q-Test + Vectorization wrong results for aggregate query with where clause without group by
  • HIVE-11172 - Vectorization wrong results for aggregate query with where clause without group by
  • HIVE-11174 - Hive does not treat floating point signed zeros as equal (-0.0 should equal 0.0 according to IEEE floating point spec)
  • HIVE-11203 - Beeline force option does not force execution when errors occurred in a script.
  • HIVE-11216 - UDF GenericUDFMapKeys throws NPE when a null map value is passed in
  • HIVE-11271 - java.lang.IndexOutOfBoundsException when union all with if function
  • HIVE-11288 - Avro SerDe InstanceCache returns incorrect schema
  • HIVE-11333 - ColumnPruner prunes columns of UnionOperator that should be kept
  • HIVE-11590 - AvroDeserializer is very chatty
  • HIVE-11657 - HIVE-2573 introduces some issues during metastore init (and CLI init)
  • HIVE-11695 - If the user has no permission to create LOCAL DIRECTORY, the HQL does not throw any exception and fails silently.
  • HIVE-11696 - Exception when table-level serde is Parquet while partition-level serde is JSON
  • HIVE-11816 - Upgrade groovy to 2.4.4
  • HIVE-11824 - Insert to local directory causes staging directory to be copied
  • HIVE-11995 - Remove repetitively setting permissions in insert/load overwrite partition
  • HUE-2880 - [hadoop] Fix uploading large files to a kerberized HTTPFS
  • HUE-2893 - [desktop] Backport CherryPy SSL file upload fix
  • IMPALA-1929 - Avoiding a DCHECK of NULL hash table in spilled right joins
  • IMPALA-2133 - Properly unescape string value for HBase filters
  • IMPALA-2165 - Avoid cardinality 0 in scan nodes of small tables and low selectivity
  • IMPALA-2178 - fix Expr::ComputeResultsLayout() logic
  • IMPALA-2314 - LargestSpilledPartition was not checking if partition is closed
  • IMPALA-2364 - Wrong DCHECK in PHJ::ProcessProbeBatch
  • KITE-1053 - Fix int overflow bug in FS writer.
  • KITE-1074 - Partial updates aka Atomic updates with loadSolr aren't recognized with Solrcloud
  • MAHOUT-1771 - Cluster dumper omits indices and 0 elements for dense vector or sparse containing 0s
  • PIG-4024 - TestPigStreamingUDF and TestPigStreaming fail on IBM JDK
  • PIG-4326 - AvroStorageSchemaConversionUtilities does not properly convert schema for maps of arrays of records
  • PIG-4338 - Fix test failures with JDK8
  • SENTRY-799 - Fix TestDbEndToEnd flaky test - drop table/dbs before creating
  • SENTRY-878 - collect_list missing from HIVE_UDF_WHITE_LIST
  • SENTRY-893 - Synchronize calls in SentryClient and create sentry client once per request in SimpleDBProvider
  • SOLR-5496 - Ensure all http CMs get shutdown.
  • SOLR-7956 - There are interrupts on shutdown in places that can cause ChannelAlreadyClose
  • SOLR-7999 - SolrRequestParserTest#testStreamURL started failing.
  • SPARK-6480 - [CORE] histogram() bucket function is wrong in some simple edge cases
  • SPARK-6880 - [CORE] Fixed null check when all the dependent stages are cancelled due to previous stage failure
  • SPARK-8606 - Prevent exceptions in RDD.getPreferredLocations() from crashing DAGScheduler

Issues Fixed in CDH 5.3.6

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.6:

  • CRUNCH-516 - Scrunch needs some additional null checks
  • CRUNCH-508 - Improve performance of Scala Enumeration counters in Scrunch
  • CRUNCH-514 - AvroDerivedDeepCopier should initialize delegate MapFns
  • CRUNCH-530 - Fix object reuse bug in GenericRecordToTuple
  • HADOOP-12158 - Improve error message in TestCryptoStreamsWithOpensslAesCtrCryptoCodec when OpenSSL is not installed
  • HADOOP-11711 - Provide a default value for AES/CTR/NoPadding CryptoCodec classes
  • HADOOP-12103 - Small refactoring of DelegationTokenAuthenticationFilter to allow code sharing
  • HADOOP-8151 - Error handling in snappy decompressor throws invalid exceptions
  • HADOOP-11969 - ThreadLocal initialization in several classes is not thread safe
  • HDFS-7443 - Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume
  • HDFS-8337 - Accessing httpfs via webhdfs doesn't work from a jar with kerberos
  • HDFS-7546 - Document, and set an accepting default for dfs.namenode.kerberos.principal.pattern
  • HDFS-8656 - Preserve compatibility of ClientProtocol#rollingUpgrade after finalization
  • HDFS-7894 - Rolling upgrade readiness is not updated in jmx until query command is issued.
  • HDFS-8127 - NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
  • HDFS-3443 - Fix NPE when namenode transition to active during startup by adding checkNNStartup() in NameNodeRpcServer
  • YARN-3143 - RM Apps REST API can return NPE or entries missing id and other fields
  • HBASE-13995 - ServerName is not fully case insensitive
  • HBASE-13430 - HFiles that are in use by a table cloned from a snapshot may be deleted when that snapshot is deleted
  • HBASE-12539 - HFileLinkCleaner logs are uselessly noisy
  • HBASE-11898 - CoprocessorHost.Environment should cache class loader instance
  • HBASE-13826 - Unable to create table when group acls are appropriately set.
  • HBASE-13241 - Add tests for group level grants
  • HBASE-13239 - HBase grant at specific column level does not work for Groups
  • HBASE-13789 - ForeignException should not be sent to the client
  • HBASE-13779 - Calling table.exists() before table.get() end up with an empty Result
  • HBASE-13780 - Default to 700 for HDFS root dir permissions for secure deployments
  • HBASE-13768 - ZooKeeper znodes are bootstrapped with insecure ACLs in a secure configuration
  • HBASE-13767 - Allow ZKAclReset to set and not just clear ZK ACLs
  • HBASE-13086 - Show ZK root node on Master WebUI
  • HBASE-13342 - Fix incorrect interface annotations
  • HBASE-13162 - Add capability for cleaning hbase acls to hbase cleanup script.
  • HBASE-12641 - Grant all permissions of hbase zookeeper node to hbase superuser in a secure cluster
  • HBASE-12414 - Move HFileLink.exists() to base class
  • HIVE-11150 - Remove wrong warning message related to chgrp
  • HIVE-8318 - Null Scan optimizer throws exception when no partitions are selected
  • HIVE-7385 - Optimize for empty relation scans
  • HIVE-7299 - Enable metadata only optimization on Tez
  • HIVE-10808 - Inner join on Null throwing Cast Exception
  • HIVE-9087 - The move task does not handle properly in the case of loading data from the local file system path.
  • HIVE-9325 - Handle the case of insert overwrite statement with a qualified path that the destination path does not have a schema.
  • HIVE-9349 - Remove the schema in the getQualifiedPathWithoutSchemeAndAuthority method
  • HIVE-9328 - Tests cannot move files due to change on HIVE-9325
  • HIVE-6024 - Load data local inpath unnecessarily creates a copy task
  • HIVE-10841 - [WHERE col is not null] does not work sometimes for queries with many JOIN statements
  • HIVE-9620 - Cannot retrieve column statistics using HMS API if column name contains uppercase characters
  • HIVE-8863 - Cannot drop table with uppercase name after "compute statistics for columns"
  • HIVE-10629 - Dropping table in an encrypted zone does not drop warehouse directory
  • HIVE-10630 - Renaming tables across encryption zones renames table even though the operation throws error
  • HIVE-10956 - HS2 leaks HMS connections
  • HIVE-8298 - Incorrect results for n-way join when join expressions are not in same order across joins
  • HIVE-8895 - bugs in mergejoin
  • HIVE-10771 - "separatorChar" has no effect in "CREATE TABLE AS SELECT" statement
  • HIVE-6679 - HiveServer2 should support configuring the server-side socket timeout and keepalive for various transport types where applicable
  • HIVE-10732 - Hive JDBC driver does not close operation for metadata queries
  • HIVE-7027 - Hive job fails when referencing a view that explodes an array
  • IMPALA-1774 - Allow querying Parquet tables with complex-typed columns as long as those columns are not selected
  • IMPALA-1919 - Avoid calling ProcessBatch with out_batch->AtCapacity in right joins
  • IMPALA-2002 - Provide way to cache ext data source classes
  • IMPALA-1726 - Move JNI / Thrift utilities to separate header
  • HUE-2813 - [hive] Report when Hue server is down when trying to execute a query
  • HUE-2243 - [metastore] Listing tables can be very slow
  • OOZIE-1944 - Recursive variable resolution broken when same parameter name in config-default and action conf
  • PIG-4053 - TestMRCompiler succeeded with sun jdk 1.6 while failed with sun jdk 1.7
  • SENTRY-721 - HDFS Cascading permissions not applied to child file ACLs if a direct grant exists
  • SENTRY-699 - Memory leak when running Sentry w/ HiveServer2
  • SOLR-6146 - Leak in CloudSolrServer causing "Too many open files"
  • SOLR-7503 - Recovery after ZK session expiration happens in a single thread for all cores in a node

Issues Fixed in CDH 5.3.5

Potential job failures during YARN rolling upgrades to CDH 5.3.4

Problem: A MapReduce security fix introduced a compatibility issue that results in job failures during YARN rolling upgrades from CDH 5.3.3 to CDH 5.3.4.

Release affected: CDH 5.3.4

Release containing the fix: CDH 5.3.5

Workarounds: You can use any one of the following workarounds for this issue:
  • Upgrade to CDH 5.3.5.
  • Restart any jobs that might have failed during the upgrade.
  • Explicitly set the version of MapReduce to be used so that it is selected on a per-job basis.
    1. Update the YARN property, MR Application Classpath (mapreduce.application.classpath), either in Cloudera Manager or in the mapred-site.xml file. Remove all existing values and add a new entry: <parcel-path>/lib/hadoop-mapreduce/*, where <parcel-path> is the absolute path to the parcel installation. For example, the default installation path for the CDH 5.3.3 parcel would be: /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/hadoop-mapreduce/*.
    2. Wait until jobs submitted with the above client configuration change have run to completion.
    3. Upgrade to CDH 5.3.4.
    4. Update the MR Application Classpath (mapreduce.application.classpath) property to point to the new CDH 5.3.4 parcel.

      Do not delete the old parcel until after all jobs submitted prior to the upgrade have finished running.
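
As an illustration of step 1, the mapred-site.xml entry might look like the following fragment. This is a minimal sketch that assumes the default CDH 5.3.3 parcel installation path quoted above; replace the value with the path to your own parcel, and remove any other existing values for the property.

```xml
<!-- mapred-site.xml: pin MapReduce to the CDH 5.3.3 parcel before upgrading.
     The path below is the default-install example from step 1 and is an
     assumption about your layout; substitute your own parcel path. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>/opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/hadoop-mapreduce/*</value>
</property>
```

After the upgrade (step 4), the same property would point at the corresponding directory of the CDH 5.3.4 parcel instead.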

Upstream Issues Fixed

The following upstream issue has been fixed in CDH 5.3.5:

  • YARN-3811 - NodeManager restarts could lead to application failures

Issues Fixed in CDH 5.3.4

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.4:

  • HDFS-7980 - Incremental BlockReport will dramatically slow down the startup of a namenode
  • HDFS-8380 - Always call addStoredBlock on blocks which have been shifted from one storage to another
  • HDFS-7645 - Rolling upgrade is restoring blocks from trash multiple times
  • HDFS-7869 - Inconsistency in the return information while performing rolling upgrade
  • HDFS-7340 - make rollingUpgrade start/finalize idempotent
  • HDFS-7312 - Update DistCp v1 to optionally not use tmp location (branch-1 only)
  • HDFS-7530 - Allow renaming of encryption zone roots
  • HDFS-7587 - Edit log corruption can happen if append fails with a quota violation
  • YARN-3485 - FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies
  • YARN-3491 - PublicLocalizer#addResource is too slow.
  • YARN-3021 - YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
  • YARN-3241 - FairScheduler handles "invalid" queue names inconsistently
  • YARN-3022 - Expose Container resource information from NodeManager for monitoring
  • YARN-2984 - Metrics for container's actual memory usage
  • YARN-3465 - Use LinkedHashMap to preserve order of resource requests
  • MAPREDUCE-6339 - Job history file is not flushed correctly because isTimerActive flag is not set true when flushTimerTask is scheduled.
  • MAPREDUCE-5710 - Backport MAPREDUCE-1305 to branch-1
  • MAPREDUCE-6238 - MR2 can't run local jobs with -libjars command options which is a regression from MR1
  • MAPREDUCE-6076 - Zero map split input length combine with none zero map split input length may cause MR1 job hung sometimes.
  • HBASE-13374 - Small scanners (with particular configurations) do not return all rows
  • HBASE-13269 - Limit result array preallocation to avoid OOME with large scan caching values
  • HBASE-13422 - remove use of StandardCharsets in 0.98
  • HBASE-13335 - Update ClientSmallScanner and ClientSmallReversedScanner
  • HBASE-13262 - ResultScanner doesn't return all rows in Scan
  • HIVE-10646 - ColumnValue does not handle NULL_TYPE
  • HIVE-10453 - HS2 leaking open file descriptors when using UDFs
  • HIVE-9655 - Dynamic partition table insertion error
  • HIVE-10452 - Followup fix for HIVE-10202 to restrict it for script mode.
  • HIVE-10312 - SASL.QOP in JDBC URL is ignored for Delegation token Authentication
  • HIVE-10202 - Beeline outputs prompt+query on standard output when used in non-interactive mode
  • HIVE-10087 - Beeline's --silent option should suppress query from being echoed when running with -f option
  • HIVE-10085 - Lateral view on top of a view throws RuntimeException
  • HIVE-2828 - make timestamp accessible in the hbase KeyValue
  • HUE-2741 - [home] Hide the document move dialog
  • HUE-2732 - Hue isn't correctly doing add_column migrations with non-blank defaults
  • HUE-2513 - [fb] File list column sorting is broken
  • IMPALA-1519 - Fix wrapping of exprs via a TupleIsNullPredicate with analytics
  • IMPALA-1952 - Expand parsing of decimals to include scientific notation
  • IMPALA-1860 - INSERT/CTAS evaluates and applies constant predicates.
  • IMPALA-1900 - Assign predicates below analytic functions with a compatible partition by clause
  • IMPALA-1376 - Split up Planner into multiple classes.
  • IMPALA-1888 - FIRST_VALUE may produce incorrect results with preceding windows
  • IMPALA-1559 - FIRST_VALUE rewrite fn type might not match slot type
  • IMPALA-1808 - AnalyticEvalNode cannot handle partition/order by exprs with NaN
  • IMPALA-1562 - AnalyticEvalNode not properly handling nullable tuples
  • OOZIE-2063 - Cron syntax creates duplicate actions
  • OOZIE-2218 - META-INF directories in the war file have 777 permissions
  • OOZIE-1878 - Can't execute dryrun on the CLI
  • SENTRY-696 - Improve Metastoreplugin Cache Initialization time
  • SENTRY-703 - Calls to add_partition fail when passed a Partition object with a null location
  • SENTRY-408 - The URI permission should support more filesystem prefixes
  • SOLR-7478 - UpdateLog#close shuts down its executor with interrupts before running close, preventing a clean close.
  • SOLR-7437 - Make HDFS transaction log replication factor configurable.
  • SOLR-7338 - A reloaded core will never register itself as active after a ZK session expiration
  • SOLR-7370 - FSHDFSUtils#recoverFileLease tries to recover the lease every one second after the first four second wait.
  • SPARK-6578 - Outbound channel in network library is not thread-safe, can lead to fetch failures
  • SQOOP-2343 - AsyncSqlRecordWriter stucks if any exception is thrown out in its close method
  • SQOOP-2286 - Ensure Sqoop generates valid avro column names
  • SQOOP-2283 - Support usage of --exec and --password-alias
  • SQOOP-2281 - Set overwrite on kite dataset
  • SQOOP-2282 - Add validation check for --hive-import and --append
  • SQOOP-2257 - Parquet target for imports with Hive overwrite option does not work
  • ZOOKEEPER-2146 - BinaryInputArchive readString should check length before allocating memory
  • ZOOKEEPER-2149 - Logging of client address when socket connection established

Published Known Issues Fixed

As a result of the above fixes, the following issues, previously published as Known Issues in CDH 5, are also fixed.

Executing oozie job -config properties file -dryrun fails because of a code defect in argument parsing

Bug: OOZIE-1878

Severity: Low

Workaround: None.

Issues Fixed in CDH 5.3.3

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.3:

  • HADOOP-11722 - Some Instances of Services using ZKDelegationTokenSecretManager go down when old token cannot be deleted
  • HADOOP-11469 - KMS should skip default.key.acl and whitelist.key.acl when loading key acl
  • HADOOP-11710 - Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
  • HADOOP-11674 - oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
  • HADOOP-11445 - Bzip2Codec: Data block is skipped when position of newly created stream is equal to start of split
  • HADOOP-11620 - Add support for load balancing across a group of KMS for HA
  • HDFS-6830 - BlockInfo.addStorage fails when DN changes the storage for a block replica
  • HDFS-7961 - Trigger full block report after hot swapping disk
  • HDFS-7960 - The full block report should prune zombie storages even if they're not empty
  • HDFS-7575 - Upgrade should generate a unique storage ID for each volume
  • HDFS-7596 - NameNode should prune dead storages from storageMap
  • HDFS-7579 - Improve log reporting during block report rpc failure
  • HDFS-7208 - NN doesn't schedule replication when a DN storage fails
  • HDFS-6899 - Allow changing MiniDFSCluster volumes per DN and capacity per volume
  • HDFS-6878 - Change MiniDFSCluster to support StorageType configuration for individual directories
  • HDFS-6678 - MiniDFSCluster may still be partially running after initialization fails.
  • YARN-3351 - AppMaster tracking URL is broken in HA
  • YARN-3242 - Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving events for old client
  • YARN-2865 - Application recovery continuously fails with "Application with id already present. Cannot duplicate"
  • MAPREDUCE-6275 - Race condition in FileOutputCommitter v2 for user-specified task output subdirs
  • MAPREDUCE-4815 - Speed up FileOutputCommitter#commitJob for many output files
  • HBASE-13131 - ReplicationAdmin leaks connections if there's an error in the constructor
  • HIVE-10086 - Hive throws error when accessing Parquet file schema using field name match
  • HIVE-10098 - HS2 local task for map join fails in KMS encrypted cluster
  • HIVE-7426 - ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
  • HIVE-7737 - Hive logs full exception for table not found
  • HIVE-9749 - ObjectStore schema verification logic is incorrect
  • HIVE-9788 - Make double quote optional in tsv/csv/dsv output
  • HIVE-9755 - Hive built-in "ngram" UDAF fails when a mapper has no matches.
  • HIVE-9770 - Beeline ignores --showHeader for non-tabular output formats, i.e. csv, tsv, dsv
  • HIVE-8688 - serialized plan OutputStream is not being closed
  • HIVE-9716 - Map job fails when table's LOCATION does not have scheme
  • HIVE-5857 - Reduce tasks do not work in uber mode in YARN
  • HIVE-8938 - Compiler should save the transform URI as input entity
  • HUE-2569 - [home] Delete project is broken
  • HUE-2529 - Increase the character limit of 'Name' Textfield in Useradmin Ldap Sync Groups
  • HUE-2506 - [search] Marker map does not display with HTML widget
  • HUE-1663 - [core] Option to either follow or not LDAP referrals for auth
  • HUE-2198 - [core] Reduce noise such as "handle_other(): Mutual authentication unavailable on 200 response"
  • SENTRY-683 - HDFS service client should ensure the kerberos ticket validity before new service connection
  • SENTRY-654 - Calls to append_partition fail when Sentry is enabled
  • SENTRY-664 - After Namenode is restarted, Path updates remain unsynched
  • SENTRY-665 - PathsUpdate.parsePath needs to handle special characters
  • SENTRY-652 - Sentry fails to parse spaces when HDFS ACL sync enabled
  • SOLR-7092 - Stop the HDFS lease recovery retries on HdfsTransactionLog on close and try to avoid lease recovery on closed files.
  • SOLR-7141 - RecoveryStrategy: Raise time that we wait for any updates from the leader before they saw the recovery state to have finished.
  • SOLR-7113 - Multiple calls to UpdateLog#init is not thread safe with respect to the HDFS FileSystem client object usage.
  • SOLR-7134 - Replication can still cause index corruption.
  • SQOOP-1764 - Numeric Overflow when getting extent map
  • IMPALA-1658 - Add compatibility flag for Hive-Parquet-Timestamps
  • IMPALA-1820 - Start with small pages for hash tables during repartitioning
  • IMPALA-1897 - Fixes for old hash join and agg
  • IMPALA-1894 - Fix old aggregation node hash table cleanup
  • IMPALA-1863 - Avoid deadlock across fragment instances
  • IMPALA-1915 - Fix query hang in BufferedBlockMgr:FindBlock()
  • IMPALA-1890 - Fixing a race between ~BufferedBlockMgr() and the WriteComplete() call
  • IMPALA-1738 - Use snprintf() instead of lexical_cast() in float-to-string casts
  • IMPALA-1865 - Fix partition spilling cleanup when new stream OOMs
  • IMPALA-1835 - Keep the fragment alive for TransmitData()
  • IMPALA-1805 - Impala's ACLs check do not consider all group ACLs, only checked first one.
  • IMPALA-1794 - Fix infinite loop opening or closing file with invalid metadata
  • IMPALA-1801 - external-data-source-executor leaking global jni refs
  • IMPALA-1712 - Unexpected remote bytes read counter was not being reset properly
  • IMPALA-1636 - Generalize index-based partition pruning to allow constant expressions

Published Known Issues Fixed

As a result of the above fixes, the following issues, previously published as Known Issues in CDH 5, are also fixed.

After upgrade from a release earlier than CDH 5.2.0, storage IDs may no longer be unique

As of CDH 5.2, each storage volume on a DataNode should have its own unique storageID. In clusters upgraded from CDH 4, or from CDH 5 releases earlier than CDH 5.2.0, however, every volume on a given DataNode shares the same storageID, because the HDFS upgrade does not update the IDs to reflect the new naming scheme. This causes problems with load balancing. The problem affects only clusters upgraded from CDH 5.1.x and earlier to CDH 5.2 or later; clusters newly installed with CDH 5.2.0 or later are not affected.

Bug: HDFS-7575

Severity: Medium

Workaround: Upgrade to a later or patched version of CDH.

Issues Fixed in CDH 5.3.2

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.2:

  • AVRO-1630 - Creating Builder from instance loses data
  • AVRO-1628 - Add Schema.createUnion(Schema... type)
  • AVRO-1539 - Add FileSystem-based FsInput Constructor
  • AVRO-1623 - GenericData#validate() of enum: IndexOutOfBoundsException
  • AVRO-1614 - Always getting a value...
  • AVRO-1592 - Java keyword as an enum constant in Avro schema file causes deserialization to fail.
  • AVRO-1619 - Generate better JavaDoc
  • AVRO-1622 - Add missing license headers
  • AVRO-1604 - ReflectData.AllowNull fails to generate schemas when @Nullable is present.
  • AVRO-1407 - NettyTransceiver can cause a infinite loop when slow to connect
  • AVRO-834 - Data File corruption recovery tool
  • AVRO-1596 - Cannot read past corrupted block in Avro data file
  • HADOOP-11350 - The size of header buffer of HttpServer is too small when HTTPS is enabled
  • HDFS-7707 - Edit log corruption due to delayed block removal again
  • HDFS-7718 - Store KeyProvider in ClientContext to avoid leaking key provider threads when using FileContext
  • HDFS-6425 - Large postponedMisreplicatedBlocks has impact on blockReport latency
  • HDFS-7560 - ACLs removed by removeDefaultAcl() will be back after NameNode restart/failover
  • HDFS-7513 - HDFS inotify: add defaultBlockSize to CreateEvent
  • HDFS-7158 - Reduce the memory usage of WebImageViewer
  • HDFS-7497 - Inconsistent report of decommissioning DataNodes between dfsadmin and NameNode webui
  • HDFS-6917 - Add an hdfs debug command to validate blocks, call recoverlease, etc.
  • HDFS-6779 - Add missing version subcommand for hdfs
  • YARN-2697 - RMAuthenticationHandler is no longer useful
  • YARN-2656 - RM web services authentication filter should add support for proxy user
  • YARN-3082 - Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
  • YARN-3079 - Scheduler should also update maximumAllocation when updateNodeResource.
  • YARN-2992 - ZKRMStateStore crashes due to session expiry
  • YARN-2675 - containersKilled metrics is not updated when the container is killed during localization
  • YARN-2715 - Proxy user is problem for RPC interface if yarn.resourcemanager.webapp.proxyuser is not set
  • MAPREDUCE-6198 - NPE from JobTracker#resolveAndAddToTopology in MR1 cause initJob and heartbeat failure.
  • MAPREDUCE-6196 - Fix BigDecimal ArithmeticException in PiEstimator
  • HBASE-12540 - TestRegionServerMetrics#testMobMetrics test failure
  • HBASE-12533 - staging directories are not deleted after secure bulk load
  • HBASE-12077 - FilterLists create many ArrayList$Itr objects per row.
  • HBASE-12386 - Replication gets stuck following a transient zookeeper error to remote peer cluster
  • HBASE-11979 - Compaction progress reporting is wrong
  • HBASE-12445 - hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
  • HIVE-7647 - Beeline does not honor --headerInterval and --color when executing with "-e"
  • HIVE-7733 - Ambiguous column reference error on query
  • HIVE-9303 - Parquet files are written with incorrect definition levels
  • HIVE-8444 - update pom to junit 4.11
  • HIVE-9474 - truncate table changes permissions on the target
  • HIVE-9462 - HIVE-8577 breaks type evolution
  • HIVE-9482 - Hive parquet timestamp compatibility
  • HIVE-6308 - COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
  • HIVE-9502 - Parquet cannot read Map types from files written with Hive 0.12 or earlier
  • HIVE-9445 - Revert HIVE-5700 - enforce single date format for partition column storage
  • HIVE-9393 - reduce noisy log level of ColumnarSerDe.java:116 from INFO to DEBUG
  • HIVE-7800 - Parquet Column Index Access Schema Size Checking
  • HIVE-9330 - DummyTxnManager will throw NPE if WriteEntity writeType has not been set
  • HIVE-9265 - Hive with encryption throws NPE to fs path without schema
  • HIVE-9199 - Excessive exclusive lock used in some DDLs with DummyTxnManager
  • HIVE-6978 - beeline always exits with 0 status, should exit with non-zero status on error
  • HUE-2556 - [core] Cannot update project tags of a document
  • HUE-2528 - Partitions limit gets capped to 1000 despite configuration
  • HUE-2548 - [metastore] Create table then load data does redirect to the table page
  • HUE-2525 - [core] Fix manual install of samples
  • HUE-2501 - [metastore] Creating a table with header files bigger than 64MB truncates it
  • HUE-2484 - [beeswax] Configure support for Hive Server2 LDAP authentication
  • HUE-2532 - [search] Fix share URL on Internet Explorer
  • HUE-2531 - [impala] Autogrow missing result list
  • HUE-2524 - [impala] Sort numerically recent queries tab
  • HUE-2495 - [oozie] Improve dashboards sorting mechanism
  • HUE-2511 - [impala] Infinite scroll keeps fetching results even if finished
  • HUE-2102 - [oozie] Workflow with credentials can't be used with Coordinator
  • HUE-2152 - [pig] Credentials support in editor
  • OOZIE-2131 - Add flag to sqoop action to skip hbase delegation token generation
  • OOZIE-2047 - Oozie does not support Hive tables that use datatypes introduced since Hive 0.8
  • OOZIE-2102 - Streaming actions are broken cause of incorrect method signature
  • PARQUET-173 - StatisticsFilter doesn't handle And properly
  • PARQUET-157 - Divide by zero in logging code
  • PARQUET-142 - parquet-tools doesn't filter _SUCCESS file
  • PARQUET-124 - parquet.hadoop.ParquetOutputCommitter.commitJob() throws parquet.io.ParquetEncodingException
  • PARQUET-136 - NPE thrown in StatisticsFilter when all values in a string/binary column trunk are null
  • PARQUET-168 - Wrong command line option description in parquet-tools
  • PARQUET-145 - InternalParquetRecordReader.close() should not throw an exception if initialization has failed
  • PARQUET-140 - Allow clients to control the GenericData object that is used to read Avro records
  • SOLR-7033 - RecoveryStrategy should not publish any state when closed / cancelled.
  • SOLR-5961 - Solr gets crazy on /overseer/queue state change
  • SOLR-6640 - Replication can cause index corruption
  • SOLR-5875 - QueryComponent.mergeIds() unmarshals all docs' sort field values once per doc instead of once per shard
  • SOLR-6919 - Log REST info before executing
  • SOLR-6969 - When opening an HDFSTransactionLog for append we must first attempt to recover its lease to prevent data loss.
  • SOLR-5515 - NPE when getting stats on date field with empty result on solrcloud
  • SPARK-3778 - newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn
  • SPARK-4835 - Streaming saveAs*HadoopFiles() methods may throw FileAlreadyExistsException during checkpoint recovery
  • SQOOP-2057 - Skip delegation token generation flag during hbase import
  • SQOOP-1779 - Add support for --hive-database when importing Parquet files into Hive
  • IMPALA-1622 - Fix overflow in StringParser::StringToFloatInternal()
  • IMPALA-1614 - Compute stats fails if table name starts with number
  • IMPALA-1623 - unix_timestamp() does not return correct time
  • IMPALA-1535 - Partition pruning with NULL
  • IMPALA-1606 - Impala does not always give short name to Llama
  • IMPALA-1120 - Fetch column statistics using Hive 0.13 bulk API

In addition, CDH 5.3.2 reverts YARN-2713, which has caused problems since its inclusion in CDH 5.3.0.

Published Known Issues Fixed

As a result of the above fixes, the following issues, previously published as Known Issues in CDH 5, are also fixed.

Hive does not support Parquet schema evolution

Adding a new column to a Parquet table causes queries on that table to fail with a column not found error.

Bug: HIVE-7800

Severity: Medium

Workaround: Use Impala instead; Impala handles Parquet schema evolution correctly.

Issues Fixed in CDH 5.3.1

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.3.1:

  • YARN-2975 - FSLeafQueue app lists are accessed without required locks
  • YARN-2010 - Handle app-recovery failures gracefully
  • YARN-3027 - Scheduler should use totalAvailable resource from node instead of availableResource for maxAllocation
  • HIVE-9445 - Revert HIVE-5700 - enforce single date format for partition column storage
  • IMPALA-1668 - TSaslServerTransport::Factory::getTransport() leaks transport map entries
  • IMPALA-1674 - IMPALA-1556 causes memory leak with secure connections

Published Known Issues Fixed

As a result of the above fixes, the following issues, previously published as Known Issues in CDH 5, are also fixed.

Upgrading a PostgreSQL Hive Metastore from Hive 0.12 to Hive 0.13 may result in a corrupt metastore

HIVE-5700 introduced a serious bug into the Hive Metastore upgrade scripts. This bug affects users who have a PostgreSQL Hive Metastore and at least one table that is partitioned by date, with the partition value stored as a date type (not a string).

Bug: HIVE-5700

Severity: High

Workaround: None. Do not upgrade your PostgreSQL metastore to version 0.13 if it meets the condition described above.

Issues Fixed in CDH 5.3.0

The following topics describe known issues fixed in CDH 5.3.0.

Apache Hadoop

HDFS

Kerberos re-login attempts fail when using JDK 1.7.0_80

On clusters using JDK 1.7.0_80, long-running HDFS clients cannot re-authenticate using Kerberos after their ticket expires. Because of this authentication failure, any jobs triggered from these clients fail.

Releases Affected: CDH 5.1, 5.2

Bug: HADOOP-10786

Workaround: Upgrade to CDH 5.3.2 (or higher).

NameNode - KMS communication fails after long periods of inactivity

Encrypted files and encryption zones cannot be created if a long period of time (by default, 20 hours) has passed since the last time the KMS and NameNode communicated.

Bug: HADOOP-11187

Workaround: There are two possible workarounds to this issue:
  • You can increase the KMS authentication token validity period to a very high number. Since the default value is 10 hours, this bug will only be encountered after 20 hours of no communication between the NameNode and the KMS. Add the following property to the kms-site.xml Safety Valve:
    <property>
      <name>hadoop.kms.authentication.token.validity</name>
      <value>SOME VERY HIGH NUMBER</value>
    </property>
  • You can switch the KMS signature secret provider to the string secret provider by adding the following property to the kms-site.xml Safety Valve:
    <property>
      <name>hadoop.kms.authentication.signature.secret</name>
      <value>SOME VERY SECRET STRING</value>
    </property>

DataNodes may become unresponsive to block creation requests

In releases earlier than CDH 5.2.3, DataNodes may become unresponsive to block creation requests from clients when the directory scanner is running.

Bug: HDFS-7489

Workaround: Upgrade to CDH 5.2.3 or later.

Apache Hive

UDF translate() does not accept arguments of type CHAR or VARCHAR

Bug: HIVE-6622

Workaround: Cast the argument to type STRING.

Hive's Timestamp type cannot be stored in Parquet

Tables containing timestamp columns cannot use Parquet as the storage format.

Bug: HIVE-6394

Workaround: Use a different file format.

Apache Spark

Spark sort-based shuffle is affected by a kernel bug

Spark sort-based shuffle is affected by a kernel bug (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2cb4b05e7647891b46b91c07c9a60304803d1688). The kernel bug was fixed in RHEL/CentOS 6.2.

Bug: SPARK-3948