Issues Fixed in CDH 5.5.x

Issues Fixed in CDH 5.5.6

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.5.6:

  • FLUME-2797 - Use SourceCounter for SyslogTcpSource
  • FLUME-2844 - SpillableMemoryChannel must start ChannelCounter
  • HADOOP-10300 - Allow deferred sending of call responses
  • HADOOP-11031 - Design Document for Credential Provider API
  • HADOOP-12453 - Support decoding KMS Delegation Token with its own Identifier
  • HADOOP-12483 - Maintain wrapped SASL ordering for postponed IPC responses
  • HADOOP-12537 - S3A to support Amazon STS temporary credentials
  • HADOOP-12548 - Read s3a credentials from a Credential Provider
  • HADOOP-12609 - Fix intermittent failure of TestDecayRpcScheduler.
  • HADOOP-12723 - S3A: Add ability to plug in any AWSCredentialsProvider
  • HADOOP-12749 - Create a threadpoolexecutor that overrides afterExecute to log uncaught exceptions/errors
  • HADOOP-13034 - Log message about input options in distcp lacks some items
  • HADOOP-13317 - Add logs to KMS server-side to improve supportability
  • HADOOP-13353 - LdapGroupsMapping getPassword should not return null when IOException throws
  • HADOOP-13487 - Hadoop KMS should load old delegation tokens from ZooKeeper on startup
  • HADOOP-13526 - Add detailed logging in KMS for the authentication failure of proxy user
  • HADOOP-13558 - UserGroupInformation created from a Subject incorrectly tries to renew the Kerberos ticket
  • HADOOP-13638 - KMS should set UGI's Configuration object properly
  • HADOOP-13669 - KMS Server should log exceptions before throwing
  • HADOOP-13693 - Remove the message about HTTP OPTIONS in SPNEGO initialization message from kms audit log.
  • HDFS-4176 - EditLogTailer should call rollEdits with a timeout.
  • HDFS-6962 - ACLs inheritance conflict with umaskmode
  • HDFS-7210 - Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient
  • HDFS-7413 - Some unit tests should use NameNodeProtocols instead of FSNameSystem
  • HDFS-7415 - Move FSNameSystem.resolvePath() to FSDirectory
  • HDFS-7420 - Delegate permission checks to FSDirectory
  • HDFS-7463 - Simplify FSNamesystem#getBlockLocationsUpdateTimes
  • HDFS-7478 - Move org.apache.hadoop.hdfs.server.namenode.NNConf to FSNamesystem
  • HDFS-7517 - Remove redundant non-null checks in FSNamesystem#getBlockLocations
  • HDFS-7964 - Add support for async edit logging
  • HDFS-8224 - Schedule a block for scanning if its metadata file is corrupt
  • HDFS-8269 - getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
  • HDFS-8709 - Clarify automatic sync in FSEditLog#logEdit
  • HDFS-9106 - Transfer failure during pipeline recovery causes permanent write failures
  • HDFS-9290 - DFSClient#callAppend() is not backward compatible for slightly older NameNodes
  • HDFS-9428 - Fix intermittent failure of TestDNFencing.testQueueingWithAppend
  • HDFS-9549 - TestCacheDirectives#testExceedsCapacity is unreliable
  • HDFS-9630 - DistCp minor refactoring and clean up
  • HDFS-9638 - Improve DistCp Help and documentation
  • HDFS-9764 - DistCp does not print value for several arguments including -numListstatusThreads.
  • HDFS-9781 - FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
  • HDFS-9820 - Improve distcp to support efficient restore to an earlier snapshot
  • HDFS-9906 - Remove excessive unnecessary log output when a DataNode is restarted
  • HDFS-9958 - BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages
  • HDFS-10178 - Permanent write failures can happen if pipeline recoveries occur for the first packet
  • HDFS-10216 - Distcp -diff throws exception when handling relative path
  • HDFS-10270 - TestJMXGet:testNameNode() fails
  • HDFS-10271 - Extra bytes are getting released from reservedSpace for append
  • HDFS-10298 - Document the usage of distcp -diff option
  • HDFS-10313 - Distcp needs to enforce the order of snapshot names passed to -diff
  • HDFS-10397 - Distcp should ignore -delete option if -diff option is provided instead of exiting
  • HDFS-10457 - DataNode should not auto-format block pool directory if VERSION is missing.
  • HDFS-10525 - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
  • HDFS-10556 - DistCpOptions should be validated automatically
  • HDFS-10609 - Uncaught InvalidEncryptionKeyException during pipeline recovery can abort downstream applications
  • HDFS-10722 - Fix race condition in TestEditLog#testBatchedSyncWithClosedLogs
  • HDFS-10760 - DataXceiver#run() should not log InvalidToken exception as an error
  • HDFS-10822 - Log DataNodes in the write pipeline
  • HDFS-10879 - TestEncryptionZonesWithKMS#testReadWrite fails intermittently
  • HDFS-10962 - TestRequestHedgingProxyProvider is unreliable
  • HDFS-11012 - Unnecessary INFO logging on DFSClients for InvalidToken
  • HDFS-11040 - Add documentation for HDFS-9820 distcp improvement
  • HDFS-11056 - Concurrent append and read operations lead to checksum error
  • MAPREDUCE-6359 - In RM HA setup, Cluster tab links populated with AM hostname instead of RM
  • MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization
  • MAPREDUCE-6635 - Unsafe long to int conversion in UncompressedSplitLineReader and IndexOutOfBoundsException
  • MAPREDUCE-6680 - JHS UserLogDir scan algorithm can sometimes skip a directory with updates in CloudFS (Azure FileSystem, S3, etc.)
  • MAPREDUCE-6684 - High contention on scanning of user directory under immediate_done in Job History Server
  • MAPREDUCE-6761 - Regression when handling providers - invalid configuration ServiceConfiguration causes Cluster initialization failure
  • MAPREDUCE-6771 - RMContainerAllocator sends container diagnostics event after corresponding completion event
  • YARN-3495 - Confusing log generated by FairScheduler
  • YARN-4004 - container-executor should print output of docker logs if the docker container exits with non-0 exit status
  • YARN-4017 - container-executor overuses PATH_MAX
  • YARN-4245 - Generalize config file handling in container-executor
  • YARN-4255 - container-executor does not clean up Docker operation command files
  • YARN-5608 - TestAMRMClient.setup() fails with ArrayOutOfBoundsException
  • YARN-5704 - Provide config knobs to control enabling/disabling new/work in progress features in container-executor
  • HBASE-13330 - Region left unassigned due to AM & SSH each thinking the assignment would be done by the other
  • HBASE-14241 - Fix deadlock during cluster shutdown due to concurrent connection close
  • HBASE-14313 - After a Connection sees ConnectionClosingException on a connection it never recovers
  • HBASE-14407 - NotServingRegion: hbase region closed forever
  • HBASE-14449 - Rewrite deadlock prevention for concurrent connection close
  • HBASE-14474 - Addendum closes connection in writeRequest() outside synchronized block
  • HBASE-14474 - DeadLock in RpcClientImpl.Connection.close()
  • HBASE-14578 - URISyntaxException during snapshot restore for table with user defined namespace
  • HBASE-14968 - ConcurrentModificationException in region close resulting in the region staying in closing state
  • HBASE-15430 - Failed taking snapshot - Manifest proto-message too large
  • HBASE-15856 - Addendum Fix UnknownHostException import in MetaTableLocator
  • HBASE-15856 - Do not cache unresolved addresses for connections
  • HBASE-16350 - Undo server abort from HBASE-14968
  • HBASE-16360 - TableMapReduceUtil addHBaseDependencyJars has the wrong class name for PrefixTreeCodec
  • HBASE-16767 - Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions
  • HIVE-10384 - Backport: RetryingMetaStoreClient does not retry wrapped TTransportExceptions
  • HIVE-10728 - Deprecate unix_timestamp(void) and make it deterministic
  • HIVE-11579 - Invoking the set command closes standard error output [beeline-cli]
  • HIVE-11768 - java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
  • HIVE-11901 - StorageBasedAuthorizationProvider requires write permission on table for SELECT statements
  • HIVE-12475 - Parquet schema evolution within array<struct<>> does not work
  • HIVE-12891 - Hive fails when java.io.tmpdir is set to a relative location
  • HIVE-13058 - Add session and operation_log directory deletion messages
  • HIVE-13090 - Hive metastore crashes on NPE with ZooKeeperTokenStore
  • HIVE-13129 - CliService leaks HMS connection
  • HIVE-13198 - Authorization issues with cascading views
  • HIVE-13237 - Select parquet struct field with upper case throws NPE
  • HIVE-13429 - Tool to remove dangling scratch directory
  • HIVE-13997 - Insert overwrite directory does not overwrite existing files
  • HIVE-14296 - Session count is not decremented when HS2 clients do not shutdown cleanly
  • HIVE-14421 - FS.deleteOnExit holds references to _tmp_space.db files
  • HIVE-14436 - Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine
  • HIVE-14457 - Partitions in encryption zone are still trashed though an exception is returned
  • HIVE-14743 - ArrayIndexOutOfBoundsException - HBASE-backed views' query with JOINs
  • HIVE-14762 - Add logging while removing scratch space
  • HIVE-14805 - Subquery inside a view will have the object in the subquery as the direct input
  • HIVE-14817 - Shut down the SessionManager timeoutChecker thread properly upon shutdown
  • HIVE-15090 - Temporary DB failure can stop ExpiredTokenRemover thread
  • HUE-4804 - [search] Download function of HTML widget breaks the display
  • HUE-4968 - [oozie] Remove access to /oozie/import_wokflow when v2 is enabled
  • IMPALA-1928 - Fix Thrift client transport wrapping order
  • IMPALA-3369 - Add ALTER TABLE SET COLUMN STATS statement
  • IMPALA-3378 - HiveUdfCall::Open() produces unsynchronized access to JniUtil::global_refs_ vector
  • IMPALA-3379 - HBaseTableWriter::CreatePutList() produces unsynchronized access to JniUtil::global_refs_ vector
  • IMPALA-3441 - Check for malformed Avro data
  • IMPALA-3499 - Split catalog update
  • IMPALA-3575 - Add retry to backend connection request and rpc timeout
  • IMPALA-3633 - Cancel fragment if coordinator is gone
  • IMPALA-3682 - Do not retry unrecoverable socket creation errors
  • IMPALA-3687 - Prefer Avro field name during schema reconciliation
  • IMPALA-3698 - Fix Isilon permissions test
  • IMPALA-3711 - Remove unnecessary privilege checks in getDbsMetadata()
  • IMPALA-3732 - Handle string length overflow in Avro files
  • IMPALA-3751 - Fix clang build errors and warnings
  • IMPALA-3915 - Register privilege and audit requests when analyzing resolved table refs.
  • IMPALA-4135 - Thrift threaded server times-out connections during high load
  • IMPALA-4153 - Return valid non-NULL pointer for 0-byte allocations
  • OOZIE-1814 - Oozie should mask any passwords in logs and REST interfaces
  • OOZIE-2068 - Configuration as part of sharelib
  • OOZIE-2347 - Remove unnecessary new Configuration()/new jobConf() calls from Oozie
  • OOZIE-2555 - Oozie SSL enable setup does not return port for admin -servers
  • OOZIE-2567 - HCat connection is not closed while getting hcat cred
  • OOZIE-2589 - CompletedActionXCommand is hardcoded to wrong priority
  • OOZIE-2649 - Cannot override sub-workflow configuration property if defined in parent workflow XML
  • PIG-3569 - SUM function for BigDecimal and BigInteger
  • PIG-3818 - PIG-2499 is accidentally reverted
  • SENTRY-1095 - Insert into requires URI privilege on partition location under table.
  • SOLR-7280 - Backport: Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts
  • SOLR-8407 - Backport
  • SOLR-8586 - Add index fingerprinting and use it in peersync
  • SOLR-8690 - Add solr.disableFingerprint system property
  • SOLR-8691 - Cache index fingerprints per searcher
  • SOLR-9310 - SOLR-9524
  • SPARK-17644 - [CORE] Do not add failedStages when abortStage for fetch failure
  • SQOOP-2387 - Sqoop should support importing from table with column names containing some special character
  • SQOOP-2864 - ClassWriter chokes on column names containing double quotes
  • SQOOP-2880 - Provide argument for overriding temporary directory
  • SQOOP-2884 - Document --temporary-rootdir
  • SQOOP-2906 - Optimization of AvroUtil.toAvroIdentifier
  • SQOOP-2915 - Fixing Oracle related unit tests
  • SQOOP-2920 - Sqoop performance deteriorates significantly on wide datasets; Sqoop 100% on CPU
  • SQOOP-2952 - Fixing bug row key not added into column family using --hbase-bulkload
  • SQOOP-2971 - OraOop does not close connections properly
  • SQOOP-2983 - OraOop export has degraded performance with wide tables
  • SQOOP-2986 - Add validation check for --hive-import and --incremental lastmodified
  • SQOOP-3021 - ClassWriter fails if a column name contains a backslash character
  • SQOOP-3034 - HBase import should fail fast if using anything other than as-textfile

Issues Fixed in CDH 5.5.5

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.5.5:

  • FLUME-2821 - KafkaSourceUtil can log passwords at INFO level; remove logging of security-related data in older releases
  • FLUME-2913 - Don't strip SLF4J from imported classpaths
  • FLUME-2918 - Speed up TaildirSource on directories with many files
  • HADOOP-8436 - NPE In getLocalPathForWrite ( path, conf ) when the required context item is not configured
  • HADOOP-8437 - getLocalPathForWrite should throw IOException for invalid paths
  • HADOOP-8751 - NPE in Token.toString() when Token is constructed using null identifier
  • HADOOP-8934 - Shell command ls should include sort options
  • HADOOP-10048 - LocalDirAllocator should avoid holding locks while accessing the filesystem
  • HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
  • HADOOP-11901 - BytesWritable fails to support 2G chunks due to integer overflow
  • HADOOP-11984 - Enable parallel JUnit tests in pre-commit
  • HADOOP-12252 - LocalDirAllocator should not throw NPE with empty string configuration
  • HADOOP-12259 - Utility for dynamic port allocation
  • HADOOP-12659 - Incorrect usage of config parameters in token manager of KMS
  • HADOOP-12787 - KMS SPNEGO sequence does not work with WEBHDFS
  • HADOOP-12841 - Update s3-related properties in core-default.xml.
  • HADOOP-12901 - Add warning log when KMSClientProvider cannot create a connection to the KMS server.
  • HADOOP-12963 - Allow using path style addressing for accessing the s3 endpoint.
  • HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
  • HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
  • HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
  • HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
  • HADOOP-13255 - KMSClientProvider should check and renew tgt when doing delegation token operations
  • HADOOP-13263 - Reload cached groups in background after expiry.
  • HADOOP-13457 - Remove hardcoded absolute path for shell executable.
  • HDFS-6434 - Default permission for creating file should be 644 for WebHdfs/HttpFS
  • HDFS-7597 - DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping
  • HDFS-8008 - Support client-side back off when the datanodes are congested
  • HDFS-8581 - ContentSummary on / skips further counts on yielding lock
  • HDFS-8829 - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
  • HDFS-8897 - Balancer should handle fs.defaultFS trailing slash in HA
  • HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
  • HDFS-9259 - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.
  • HDFS-9276 - Failed to Update HDFS Delegation Token for long running application in HA mode
  • HDFS-9365 - Balancer does not work with the HDFS-6376 HA setup.
  • HDFS-9405 - Warmup NameNode EDEK caches in background thread
  • HDFS-9466 - TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
  • HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
  • HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
  • HDFS-9805 - Add server-side configuration for enabling TCP_NODELAY for DataTransferProtocol and default it to true
  • HDFS-9939 - Increase DecompressorStream skip buffer size
  • HDFS-10360 - DataNode may format directory and lose blocks if current/VERSION is missing.
  • HDFS-10381 - DataStreamer DataNode exclusion log message should be warning.
  • HDFS-10396 - Using -diff option with DistCp may get "Comparison method violates its general contract" exception
  • HDFS-10481 - HTTPFS server should correctly impersonate as end user to open file
  • HDFS-10512 - VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
  • HDFS-10516 - Fix bug when warming up EDEK cache of more than one encryption zone
  • HDFS-10544 - Balancer doesn't work with IPFailoverProxyProvider.
  • HDFS-10643 - Namenode should use loginUser(hdfs) to generateEncryptedKey
  • MAPREDUCE-6442 - Stack trace is missing when error occurs in client protocol provider's constructor
  • MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization
  • MAPREDUCE-6577 - MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
  • YARN-2605 - [RM HA] Rest api endpoints doing redirect incorrectly.
  • YARN-3055 - Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow
  • YARN-3104 - Fixed RM to not generate new AMRM tokens on every heartbeat between rolling and activation
  • YARN-3832 - Resource Localization fails on a cluster due to existing cache directories
  • YARN-4459 - container-executor should only kill process groups
  • YARN-4784 - Fairscheduler: defaultQueueSchedulingPolicy should not accept FIFO.
  • YARN-5048 - DelegationTokenRenewer#skipTokenRenewal may throw NPE
  • YARN-5272 - Handle queue names consistently in FairScheduler.
  • HBASE-11625 - Verifies data before building HFileBlock.
  • HBASE-14155 - StackOverflowError in reverse scan
  • HBASE-14644 - Region in transition metric is broken
  • HBASE-14730 - Region server needs to log warnings when there are attributes configured for cells with hfile v2
  • HBASE-15439 - getMaximumAllowedTimeBetweenRuns in ScheduledChore ignores the TimeUnit
  • HBASE-15496 - Throw RowTooBigException only for user scan/get
  • HBASE-15707 - ImportTSV bulk output does not support tags with hfile.format.version=3
  • HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
  • HBASE-15791 - Improve javadoc around ScheduledChore
  • HBASE-15811 - Batch Get after batch Put does not fetch all Cells
  • HBASE-15925 - Provide default values for hadoop compat module related properties that match default hadoop profile.
  • HBASE-16207 - Can't restore snapshot without "Admin" permission
  • HBASE-16288 - Revert "HFile intermediate block level indexes might recurse forever creating multi TB files"
  • HBASE-16288 - HFile intermediate block level indexes might recurse forever creating multi TB files
  • HIVE-7443 - Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
  • HIVE-9499 - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
  • HIVE-10685 - Alter table concatenate operator will cause duplicate data
  • HIVE-10925 - Non-static threadlocals in metastore code can potentially cause memory leak
  • HIVE-11031 - ORC concatenation of old files can fail while merging column statistics
  • HIVE-11243 - Changing log level in Utilities.getBaseWork
  • HIVE-11369 - Mapjoins in HiveServer2 fail when jmxremote is used
  • HIVE-11408 - HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used due to constructor caching in Hadoop ReflectionUtils
  • HIVE-11427 - Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079.
  • HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
  • HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
  • HIVE-12481 - Occasionally "Request is a replay" will be thrown from HS2
  • HIVE-12635 - Hive should return the latest hbase cell timestamp as the row timestamp value
  • HIVE-12958 - Make embedded Jetty server more configurable
  • HIVE-13285 - Orc concatenation may drop old files from moving to final path
  • HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
  • HIVE-13527 - Using deprecated APIs in HBase client causes zookeeper connection leaks
  • HIVE-13570 - Some queries with Union all fail when CBO is off
  • HIVE-13590 - Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
  • HIVE-13704 - Don't call DistCp.execute() instead of DistCp.run()
  • HIVE-13736 - View's input/output formats are TEXT by default.
  • HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
  • HIVE-13953 - Issues in HiveLockObject equals method
  • HIVE-13991 - Union All on view fails with no valid permission on the underlying table
  • HIVE-14006 - Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.
  • HIVE-14118 - Make the alter partition exception more meaningful
  • HUE-3520 - [jb] Fix backport error
  • HUE-3520 - [jb] Use impersonation to access JHS if security is enabled
  • HUE-3637 - [sqoop] Avoid decode errors on attribute values
  • HUE-3650 - [beeswax] Notify of caught errors in the watch logs process
  • HUE-3651 - [core] Upgrade Moment.js
  • HUE-3716 - [core] Add gen-py paths to hue.pth
  • HUE-3861 - [core] Upgrade Django Axes to 1.5
  • HUE-3866 - [core] Hue CPU reaches ~100% usage while uploading files with SSL to HTTPFS/WebHDFS
  • HUE-3880 - [core] Add importlib directly for Python 2.6
  • HUE-4005 - [oozie] Remove oozie.coord.application.path from properties when rerunning workflow
  • HUE-4006 - [oozie] Create new deployment directory when coordinator or bundle is copied
  • HUE-4007 - [oozie] Fix deployement_dir for the bundle in oozie example fixtures
  • HUE-4023 - [useradmin] update AuthenticationForm to allow activated users to login
  • HUE-4087 - [jobbrowser] Unable to kill jobs with Resource Manager HA enabled
  • HUE-4202 - [jb] Enable offset param for fetching jobbrowser logs
  • HUE-4215 - [yarn] Reset API_CACHE on logout
  • HUE-4227 - [yarn] Fix unittest for MR API Cache
  • HUE-4238 - [doc2] Ignore history docs in find_jobs_with_no_doc during sync documents
  • HUE-4252 - [core] Handle 307 redirect from YARN upon standby failover
  • HUE-4258 - [jb] Close and pool Spark History Server connections
  • HUE-4333 - [core] Properly reset API_CACHE on failover
  • HUE-4493 - [oozie] Fix sync-workflow action when Workflow includes sub-workflow
  • HUE-4515 - [oozie] Remove oozie.bundle.application.path from properties when rerunning workflow
  • OOZIE-2314 - Unable to kill old instance child job by workflow or coord rerun by Launcher
  • OOZIE-2329 - Make handling yarn restarts configurable
  • OOZIE-2330 - Spark action should take the global jobTracker and nameNode configs by default and allow file and archive elements
  • OOZIE-2345 - Parallel job submission for forked actions
  • OOZIE-2391 - spark-opts value in workflow.xml is not parsed properly
  • OOZIE-2436 - Fork/join workflow fails with oozie.action.yarn.tag must not be null
  • OOZIE-2481 - Add YARN_CONF_DIR in the Shell action
  • OOZIE-2504 - Create a log4j.properties under HADOOP_CONF_DIR in Shell Action
  • OOZIE-2511 - SubWorkflow missing variable set from option if config-default is present in parent workflow
  • OOZIE-2533 - Patch-1550 - workaround
  • OOZIE-2537 - SqoopMain does not set up log4j properly
  • SENTRY-1190 - IMPORT TABLE silently fails if Sentry is enabled
  • SENTRY-1201 - Sentry ignores database prefix for MSCK statement
  • SENTRY-1252 - grantServerPrivilege and revokeServerPrivilege should treat "*" and "ALL" as synonyms when action is not explicitly specified
  • SENTRY-1265 - Sentry service should not require a TGT as it is not talking to other kerberos services as a client
  • SENTRY-1292 - Reorder DBModelAction EnumSet
  • SENTRY-1293 - Avoid converting string permission to Privilege object
  • SOLR-6631 - DistributedQueue spinning on calling zookeeper getChildren()
  • SOLR-6879 - Have an option to disable autoAddReplicas temporarily for all collections.
  • SOLR-7178 - OverseerAutoReplicaFailoverThread compares Integer objects using ==
  • SOLR-8451 - Fix backport
  • SOLR-8497 - Merge indexes should mark its directories as done rather than keep them around in the directory cache.
  • SOLR-8551 - Make collection deletion more robust.
  • SOLR-8683 - Tune down stream closed logging
  • SOLR-9236 - AutoAddReplicas will append an extra /tlog to the update log location on replica failover.
  • SPARK-10577 - [PYSPARK] DataFrame hint for broadcast join
  • SPARK-11442 - Reduce numSlices for local metrics test of SparkListenerSuite
  • SPARK-12087 - [STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
  • SQOOP-2846 - Sqoop Export with update-key failing for avro data file

Issues Fixed in CDH 5.5.4

CDH 5.5.4 fixes the following issues.

Apache Hadoop

FSImage may get corrupted after deleting snapshot

Bug: HDFS-9406

When deleting a snapshot that contains the last record of a given INode, the fsimage may become corrupt because the create list of the snapshot diff in the previous snapshot and the child list of the parent INodeDirectory are not cleaned.

Apache HBase

The ReplicationCleaner process can abort if its connection to ZooKeeper is inconsistent

Bug: HBASE-15234

If the connection with ZooKeeper is inconsistent, the ReplicationCleaner may abort, and the following event is logged by the HMaster:

WARN org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Aborting ReplicationLogCleaner
because Failed to get list of replicators

Unprocessed WALs accumulate.

The seekBefore() method calculates the size of the previous data block by assuming that data blocks are contiguous, and HFile v2 and higher store Bloom blocks and leaf-level INode blocks with the data. As a result, reverse scans do not work when Bloom blocks or leaf-level INode blocks are present when HFile v2 or higher is used.

Workaround: Restart the HMaster occasionally. The ReplicationCleaner restarts if necessary and processes the unprocessed WALs.
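To decide when a restart is due, one option is to watch the HMaster log for the warning quoted above. The following is a minimal sketch in Python, not a supported tool: it assumes the log content is available as text, and the function name cleaner_aborted is hypothetical. Only the message text comes from the log excerpt in this section.

```python
import re

# Message text taken from the HMaster log excerpt in this section. The log
# may wrap the line, so any whitespace between the two halves is accepted.
ABORT_PATTERN = re.compile(
    r"Aborting ReplicationLogCleaner\s+because Failed to get list of replicators"
)


def cleaner_aborted(log_text: str) -> bool:
    """Return True if log_text contains the ReplicationLogCleaner abort warning.

    cleaner_aborted is a hypothetical helper name used for illustration.
    """
    return ABORT_PATTERN.search(log_text) is not None


# Sample built from the warning shown above, including the line wrap.
sample = (
    "WARN org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: "
    "Aborting ReplicationLogCleaner\n"
    "because Failed to get list of replicators\n"
)
print(cleaner_aborted(sample))
```

A match suggests that the cleaner has stopped and that unprocessed WALs will keep accumulating until the HMaster is restarted.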

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.5.4:

  • FLUME-2632 - High CPU on KafkaSink
  • FLUME-2712 - Optional channel errors slow down the Source to Main channel event rate
  • FLUME-2781 - Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events
  • FLUME-2886 - Optional Channels can cause OOMs
  • FLUME-2891 - Revert FLUME-2712 and FLUME-2886
  • FLUME-2897 - AsyncHBase sink NPE when Channel.getTransaction() fails
  • HADOOP-7139 - Allow appending to existing SequenceFiles
  • HADOOP-7817 - RawLocalFileSystem.append() should give FSDataOutputStream with accurate .getPos()
  • HADOOP-11321 - copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions
  • HADOOP-11687 - Ignore x-* and response headers when copying an Amazon S3 object
  • HADOOP-11722 - Some Instances of Services using ZKDelegationTokenSecretManager go down when old token cannot be deleted
  • HADOOP-12240 - Fix tests requiring native library to be skipped in non-native profile
  • HADOOP-12280 - Skip unit tests based on maven profile rather than NativeCodeLoader.isNativeCodeLoaded
  • HADOOP-12559 - KMS connection failures should trigger TGT renewal
  • HADOOP-12605 - Fix intermittent failure of TestIPC.testIpcWithReaderQueuing
  • HADOOP-12668 - Support excluding weak Ciphers in HttpServer2 through ssl-server.conf
  • HADOOP-12682 - Fix TestKMS#testKMSRestart* failure
  • HADOOP-12699 - TestKMS#testKMSProvider intermittently fails during 'test rollover draining'
  • HADOOP-12715 - TestValueQueue#testgetAtMostPolicyALL fails intermittently
  • HADOOP-12718 - Incorrect error message by fs -put local dir without permission
  • HADOOP-12736 - TestTimedOutTestsListener#testThreadDumpAndDeadlocks sometimes times out
  • HADOOP-12788 - OpensslAesCtrCryptoCodec should log which random number generator is used
  • HADOOP-12825 - Log slow name resolutions
  • HADOOP-12954 - Add a way to change hadoop.security.token.service.use_ip
  • HADOOP-12972 - Lz4Compressor#getLibraryName returns the wrong version number
  • HDFS-6520 - hdfs fsck passes invalid length value when creating BlockReader
  • HDFS-7373 - Clean up temporary files after fsimage transfer failures
  • HDFS-7758 - Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
  • HDFS-8211 - DataNode UUID is always null in the JMX counter
  • HDFS-8496 - Calling stopWriter() with FSDatasetImpl lock held may block other threads
  • HDFS-8576 - Lease recovery should return true if the lease can be released and the file can be closed
  • HDFS-8785 - TestDistributedFileSystem is failing in trunk
  • HDFS-8855 - Webhdfs client leaks active NameNode connections
  • HDFS-9264 - Minor cleanup of operations on FsVolumeList#volumes
  • HDFS-9289 - Make DataStreamer#block thread safe and verify genStamp in commitBlock
  • HDFS-9347 - Invariant assumption in TestQuorumJournalManager.shutdown() is wrong
  • HDFS-9350 - Avoid creating temporary strings in Block.toString() and getBlockName()
  • HDFS-9358 - TestNodeCount#testNodeCount timed out
  • HDFS-9406 - FSImage may get corrupted after deleting snapshot
  • HDFS-9514 - TestDistributedFileSystem.testDFSClientPeerWriteTimeout failing; exception being swallowed
  • HDFS-9576 - HTrace: collect position/length information on read operations
  • HDFS-9589 - Block files which have been hardlinked should be duplicated before the DataNode appends to them
  • HDFS-9612 - DistCp worker threads are not terminated after jobs are done
  • HDFS-9655 - NN should start JVM pause monitor before loading fsimage.
  • HDFS-9688 - Test the effect of nested encryption zones in HDFS downgrade
  • HDFS-9701 - DN may deadlock when hot-swapping under load
  • HDFS-9721 - Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
  • HDFS-9949 - Add a test case to ensure that the DataNode does not regenerate its UUID when a storage directory is cleared
  • HDFS-10223 - peerFromSocketAndKey performs SASL exchange before setting connection timeouts
  • HDFS-10267 - Extra "synchronized" on FsDatasetImpl#recoverAppend and FsDatasetImpl#recoverClose
  • MAPREDUCE-4785 - TestMRApp occasionally fails
  • MAPREDUCE-6460 - TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails
  • MAPREDUCE-6528 - Memory leak for HistoryFileManager.getJobSummary()
  • MAPREDUCE-6580 - Test failure: TestMRJobsWithProfiler
  • MAPREDUCE-6620 - Jobs that did not start are shown as starting in 1969 in the JHS web UI
  • YARN-2749 - Fix some test cases from TestLogAggregationService that fail in trunk
  • YARN-2871 - TestRMRestart#testRMRestartGetApplicationList sometimes fails in trunk
  • YARN-2902 - Killing a container that is localizing can orphan resources in the DOWNLOADING state
  • YARN-3446 - FairScheduler headroom calculation should exclude nodes in the blacklist
  • YARN-3727 - For better error recovery, check if the directory exists before using it for localization
  • YARN-4155 - TestLogAggregationService.testLogAggregationServiceWithInterval failing
  • YARN-4168 - Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull
  • YARN-4354 - Public resource localization fails with NPE
  • YARN-4380 - TestResourceLocalizationService.testDownloadingResourcesOnContainerKill fails intermittently
  • YARN-4393 - Fix intermittent test failure for TestResourceLocalizationService#testFailedDirsResourceRelease
  • YARN-4546 - ResourceManager crash due to scheduling opportunity overflow
  • YARN-4573 - Fix test failure in TestRMAppTransitions#testAppRunningKill and testAppKilledKilled
  • YARN-4613 - Fix test failure in TestClientRMService#testGetClusterNodes
  • YARN-4704 - TestResourceManager#testResourceAllocation() fails when using FairScheduler
  • YARN-4717 - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails intermittently due to IllegalArgumentException from cleanup
  • HBASE-6617 - ReplicationSourceManager should be able to track multiple WAL paths (ADDENDUM)
  • HBASE-14586 - Use a maven profile to run Jacoco analysis
  • HBASE-14587 - Attach a test-sources.jar for hbase-server
  • HBASE-14588 - Stop accessing test resources from within src folder
  • HBASE-14759 - Avoid using Math.abs when selecting SyncRunner in FSHLog
  • HBASE-15019 - Replication stuck when HDFS is restarted
  • HBASE-15052 - Use EnvironmentEdgeManager in ReplicationSource
  • HBASE-15152 - Automatically include prefix-tree module in MR jobs if present
  • HBASE-15157 - Add *PerformanceTest for Append, CheckAnd*
  • HBASE-15206 - Fix flaky testSplitDaughtersNotInMeta
  • HBASE-15213 - Fix increment performance regression caused by HBASE-8763 on branch-1.0
  • HBASE-15234 - Don't abort ReplicationLogCleaner on ZooKeeper errors
  • HBASE-15456 - CreateTableProcedure/ModifyTableProcedure needs to fail when there is no family in table descriptor
  • HBASE-15479 - No more garbage or beware of autoboxing
  • HBASE-15582 - SnapshotManifestV1 too verbose when there are no regions
  • HIVE-9617 - UDF from_utc_timestamp throws NPE if the second argument is null
  • HIVE-9743 - Revert "(Tests portion only) Incorrect result set for vectorized left outer join (Matt McCline, reviewed by Vikram Dixit)"
  • HIVE-10115 - HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled
  • HIVE-10213 - MapReduce jobs using dynamic-partitioning fail on commit
  • HIVE-10303 - HIVE-9471 broke forward compatibility of ORC files
  • HIVE-11054 - Handle varchar/char partition columns in vectorization
  • HIVE-11097 - HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
  • HIVE-11135 - Fix the Beeline set and save command in order to avoid the NullPointerException
  • HIVE-11285 - ObjectInspector for partition columns in FetchOperator in SMBJoin causes exception
  • HIVE-11488 - Need to add support for sessionId and queryId logging, QueryId can't be stored in the configuration of the SessionState since multiple queries can run in a single session
  • HIVE-11583 - When PTF is used over large partitions, the result could be corrupted
  • HIVE-11590 - AvroDeserializer is very chatty
  • HIVE-11828 - beeline -f fails on scripts with tabs between column type and comment
  • HIVE-11866 - Add framework to enable testing using LDAPServer using LDAP protocol
  • HIVE-11919 - Hive Union Type Mismatch
  • HIVE-12315 - Fix Vectorized double divide by zero
  • HIVE-12354 - MapJoin with double keys is slow on MR
  • HIVE-12431 - Support timeout for compile lock
  • HIVE-12506 - SHOW CREATE TABLE command creates a table that does not work for RCFile format
  • HIVE-12706 - Incorrect output from from_utc_timestamp()/to_utc_timestamp when local timezone has DST
  • HIVE-12782 - Update the golden files for some tests that fail
  • HIVE-12790 - Metastore connection leaks in HiveServer2
  • HIVE-12885 - LDAP Authenticator improvements
  • HIVE-12909 - Some encryption q-tests fail because trash is disabled in encryption_with_trash.q
  • HIVE-12941 - Unexpected result when using MIN() on struct with NULL in first field
  • HIVE-12946 - Alter table should also add default scheme and authority for the location similar to create table
  • HIVE-13039 - BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
  • HIVE-13055 - Add unit tests for HIVE-11512
  • HIVE-13065 - Hive throws NPE when writing map type data to a HBase backed table
  • HIVE-13082 - Enable constant propagation optimization in query with left semi join
  • HIVE-13200 - Aggregation functions returning empty rows on partitioned columns
  • HIVE-13243 - Hive drop table on encryption zone fails for external tables
  • HIVE-13251 - Hive can't read the decimal in AVRO file generated from previous version
  • HIVE-13286 - Query ID is being reused across queries
  • HIVE-13295 - Improvement to LDAP search queries in HS2 LDAP Authenticator
  • HIVE-13401 - Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token authentication
  • HUE-3106 - [filebrowser] Add support for full paths in zip file uploads
  • HUE-3110 - [oozie] Fix bundle submission when coordinator points to multiple bundles
  • HUE-3132 - [core] Fix Sync Ldap users and groups for anonymous binds
  • HUE-3180 - [useradmin] Override duplicate username validation message
  • HUE-3185 - [oozie] Avoid extra API calls for parent information in workflow dashboard
  • HUE-3303 - [core] PostgreSQL requires data update and alter table operations in separate transactions
  • HUE-3310 - [jobsub] Prevent browsing job designs by API
  • HUE-3334 - [editor] Skip checking for multi queries if there is no semi colon, send empty query instead of error
  • HUE-3398 - [beeswax] Filter out sessions with empty guid or secret key
  • HUE-3436 - [oozie] Retain old dependencies when saving a workflow
  • HUE-3437 - [core] PamBackend does not honor ignore_username_case
  • HUE-3523 - [oozie] Modify find_jobs_with_no_doc method to exclude jobs with no name
  • HUE-3528 - [oozie] Call correct metrics api to avoid 500 error
  • HUE-3594 - [fb] Smarter DOM based XSS filter on hashes
  • IMPALA-852, IMPALA-2215 - Analyze HAVING clause before aggregation
  • IMPALA-1092 - Fix estimates for trivial coord-only queries
  • IMPALA-1170 - Fix URL parsing when path contains '@'
  • IMPALA-1934 - Allow shell to retrieve LDAP password from shell cmd
  • IMPALA-2093 - Disallow NOT IN aggregate subqueries with a constant lhs expr
  • IMPALA-2184 - Don't inline timestamp methods with try/catch blocks in IR
  • IMPALA-2425 - Broadcast join hint not enforced when low memory limit is set
  • IMPALA-2503 - Add missing String.format() arg in error message
  • IMPALA-2539 - Unmark collections slots of empty union operands
  • IMPALA-2554 - Change default buffer size for RPC servers and clients
  • IMPALA-2565 - Planner tests are flaky due to file size mismatches
  • IMPALA-2592 - DataStreamSender::Channel::CloseInternal() does not close the channel on an error
  • IMPALA-2599 - Pseudo-random sleep before acquiring kerberos ticket possibly not really pseudo-random
  • IMPALA-2711 - Fix memory leak in Rand()
  • IMPALA-2719 - test_parquet_max_page_header fails on Isilon
  • IMPALA-2732 - Timestamp formats with non-padded values
  • IMPALA-2734 - Correlated EXISTS subqueries with HAVING clause return wrong results
  • IMPALA-2742 - Avoid unbounded MemPool growth with AcquireData()
  • IMPALA-2749 - Fix decimal multiplication overflow
  • IMPALA-2765 - Preserve return type of subexpressions substituted in isTrueWithNullSlots()
  • IMPALA-2788 - conv(bigint num, int from_base, int to_base) returns wrong result
  • IMPALA-2798 - Bring in AVRO-1617 fix and add test case for it
  • IMPALA-2818 - Fix cancellation crashes/hangs due to BlockOnWait() race
  • IMPALA-2820 - Support unquoted keywords as struct-field names
  • IMPALA-2832 - Fix cloning of FunctionCallExpr
  • IMPALA-2844 - Allow count(*) on RC files with complex types
  • IMPALA-2870 - Fix failing metadata.test_ddl.TestDdlStatements.test_create_table test
  • IMPALA-2894 - Move regression test into a different .test file
  • IMPALA-2906 - Fix an edge case with materializing TupleIsNullPredicates in analytic sorts
  • IMPALA-2914 - Fix DCHECK Check failed: HasDateOrTime()
  • IMPALA-2926 - Fix off-by-one bug in SelectNode::CopyRows()
  • IMPALA-2940 - Fix leak of dictionaries in Parquet scanner
  • IMPALA-3000 - Fix BitReader::Reset()
  • IMPALA-3034 - Verify all consumed memory of a MemTracker is always released at destruction time
  • IMPALA-3047 - Separate create table test with nested types
  • IMPALA-3054 - Disable probe side filters when spilling
  • IMPALA-3071 - Fix assignment of On-clause predicates belonging to an inner join
  • IMPALA-3085 - Unregister data sinks' MemTrackers at their Close() functions
  • IMPALA-3093 - ReopenClient() could NULL out 'client_key' causing a crash
  • IMPALA-3095 - Add configurable whitelist of authorized internal principals
  • IMPALA-3151 - Impala crash for avro table when casting to char data type
  • IMPALA-3194 - Allow queries materializing scalar type columns in RC/sequence files
  • KITE-1114 - Kite CLI json-import HDFS temp file path not multiuser safe, fix missing license header
  • OOZIE-2419 - HBase credentials are not correctly proxied
  • OOZIE-2428 - TestSLAService, TestSLAEventGeneration flaky tests
  • OOZIE-2429 - TestEventGeneration test is flaky
  • OOZIE-2432 - TestPurgeXCommand fails
  • OOZIE-2435 - TestCoordChangeXCommand is flaky
  • OOZIE-2466 - Repeated failure of TestMetricsInstrumentation.testSamplers
  • OOZIE-2486 - TestSLAEventsGetForFilterJPAExecutor is flaky
  • OOZIE-2490 - Oozie can't set hadoop.security.token.service.use_ip
  • SENTRY-922 - Backport: INSERT OVERWRITE DIRECTORY permission not working correctly
  • SENTRY-972 - Backport: Include sentry-tests-hive hadoop test script in maven project
  • SENTRY-991 - Backport: Roles of Sentry Permission needs to be case insensitive
  • SENTRY-1002 - PathsUpdate.parsePath(path) will throw an NPE when parsing relative paths
  • SENTRY-1003 - Support "reload" by updating the classpath of Sentry function aux jar path during runtime
  • SENTRY-1007 - Backport: Sentry column-level performance for wide tables
  • SENTRY-1008 - Path should not be updated if the create/drop table/partition event fails
  • SENTRY-1015 - Backport: Improve Sentry + Hive error message when user has insufficient privileges
  • SENTRY-1044 - Tables with non-hdfs locations break HMS startup
  • SENTRY-1169 - MetastorePlugin#renameAuthzObject log message prints oldpathname as newpathname
  • SENTRY-1184 - Clean up HMSPaths.renameAuthzObject
  • SOLR-6820 - Make the number of version buckets used by the UpdateLog configurable, as increasing beyond the default 256 has been shown to help with high volume indexing performance in SolrCloud. Increase the default number of buckets to 65536 instead of 256, fix numVersionBuckets name attribute in configsets
  • SOLR-7281 - Add an overseer action to publish an entire node as 'down'
  • SOLR-7332 - Initialize the highest value for all version buckets with the max value from the index or recent updates to avoid unnecessary lookups to the index to check for reordered updates when processing new documents
  • SOLR-7493 - Requests aren't distributed evenly if the collection isn't present locally. Merges r1683946 and r1683948 from trunk
  • SOLR-7587 - TestSpellCheckResponse stalled and never timed out -- possible VersionBucket bug?
  • SOLR-7625 - Version bucket seed not updated after new index is installed on a replica
  • SOLR-8215 - Only active replicas should handle incoming requests against a collection
  • SOLR-8371 - Try and prevent too many recovery requests from stacking up and clean up some faulty cancel recovery logic
  • SOLR-8451 - We should not call method.abort in HttpSolrClient or HttpSolrCall#remoteQuery and HttpSolrCall#remoteQuery should not close streams
  • SOLR-8453 - Solr should attempt to consume the request inputstream on errors as we cannot count on the container to do it
  • SOLR-8575 - Fix HDFSLogReader replay status numbers and a performance bug where we can reopen FSDataInputStream too often
  • SOLR-8578 - Successful or not, requests are not always fully consumed by Solrj clients and we count on HttpClient or the JVM
  • SOLR-8615 - Just like creating cores, we should use multiple threads when closing cores
  • SOLR-8633 - DistributedUpdateProcess processCommit/deleteByQuery calls finish on DUP and SolrCmdDistributor, which violates the lifecycle and can cause bugs
  • SOLR-8720 - ZkController#publishAndWaitForDownStates should use #publishNodeAsDown
  • SOLR-8771 - Multi-threaded core shutdown creates executor per core
  • SOLR-8855 - The HDFS BlockDirectory should not clean up its cache on shutdown
  • SOLR-8856 - Do not cache merge or 'read once' contexts in the hdfs block cache
  • SOLR-8857 - HdfsUpdateLog does not use configured or new default number of version buckets and is hard coded to 256
  • SOLR-8869 - Optionally disable printing field cache entries in SolrFieldCacheMBean
  • SPARK-10859 - [SQL] Fix stats of StringType in columnar cache
  • SPARK-10914 - UnsafeRow serialization breaks when two machines have different Oops size
  • SPARK-11009 - [SQL] Fix wrong result of Window function in cluster mode
  • SPARK-11537 - [SQL] Fix negative hours/minutes/seconds
  • SPARK-11737 - [SQL] Fix serialization of UTF8String with Kryo
  • SPARK-12617 - [PYSPARK] Move Py4jCallbackConnectionCleaner to Streaming, clean up the leak sockets of Py4J
  • SPARK-14477 - [BUILD] Allow custom mirrors for downloading artifacts in build/mvn
  • SQOOP-2847 - Sqoop --incremental + missing parent --target-dir reports success with no data

Issues Fixed in CDH 5.5.2

Known Issues Fixed

The following topics describe known issues fixed in CDH 5.5.2.

Apache Spark

Spark SQL cannot retrieve data from a partitioned Hive table

When reading from a partitioned Hive table, Spark SQL is not able to identify the column delimiter used, and reads the full record as the first column entry.

Workaround: None.

When using Spark on YARN, the driver reports misleading error messages

The Spark driver reports misleading error messages such as:
ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@...] ->
[akka.tcp://sparkExecutor@...]: Error [Association failed with [akka.tcp://sparkExecutor@...]]
[akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@...]]

Workaround: Add the following property to the Spark log4j configuration file: log4j.logger.org.apache.spark.rpc.akka.ErrorMonitor=FATAL. See Configuring Spark Application Logging Properties.
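As a minimal sketch, the workaround above could look like the following fragment. It assumes the Spark log4j configuration file is a standard log4j.properties file; the comment line is illustrative, and only the property itself comes from the workaround.

```properties
# Log only FATAL messages from the Akka ErrorMonitor, which hides the
# misleading AssociationError messages reported by the Spark driver.
log4j.logger.org.apache.spark.rpc.akka.ErrorMonitor=FATAL
```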

Spark does not support rolling upgrades

Spark does not support rolling upgrades. Submitted Spark jobs may fail during upgrade. Jobs requiring new configuration properties will fail.

Workaround: Finish the upgrade, and then relaunch the Spark jobs.

Hue

Cannot query the customers table in Hue

To query the customers table, you must re-create the Parquet data for compatibility.

Bug: HUE-3040

Workaround: Update the parquet file of the customers table (/user/hive/warehouse/customers/customers) with the one attached to HUE-3040.

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.5.2:

  • AVRO-1781 - Remove LogicalTypes cache
  • HADOOP-7713 - dfs -count -q should label output column
  • HADOOP-10406 - TestIPC.testIpcWithReaderQueuing may fail
  • HADOOP-10668 - Addendum patch to fix TestZKFailoverController
  • HADOOP-10668 - TestZKFailoverControllerStress#testExpireBackAndForth occasionally fails
  • HADOOP-11171 - Enable using a proxy server to connect to S3a
  • HADOOP-11218 - Add TLS 1.1, TLS 1.2 to KMS, HttpFS, SSLFactory
  • HADOOP-12269 - Update aws-sdk dependency to 1.10.6
  • HADOOP-12417 - TestWebDelegationToken failing with port in use
  • HADOOP-12418 - TestRPC.testRPCInterruptedSimple fails intermittently
  • HADOOP-12464 - Interrupted client may try to fail-over and retry
  • HADOOP-12468 - Partial group resolution failure should not result in user lockout
  • HADOOP-12474 - MiniKMS should use random ports for Jetty server by default
  • HADOOP-12568 - Update core-default.xml to describe posixGroups support
  • HADOOP-12573 - TestRPC.testClientBackOff failing
  • HADOOP-12584 - Disable browsing the static directory in HttpServer2
  • HADOOP-12584 - Revert - Disable browsing the static directory in HttpServer2
  • HADOOP-12584 - Disable browsing the static directory in HttpServer2
  • HADOOP-12604 - Exception may be swallowed in KMSClientProvider
  • HADOOP-12625 - Add a config to disable the /logs endpoints
  • HDFS-6101 - TestReplaceDatanodeOnFailure fails occasionally
  • HDFS-6533 - TestBPOfferService#testBasicFunctionalitytest fails intermittently
  • HDFS-6694 - TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms - debugging patch
  • HDFS-7553 - Fix the TestDFSUpgradeWithHA due to BindException
  • HDFS-7798 - Checkpointing failure caused by shared KerberosAuthenticator
  • HDFS-8647 - Abstract BlockManager's rack policy into BlockPlacementPolicy
  • HDFS-8722 - Optimize DataNode writes for small writes and flushes
  • HDFS-8772 - Fix TestStandbyIsHot#testDatanodeRestarts which occasionally fails
  • HDFS-8805 - Archival Storage: getStoragePolicy should not need superuser privilege
  • HDFS-9083 - Replication violates block placement policy
  • HDFS-9123 - Copying from the root to a subdirectory should be forbidden
  • HDFS-9160 - [OIV-Doc] : Missing details of 'delimited' for processor options
  • HDFS-9220 - Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum
  • HDFS-9249 - NPE is thrown if an IOException is thrown in NameNode constructor
  • HDFS-9250 - Add precondition check to LocatedBlock#addCachedLoc
  • HDFS-9268 - fuse_dfs chown crashes when uid is passed as -1
  • HDFS-9273 - ACLs on root directory may be lost after NameNode restart
  • HDFS-9286 - HttpFs does not parse ACL syntax correctly for operation REMOVEACLENTRIES
  • HDFS-9295 - Add a thorough test of the full KMS code path
  • HDFS-9313 - Possible NullPointerException in BlockManager if no excess replica can be chosen
  • HDFS-9332 - Fix Precondition failures from NameNodeEditLogRoller while saving namespace
  • HDFS-9339 - Extend full test of KMS ACLs
  • HDFS-9364 - Unnecessary DNS resolution attempts when creating NameNodeProxies
  • HDFS-9410 - Some tests should always reset sysout and syserr
  • HDFS-9429 - Tests in TestDFSAdminWithHA intermittently fail with EOFException
  • HDFS-9438 - Only collect HDFS-6694 debug data on Linux, Mac, and Solaris
  • HDFS-9445 - DataNode may deadlock while handling a bad volume
  • HDFS-9470 - Encryption zone on root not loaded from fsimage after NameNode restart
  • HDFS-9474 - TestPipelinesFailover should not fail when printing debug message
  • MAPREDUCE-6191 - Improve clearing stale state of Java serialization testcase
  • MAPREDUCE-6233 - org.apache.hadoop.mapreduce.TestLargeSort.testLargeSort failed in trunk
  • MAPREDUCE-6549 - Multibyte delimiters with LineRecordReader cause duplicate records
  • MAPREDUCE-6550 - archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
  • YARN-3564 - Fix TestContainerAllocation.testAMContainerAllocationWhenDNSUnavailable fails randomly
  • YARN-3768 - ArrayIndexOutOfBoundsException with empty environment variables
  • YARN-4235 - FairScheduler PrimaryGroup does not handle empty groups returned for a user
  • YARN-4310 - FairScheduler: Log skipping reservation messages at DEBUG level
  • YARN-4347 - Resource manager fails with Null pointer exception
  • YARN-4408 - Fix issue that NodeManager still reports negative running containers
  • HBASE-6617 - ReplicationSourceManager should be able to track multiple WAL paths
  • HBASE-12961 - Fix negative values in read and write region server metrics
  • HBASE-13134 - mutateRow and checkAndMutate apis don't throw region level exceptions
  • HBASE-13703 - ReplicateContext should not be a member of ReplicationSource
  • HBASE-13746 - list_replicated_tables command is not listing table in HBase shell
  • HBASE-13833 - LoadIncrementalHFile.doBulkLoad(Path, HTable) does not handle unmanaged connections when using SecureBulkLoad
  • HBASE-14003 - Work around JDK-8044053
  • HBASE-14205 - RegionCoprocessorHost System.nanoTime() performance bottleneck
  • HBASE-14283 - Reverse scan doesn’t work with HFile inline index/bloom blocks
  • HBASE-14501 - NPE in replication with TDE
  • HBASE-14533 - Connection Idle time 1 second is too short and the connection is closed too quickly by the ChoreService. Increase it to the default (10 minutes) for testAll(). The patch is not committed upstream yet.
  • HBASE-14541 - TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many splits and few retries
  • HBASE-14547 - Add more debug/trace to zk-procedure
  • HBASE-14621 - ReplicationLogCleaner stuck on RS crash
  • HBASE-14731 - Add -DuseMob option to ITBLL
  • HBASE-14809 - Grant / revoke namespace admin permission to group
  • HBASE-14923 - VerifyReplication should not mask the exception during result comparison
  • HBASE-14926 - Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading
  • HBASE-15031 - Fix merge of MVCC and SequenceID performance regression in branch-1.0
  • HBASE-15032 - HBase shell scan filter string assumes UTF-8 encoding
  • HBASE-15035 - Bulkloading HFiles with tags that require splits do not preserve tags
  • HBASE-15104 - Occasional failures due to NotServingRegionException in IT tests
  • HIVE-7575 - GetTables thrift call is very slow
  • HIVE-7653 - Hive AvroSerDe does not support circular references in Schema
  • HIVE-9507 - Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
  • HIVE-10027 - Use descriptions from Avro schema files in column comments
  • HIVE-10048 - JDBC - Support SSL encryption regardless of Authentication mechanism
  • HIVE-10083 - SMBJoin fails in case one table is uninitialized
  • HIVE-10265 - Hive CLI crashes on != inequality
  • HIVE-10514 - Fix MiniCliDriver tests failure
  • HIVE-10687 - AvroDeserializer fails to deserialize evolved union fields
  • HIVE-10697 - ObjectInspectorConvertors#UnionConvertor does a faulty conversion
  • HIVE-11149 - Fix issue where HashMap in PerfLogger.java sometimes hangs
  • HIVE-11288 - Backport: Avro SerDe InstanceCache returns incorrect schema
  • HIVE-11513 - AvroLazyObjectInspector could handle empty data better
  • HIVE-11616 - DelegationTokenSecretManager reuses the same objectstore, which has concurrency issues
  • HIVE-11785 - Revert - Support escaping carriage return and new line for LazySimpleSerDe
  • HIVE-11785 - Support escaping carriage return and new line for LazySimpleSerDe
  • HIVE-11826 - 'hadoop.proxyuser.hive.groups' configuration does not prevent unauthorized users from accessing the metastore
  • HIVE-11977 - Hive should handle an external Avro table with zero length files present
  • HIVE-12008 - Hive queries failing when using count(*) on column in view
  • HIVE-12058 - Change Hive script to record errors when calling hbase fails
  • HIVE-12188 - DoAs does not work properly in non-Kerberos secured HiveServer2
  • HIVE-12189 - The list in pushdownPreds of ppd.ExprWalkerInfo should not be allowed to grow very large
  • HIVE-12218 - Unable to create a like table for an HBase-backed table
  • HIVE-12250 - Zookeeper connection leaks in Hive's HBaseHandler
  • HIVE-12265 - Generate lineage info only if requested
  • HIVE-12268 - Context leaks deleteOnExit paths
  • HIVE-12278 - Skip logging lineage for explain queries
  • HIVE-12287 - Lineage for lateral view shows wrong dependencies
  • HIVE-12330 - Fix precommit Spark test part2
  • HIVE-12365 - Added resource path is sent to cluster as an empty string when externally removed
  • HIVE-12378 - Exception on HBaseSerDe.serialize binary field
  • HIVE-12388 - GetTables cannot get external tables when TABLE type argument is given
  • HIVE-12406 - HIVE-9500 introduced incompatible change to LazySimpleSerDe public interface
  • HIVE-12418 - HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak
  • HIVE-12505 - Backport: Insert overwrite in same encrypted zone silently fails to remove some existing files
  • HIVE-12566 - Incorrect result returned when using COALESCE in WHERE condition with LEFT JOIN
  • HIVE-12713 - Miscellaneous improvements in driver compile and execute logging
  • HIVE-12784 - Group by SemanticException: Invalid column reference
  • HIVE-12788 - Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions
  • HIVE-12795 - Vectorized execution causes ClassCastException
  • HUE-2664 - Revert - [jobbrowser] Fix fetching logs from job history server
  • HUE-2997 - [oozie] Easier usage of email action when workflow fails
  • HUE-3035 - [beeswax] Optimize sample data query for partitioned tables
  • HUE-3036 - [beeswax] Revert get_tables to use Thrift API GetTables
  • HUE-3091 - [oozie] Do not remove extra new lines from email action body
  • IMPALA-1459 - Fix migration/assignment of On-clause predicates inside inline views.
  • IMPALA-2103 - Fix flaky test_impersonation test
  • IMPALA-2113 - Handle error when distinct and aggregates are used with a having clause
  • IMPALA-2225 - Handle error when star based select item and aggregate are incorrectly used
  • IMPALA-2226 - Throw AnalysisError if table properties are too large
  • IMPALA-2273 - Make MAX_PAGE_HEADER_SIZE configurable
  • IMPALA-2473 - Reduce scanner memory usage
  • IMPALA-2535 - PAGG hits mem_limit when switching to I/O buffers
  • IMPALA-2558 - DCHECK in Parquet scanner after block read error
  • IMPALA-2559 - Fix check failed: sorter_runs_.back()->is_pinned_
  • IMPALA-2591 - DataStreamSender::Send() does not return an error status if SendBatch() failed
  • IMPALA-2598 - Re-enable SSL and Kerberos on server-server
  • IMPALA-2612 - Free local allocations once for every row batch when building hash tables
  • IMPALA-2614 - Don't ignore Status returned by DataStreamRecvr::CreateMerger()
  • IMPALA-2624 - Increase fs.trash.interval to 24 hours for test suite
  • IMPALA-2630 - Skip TestParquet.test_continue_on_error when using old aggs/joins
  • IMPALA-2643 - Prevent migrating incorrectly inferred identity predicates into inline views
  • IMPALA-2648 - Avoid sending large partition stats objects over thrift
  • IMPALA-2695 - Fix GRANTs on URIs with uppercase letters
  • IMPALA-2722 - Free local allocations per row batch in non-partitioned AGG and HJ
  • IMPALA-2731 - Refactor MemPool usage in HBase scan node
  • IMPALA-2747 - Thrift-client cleans openSSL state before using it in the case of the catalog
  • IMPALA-2776 - Remove escapechartesttable and associated tests
  • IMPALA-2812 - Remove additional test referencing escapecharstesttable
  • IMPALA-2829 - SEGV in AnalyticEvalNode touching NULL input_stream_
  • KITE-1089 - ReadAvroContainer morphline command should work even if the Avro writer schema of each input file is different
  • KITE-1097 - Add method to read the name of a Morphline command
  • OOZIE-2030 - Configuration properties from global section are not set in Hadoop job conf when using sub-workflow action in Oozie workflow.xml
  • OOZIE-2365 - Oozie fails to start when SMTP password not set
  • OOZIE-2380 - Oozie Hive action failed with wrong tmp path
  • OOZIE-2397 - LAST_ONLY and NONE don't properly handle READY actions
  • OOZIE-2413 - Kerberos credentials can expire if the KDC is slow to respond
  • OOZIE-2439 - FS Action no longer uses name-node from global section or default NN
  • OOZIE-2441 - SubWorkflow action with propagate-configuration but no global section throws NPE on submit
  • PIG-3641 - Split "otherwise" producing incorrect output when combined with ColumnPruning
  • SENTRY-565 - Improve performance of filtering Hive SHOW commands
  • SENTRY-835 - Drop table leaves a connection open when using metastorelistener
  • SENTRY-902 - SimpleDBProviderBackend should retry the authorization process properly
  • SENTRY-936 - getGroup and getUser should always return original HDFS values for paths in prefix which are not managed by Sentry
  • SENTRY-944 - Setting HDFS rules on Sentry-managed HDFS paths should not affect original HDFS rules
  • SENTRY-953 - External Partitions which are referenced by more than one table can cause some unexpected behavior with Sentry HDFS sync
  • SENTRY-957 - Exceptions in MetastoreCacheInitializer should probably not prevent HMS from starting up
  • SENTRY-960 - Blacklist reflect, java_method using hive.server2.builtin.udf.blacklist
  • SENTRY-988 - Let SentryAuthorization setter path always fall through and update HDFS
  • SENTRY-994 - SentryAuthorizationInfoX should override isSentryManaged
  • SOLR-6443 - Backport: Disable test that fails on Jenkins until we can determine the problem
  • SOLR-7049 - LIST Collections API call should be processed directly by the CollectionsHandler instead of the OverseerCollectionProcessor
  • SOLR-7989 - After a new leader is elected, it should ensure its state is ACTIVE if it has already registered with ZooKeeper
  • SOLR-8075 - Fix faulty implementation
  • SOLR-8152 - Overseer Task Processor/Queue can miss responses, leading to timeouts
  • SOLR-8223 - Avoid accidentally swallowing OutOfMemoryError
  • SOLR-8288 - DistributedUpdateProcessor#doFinish should explicitly check and ensure it does not try to put itself into LIR
  • SOLR-8353 - Support regex for skipping license checksums
  • SOLR-8367 - Fix the LeaderInitiatedRecovery 'all replicas participate' fail-safe
  • SOLR-8372 - Backport: Canceled recovery can lead to data loss
  • SOLR-8535 - Support forcing define-lucene-javadoc-url to be local
  • SPARK-5569 - [STREAMING] Fix ObjectInputStreamWithLoader for supporting load array classes
  • SPARK-8029 - Robust shuffle writer
  • SPARK-9735 - [SQL] Respect the user-specified schema rather than the inferred partition schema for HadoopFsRelation
  • SPARK-10648 - Oracle dialect to handle nonspecific numeric types
  • SPARK-10865 - [SPARK-10866] [SQL] Fix bug of ceil/floor, which should return long instead of the Double type
  • SPARK-11105 - [YARN] Distribute log4j.properties to executors
  • SPARK-11126 - [SQL] Fix the potential flaky test
  • SPARK-11126 - [SQL] Fix a memory leak in SQLListener._stageIdToStageMetrics
  • SPARK-11246 - [SQL] Table cache for Parquet broken in 1.5
  • SPARK-11453 - [SQL] Appending data to a partitioned table messes up the result
  • SPARK-11484 - [WEBUI] Using proxyBase set by Spark AM
  • SPARK-11786 - [CORE] Tone down messages from akka error monitor
  • SPARK-11799 - [CORE] Make it explicit in executor logs that uncaught exceptions are thrown during executor shutdown
  • SPARK-11929 - [CORE] Make the repl log4j configuration override the root logger
  • SQOOP-2745 - Using datetime column as a splitter for Oracle no longer works
  • SQOOP-2767 - Test is failing SystemImportTest
  • SQOOP-2783 - Query import with parquet fails on incompatible schema
  • SQOOP-2422 - Sqoop2: Test TestJSONIntermediateDataFormat is failing on JDK8

Issues Fixed in CDH 5.5.1

The following issues have been fixed in CDH 5.5.1:

Apache Commons Collections deserialization vulnerability

Cloudera has learned of a potential security vulnerability in a third-party library called the Apache Commons Collections. This library is used in products distributed and supported by Cloudera (“Cloudera Products”), including core Apache Hadoop. The Apache Commons Collections library is also in widespread use beyond the Hadoop ecosystem. At this time, no specific attack vector for this vulnerability has been identified as present in Cloudera Products.

In an abundance of caution, we are currently in the process of incorporating a version of the Apache Commons Collections library with a fix into the Cloudera Products. In most cases, this will require coordination with the projects in the Apache community. One example of this is tracked by HADOOP-12577.

The Apache Commons Collections potential security vulnerability is titled “Arbitrary remote code execution with InvokerTransformer” and is tracked by COLLECTIONS-580. MITRE has not issued a CVE, but related CVE-2015-4852 has been filed for the vulnerability. CERT has issued Vulnerability Note #576313 for this issue.

Releases affected: CDH 5.5.0, CDH 5.4.8 and lower, CDH 5.3.8 and lower, CDH 5.2.8 and lower, CDH 5.1.7 and lower, Cloudera Manager 5.5.0, Cloudera Manager 5.4.8 and lower, Cloudera Manager 5.3.8 and lower, Cloudera Manager 5.2.8 and lower, Cloudera Manager 5.1.6 and lower, Cloudera Manager 5.0.7 and lower, Cloudera Navigator 2.4.0, and Cloudera Navigator 2.3.8 and lower.

Users affected: All

Impact: This potential vulnerability may enable an attacker to execute arbitrary code from a remote machine without requiring authentication.

Immediate action required: Upgrade to the fixed release for your version line: Cloudera Manager 5.5.1 and CDH 5.5.1; Cloudera Manager 5.4.9 and CDH 5.4.9; Cloudera Manager 5.3.9 and CDH 5.3.9; Cloudera Manager 5.2.9 and CDH 5.2.9; Cloudera Manager 5.1.7 and CDH 5.1.7; or Cloudera Manager 5.0.8 and CDH 5.0.8.

Apache HBase

Data may not be replicated to the slave cluster if multiwal multiplicity is set to a value greater than 1.

Issues Fixed in CDH 5.5.0

Apache Flume

Fix FD leak in AsyncHBaseSink.

Bug: FLUME-2738

Fix for Kerberos configuration error when using short names.

Bug: FLUME-2749

Fix NullPointerException in KafkaSourceCounter.

Bug: FLUME-2672

Fix for Tail Directory Source FileNotFoundException.

Bug: FLUME-2773

Fix for Kafka Channel timeout property handling.

Bug: FLUME-2734

Apache Hadoop

YARN/MapReduce

Incorrect headroom leads to deadlock between mappers and reducers.

Blacklisting Support for Scheduling ApplicationMasters

When an ApplicationMaster fails, and the NodeManager on the same host has not yet been blacklisted, the framework should route the second ApplicationMaster attempt to a NodeManager on a different host.
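
The placement rule described here can be illustrated with a minimal sketch. It is not YARN's scheduler code: the class AmPlacementSketch, the method pickHost, and the host names in the usage note are hypothetical, and the sketch assumes the scheduler already tracks which hosts ran a failed ApplicationMaster attempt.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not YARN's scheduler code: picks a host for the next
// ApplicationMaster attempt while skipping hosts where an earlier attempt failed.
final class AmPlacementSketch {
    private AmPlacementSketch() {}

    static Optional<String> pickHost(List<String> candidateHosts, Set<String> failedAmHosts) {
        for (String host : candidateHosts) {
            if (!failedAmHosts.contains(host)) {
                return Optional.of(host); // first host with no prior AM failure
            }
        }
        return Optional.empty(); // every candidate already hosted a failed attempt
    }
}
```

With candidates nm1 and nm2 and a failed first attempt on nm1, pickHost returns nm2, so the second attempt lands on a different host.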

Bug: YARN-2005

Apache HBase

HBase mlock agent is not included as part of CDH 5.x releases.

Starting in CDH 5.0, the native mlock daemons were not included in CDH HBase. CDH 5.5 restores the daemon in both parcels and packages.

Bug: None.

Apache Hive

Parquet Predicate pushdown for float types does not work

The Parquet predicate builder should use PrimitiveTypeName type to construct a predicate leaf instead of the type provided by PredicateLeaf.

Bug: HIVE-11504

LineageCtx should release all resources at clear()

Some maps are not released with the clear() method and can cause a memory leak.

Bug: HIVE-12225

LEFT JOIN query plan outputs wrong column when using subquery

Incorrect results may arise if a LEFT OUTER JOIN is combined with a subquery.

Bug: HIVE-9613

Map-side aggregation is extremely slow

Map-side aggregation on columns with double type is extremely slow due to HIVE-7041.

Bug: HIVE-11502

Lineage does not work with dynamic partitioning query

An error message displays after running a dynamic partitioning query: ERROR : Result schema has 2 fields, but we don't get as many dependencies.

Bug: HIVE-11834

DROP PARTITION in encrypted zone does not remove data from HDFS

An ALTER TABLE query to DROP PARTITION removes the partition metadata but does not remove the partition data from HDFS.

Bug: HIVE-10910

DROP TABLE with qualified table name ignores database name when checking partitions

DDLTask.dropTable() uses an older version of Hive.getPartitionNames(), which takes in a single string for the table name, instead of the database and table names.

Bug: HIVE-10421

INSERT INTO statement may expose data that should be encrypted

INSERT INTO <table> VALUES() uses a temporary table; the data in temporary tables is stored under hive.exec.scratchdir, which is not usually encrypted.

Bug: HIVE-10658

Whitelist restrictions do not get initialized in new copy of HiveConf

Whitelist restrictions use a regex pattern in HiveConf, but when a new HiveConf object copy is created, the regex pattern is not initialized in the new HiveConf copy.
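
A minimal sketch of this kind of copy bug and its remedy follows. RestrictedConfSketch is a hypothetical stand-in, not HiveConf, and the regular expression in the usage note is only an illustration; the point is that a copy constructor has to carry the compiled pattern over explicitly.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a configuration class; this is not HiveConf itself.
final class RestrictedConfSketch {
    private final Pattern modifiableWhitelist;

    RestrictedConfSketch(String whitelistRegex) {
        this.modifiableWhitelist = Pattern.compile(whitelistRegex);
    }

    // Copy constructor: the compiled pattern must be carried over explicitly,
    // otherwise the copy silently loses the whitelist restriction.
    RestrictedConfSketch(RestrictedConfSketch other) {
        this.modifiableWhitelist = other.modifiableWhitelist;
    }

    // A key may be changed only if it matches the whitelist pattern.
    boolean isModifiable(String key) {
        return modifiableWhitelist.matcher(key).matches();
    }
}
```

A copy built from a configuration whose whitelist is hive\.exec\..* still accepts hive.exec.parallel and still rejects keys outside the pattern. Sharing the Pattern reference is safe because java.util.regex.Pattern is immutable.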

Bug: HIVE-10465

RuntimeException when vectorization is enabled with binary data

A RuntimeException is thrown when vectorization is enabled and binary data is in the GROUP BY clause.

Bug: HIVE-9908

Hive may return wrong results in some queries with PTF function

The select statement has an extra column with a PTF operator that is skewing results.

Bug: HIVE-11604

HiveServer2 leaks Hive Metastore Connections

HiveServer2 uses a ThreadLocal to cache the Hive Metastore (HMS) Thrift client in class Hive. When the thread dies, the HMS client is not closed, so the connection to HMS leaks.
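
The leak pattern can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not the HIVE-10956 patch: FakeMetastoreClient stands in for the Thrift client, and closingOnExit is a hypothetical wrapper that shows one way to release a thread-local resource before the thread ends.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; FakeMetastoreClient stands in for a Thrift client.
final class ThreadLocalClientSketch {
    static final AtomicInteger OPEN_CONNECTIONS = new AtomicInteger();

    static final class FakeMetastoreClient implements AutoCloseable {
        FakeMetastoreClient() { OPEN_CONNECTIONS.incrementAndGet(); }
        @Override public void close() { OPEN_CONNECTIONS.decrementAndGet(); }
    }

    // One cached client per thread, created lazily on first use.
    private static final ThreadLocal<FakeMetastoreClient> CLIENT =
            ThreadLocal.withInitial(FakeMetastoreClient::new);

    // Runs the task, then closes and clears the thread's cached client.
    // A ThreadLocal value is not closed automatically when its thread dies.
    static Runnable closingOnExit(Runnable task) {
        return () -> {
            try {
                task.run();
            } finally {
                CLIENT.get().close();
                CLIENT.remove();
            }
        };
    }

    static void useClient() {
        CLIENT.get(); // a real caller would issue metastore calls here
    }
}
```

Running useClient on a plain thread leaves OPEN_CONNECTIONS incremented after the thread exits, while wrapping the same task in closingOnExit returns the counter to its previous value.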

Bug: HIVE-10956

Remote Spark Client has a memory leak

In Remote Spark Client (RSC), MapWork/ReduceWork tasks build up until an OutOfMemoryError is thrown.

Bug: HIVE-10006

Replication factor is not properly set in SparkHashTableSinkOperator

The default replication factor (3) affects the Map Join performance of small files.

Bug: HIVE-11109

Hive LDAP Authenticator should allow users to set Domain without the base Distinguished Name

When the base distinguished name (DN) is not configured but only the Domain has been set in hive-site.xml, the LDAP authentication provider cannot locate the user in the directory. Authentication fails in such cases.

Bug: HIVE-12007

Hive should support additional LDAP authentication parameters

Currently, Hive only has the following authenticator parameters for LDAP authentication for HiveServer2:
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
Other LDAP properties need to be included as part of Hive-LDAP authentication, for example:
group search base -> dc=domain,dc=com
group search filter -> member={0}
user search base -> dc=domain,dc=com
user search filter -> sAMAccountName={0}
list of valid user groups -> group1,group2,group3

Bug: HIVE-7193

Aggregate functions used as window functions can fail in various ways

  • HIVE-11817 : Window function max() fails with NullPointerException.
  • HIVE-10702 : COUNT(*) over windowing x preceding and y preceding returns unexpected results.
  • HIVE-10826 : Support min()/max() functions over x preceding and y preceding windowing.

Apache Spark

Attempts to access secure HBase from Spark executors fail when authenticating to the metastore.

An exception like the following occurs when you attempt to access a Kerberized HBase instance from a Spark executor.
GSSException: No valid credentials provided
(Mechanism level: Failed to find any Kerberos tgt)
The root cause is that the HBase Kerberos authentication token is not sent to the Spark executor.

Bug: SPARK-6918

Workaround: None.

The shuffle service fails on NodeManager restarts and kills all running Spark applications

In CDH 5.4.0 through CDH 5.4.4, the shuffle service is on by default. Because it fails on NodeManager restarts, the shuffle service is off by default in CDH 5.4.5 and higher. Dynamic allocation requires that the shuffle service be turned on.

Bug: SPARK-9439

Workaround: In CDH 5.4.5 and higher, enable the shuffle service when using dynamic allocation.

Spark not automatically picking up hive-site.xml

When you run Spark on YARN, the client hive-site.xml does not get picked up automatically by spark-submit.

Bug: SPARK-2669

Workaround: Do one of the following, depending on which deployment mode you are running in:
  • Client - set HADOOP_CONF_DIR to /etc/hive/conf/ (or the directory where hive-site.xml is located).
  • Cluster - add --files=/etc/hive/conf/hive-site.xml (or the path for hive-site.xml) to the spark-submit script.

Apache Sentry (incubating)

Synchronize calls in SentryClient and create Sentry client once per request in SimpleDBProvider

Adds proper locking to the SentryClient and reduces the number of SentryClients created within a single request in the SimpleDBProvider (used by Hive). This fixes issues that may have caused transient permission failures and out-of-memory conditions.

Bug: SENTRY-893

Sentry-HDFS sync events should treat database and table names as case-insensitive

Sentry-HDFS Sync was treating database and table names as case-sensitive. This led to incorrect or missing ACLs being applied as part of the sync operation if the DDL operations used a different case for the catalog objects.
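
One common way to get case-insensitive matching is to normalize names before using them as lookup keys. The sketch below assumes that approach; it is not Sentry's implementation, and the class CaseInsensitiveObjectPaths and the example path are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Sentry's implementation: keys a path mapping by
// lower-cased "db.table" so lookups ignore the case used in the DDL statement.
final class CaseInsensitiveObjectPaths {
    private final Map<String, String> pathsByObject = new HashMap<>();

    private static String key(String db, String table) {
        // Locale.ROOT avoids locale-specific case rules such as the Turkish dotless i.
        return (db + "." + table).toLowerCase(Locale.ROOT);
    }

    void put(String db, String table, String hdfsPath) {
        pathsByObject.put(key(db, table), hdfsPath);
    }

    String get(String db, String table) {
        return pathsByObject.get(key(db, table)); // null when no mapping exists
    }
}
```

A path registered for Sales.Orders is then found by a lookup for sales.ORDERS, so DDL statements that spell the same object with different case map to the same entry.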

Bug: SENTRY-885

Hive drop database operation removes the Sentry privileges, even if drop operation fails

Even if the Hive drop database operation fails, the Sentry privileges on that database will be removed.

Bug: SENTRY-669

Nested queries in Hive on views incorrectly enforce base table privileges instead of view privileges

Nested queries in Hive on views incorrectly enforce base table privileges instead of view privileges. This leads to Hive query failures due to insufficient privileges.

Bug: SENTRY-619

Apache ZooKeeper

BinaryInputArchive readString should check length before allocating memory

This fixed a possible OutOfMemoryError when malformed packets were sent to the ZooKeeper server.
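
The general technique of validating a length prefix before allocating can be sketched as follows. This is an illustration rather than ZooKeeper's BinaryInputArchive code: the class LengthCheckedReader, the 1 MiB cap MAX_LEN, and the treatment of -1 as a null string are assumptions made for the sketch.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not ZooKeeper's BinaryInputArchive: validates a
// length prefix before allocating the buffer for the payload.
final class LengthCheckedReader {
    static final int MAX_LEN = 1024 * 1024; // assumed 1 MiB cap for this sketch

    static String readString(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len == -1) {
            return null; // convention assumed here: -1 encodes a null string
        }
        if (len < 0 || len > MAX_LEN) {
            // Reject before allocating, so a malformed packet cannot force a huge array.
            throw new IOException("Unreasonable length = " + len);
        }
        byte[] buf = new byte[len]; // safe: len is bounded above
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }
}
```

A packet whose length prefix is Integer.MAX_VALUE is rejected with an IOException instead of triggering a roughly 2 GB allocation attempt.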

Bug: ZOOKEEPER-2146

Workaround: Upgrade to CDH 5.5.

Cloudera Search

The GoLive Function Does not Support Running As a Configurable User

After using --go-live mode with the MapReduceIndexerTool and HBaseMapReduceIndexerTool, depending on group mappings and the configured HDFS umask, Solr may not have been able to read the results of the indexing job.

With Search for CDH 5.5 and later, the MapReduceIndexerTool and HBaseMapReduceIndexerTool include updated --go-live functionality. The indexers now automatically update HDFS ACLs for the specified output directory, giving Solr permission to read the indexer results.

See MapReduceIndexerTool and HBaseMapReduceIndexerTool for more information.

Bug: None.

Workaround: Do not use the --go-live mode with MapReduceIndexerTool and HBaseMapReduceIndexerTool, or use a less restrictive umask.

MapReduceIndexerTool fails to Index Documents When Sentry Is Enabled

Prior to CDH 5.5, when Sentry was enabled, the MapReduceIndexerTool was unable to index data even if the user was authorized to write to the collection according to Sentry permissions. This limitation occurred because, by default, the MapReduceIndexerTool used the underlying collection's solrconfig.xml from ZooKeeper to build the index using its EmbeddedSolrServers. But the embedded servers are not properly configured to use Sentry, so this process failed.

With Search for CDH 5.5, the MapReduceIndexerTool uses a default solrconfig.xml that is appropriate for the vast majority of collection configurations. With this configuration, the MapReduceIndexerTool is able to index data, even if Sentry is enabled. Note that this default configuration does not include any updateRequestProcessorChains; if your configuration requires an updateRequestProcessorChain, you can tell the MapReduceIndexerTool to use the configuration from ZooKeeper by specifying --use-zk-solrconfig.xml or from local disk by specifying --solr-home-dir.

Bug: None.

Workaround: To address this issue, configure the MapReduceIndexerTool to run without Sentry restrictions. This does not compromise security because this only affects the "embedded" Solr Servers in the job that are used to build the offline index; Solr's Sentry permissions are still checked when the data is merged into the cluster via --go-live.

Here are two ways to enable indexing:

  1. If your environment uses the default configuration files, use solrconfig.xml for indexing jobs, rather than solrconfig.xml.secure. Use the --solr-home-dir option to specify the directory containing solrconfig.xml, causing the job to run with Sentry disabled.
  2. Alternatively, you can comment out the following line:
    <str name="update.chain">updateIndexAuthorization</str>

    This line must be commented out and the change saved in the solrconfig file used by the machine running the indexing job.