Fixed Issues in CDH 6.2.0

CDH 6.2.0 fixes the following issues:

Hue allows unsigned SAML assertions
Hue external users granted super user privileges in C6
Spark’s stage retry logic could result in duplicate data
Spark’s stage retry logic could result in missing data
Shuffle+Repartition on a DataFrame could lead to incorrect answers
Shuffle+Repartition on an RDD could lead to incorrect answers
Inconsistent rows returned from queries in Kudu
Timestamp type-casted to varchar in a binary predicate can produce incorrect result
XSS Cloudera Manager
CVE-2018-1296 Permissive Apache Hadoop HDFS listXAttr Authorization Exposes Extended Attribute Key/Value Pairs
Kafka JMX Tool Cannot Connect to JMX
Kafka Broker Fails to Start Due to Slow Sentry and HMS startup
Hadoop LdapGroupsMapping does not support LDAPS for self-signed LDAP server
Upstream Issues Fixed

Hue allows unsigned SAML assertions

If Hue receives an unsigned assertion, it continues to process it as valid. This means it is possible for an end-user to forge or remove the signature and manipulate a SAML assertion to gain access without a successful authentication.

Products affected: Hue, CDH

Releases affected:

CDH 5.15.x and earlier
CDH 5.16.0, 5.16.1
CDH 6.0.x
CDH 6.1.x

Users affected: All users who are using SAML with Hue.

CVE: CVE-2019-14775

Date/time of detection: January 2019

Detected by: Joel Snape

Severity (Low/Medium/High): High

Impact:

This is a significant security risk: because Hue processes an unsigned assertion as valid, anyone can forge or strip the signature from a SAML assertion and access Hue without a successful authentication, even if they should not have access.

Immediate action required:

Upgrade (recommended): Upgrade to a version of CDH containing the fix.
Workaround: None

Addressed in release/refresh/patch:

CDH 5.16.2
CDH 6.2.0
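
The failure mode described above can be illustrated with a small check that refuses any assertion lacking a signature element. This is only a sketch in Python and is not the Hue fix: the function name and the sample documents are hypothetical, and a real service provider must also cryptographically verify the signature, which this presence check does not do.

```python
import xml.etree.ElementTree as ET

SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"


def assertion_has_signature(assertion_xml: str) -> bool:
    """Return True if the assertion carries a ds:Signature child element.

    Hypothetical helper: it only detects a missing signature. It does NOT
    validate the signature itself.
    """
    root = ET.fromstring(assertion_xml)
    if root.tag != f"{{{SAML_NS}}}Assertion":
        raise ValueError("not a SAML 2.0 assertion")
    return root.find(f"{{{DSIG_NS}}}Signature") is not None


# Hand-written sample documents, for illustration only.
unsigned = (
    f'<saml:Assertion xmlns:saml="{SAML_NS}">'
    "<saml:Subject/></saml:Assertion>"
)
signed = (
    f'<saml:Assertion xmlns:saml="{SAML_NS}" xmlns:ds="{DSIG_NS}">'
    "<ds:Signature/><saml:Subject/></saml:Assertion>"
)

print(assertion_has_signature(unsigned))
print(assertion_has_signature(signed))
```

An affected release behaves as if the first case were acceptable; a fixed release must reject it.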

Hue external users granted super user privileges in C6

When Hue uses either the LdapBackend or the SAML2Backend authentication backend, users that are created automatically on their first login are granted superuser privileges in CDH 6. This does not apply to users that are created through the User Admin application in Hue.

Products affected: Hue

Releases affected: CDH 6.0.0, CDH 6.0.1, CDH 6.1.0

Users affected: All users

Date/time of detection: December 12, 2018

Severity (Low/Medium/High): Medium

Impact:

The superuser privilege is granted to any user that logs in to Hue when LDAP or SAML authentication is used. For example, if you have the create_users_on_login property set to true in the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, and you are using LDAP or SAML authentication, a user that logs in to Hue for the first time is created with superuser privileges and can perform the following actions:

When the SAML2Backend is used, Hue accounts that have superuser privileges can:

Create/Delete users and groups
Assign users to groups
Alter group permissions

However, when the SAML2Backend is used, users can only log in to Hue using SAML authentication.

When the LdapBackend is used, Hue accounts that have superuser privileges can:

Synchronize Hue users with your LDAP server
Create local users and groups (these local users can log in to Hue only if multi-backend authentication is set up with both LdapBackend and AllowFirstUserDjangoBackend)
Assign users to groups
Alter group permissions

This impact does not apply to the following other scenarios:

When users are synced with your LDAP server manually by using the User Admin page in Hue.
When you are using other authentication methods. For example:
- AllowFirstUserDjangoBackend
- Spnego
- PAM
- Oauth

When the LdapBackend and AllowFirstUserDjangoBackend are used, administrators should note:

Local users, including users created by unexpected superusers, can log in through AllowFirstUserDjangoBackend.
Local users in Hue that are created as hive, hdfs, or solr have privileges to access protected data and alter permissions in the security app.
Removing the AllowFirstUserDjangoBackend authentication backend prevents local users from logging in to Hue, but it requires the administrator to have Cloudera Manager access.

CVE: CVE-2019-7319

Immediate action required: Upgrade and follow the instructions below.

Addressed in release/refresh/patch: CDH 6.1.1 and CDH 6.2.0

After upgrading to 6.1.1 or later, you must run the following update statement in the Hue database:

UPDATE useradmin_userprofile SET `creation_method` = 'EXTERNAL' WHERE `creation_method` = 'CreationMethod.EXTERNAL';

Important: If the Hue database is using MySQL, before you run this UPDATE statement, check if safe mode is on by using the following query:

SELECT @@SQL_SAFE_UPDATES;

If safe mode is turned on, the query returns '1'. You can temporarily turn it off by using the following SET statement:

SET SQL_SAFE_UPDATES = 0;

After running the update statement, to re-enable safe mode:

SET SQL_SAFE_UPDATES = 1;

After executing the UPDATE statement, new Hue users are no longer automatically created as superusers.
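
To see what the UPDATE statement changes, the following Python sketch replays it against an in-memory SQLite table. The schema is a simplified stand-in for the real Hue table (only the table name, the column name, and the two EXTERNAL values come from the statement above; the `user_id` column and the 'HUE' value are assumptions), so treat it as an illustration rather than a procedure to run against a production database.

```python
import sqlite3

# In-memory stand-in for the Hue database; the schema is deliberately simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE useradmin_userprofile ("
    "user_id INTEGER PRIMARY KEY, creation_method TEXT)"
)
conn.executemany(
    "INSERT INTO useradmin_userprofile VALUES (?, ?)",
    [
        (1, "HUE"),                      # assumed value for a locally created user
        (2, "CreationMethod.EXTERNAL"),  # malformed value targeted by the UPDATE
        (3, "EXTERNAL"),                 # already in the expected form
    ],
)

# Same rewrite as the documented statement, without MySQL backtick quoting.
cursor = conn.execute(
    "UPDATE useradmin_userprofile SET creation_method = 'EXTERNAL' "
    "WHERE creation_method = 'CreationMethod.EXTERNAL'"
)
updated = cursor.rowcount

rows = conn.execute(
    "SELECT user_id, creation_method FROM useradmin_userprofile ORDER BY user_id"
).fetchall()
print(updated, rows)
```

Only the row holding the malformed 'CreationMethod.EXTERNAL' value is rewritten; other rows are left untouched.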

To list the current superusers, run the following SQL query:

SELECT username FROM auth_user WHERE is_superuser = 1;

Superuser privileges that users obtained because of this issue must be revoked manually by using the following steps:

Log in to the Hue UI as an administrator.
In the upper right corner of the page, click the user drop-down list and select Manage User.
In the User Admin page, make sure that the Users tab is selected and click the name of the user in the list that you want to edit.
In the Hue Users - Edit user page, click Step 3: Advanced.
Clear the checkbox for Superuser status.
At the bottom of the page, click Update user to save the change.

For the latest update on this issue see the corresponding Knowledge article:

TSB 2019-360: Hue external users granted super user privileges in C6

Spark’s stage retry logic could result in duplicate data

Apache Spark’s retry logic may allow tasks from both a failed output stage attempt and a successful retry attempt to commit output for the same partition.

Products affected: CDS Powered By Apache Spark

Affected versions:

CDS 2.1.0 release 1 and release 2
CDS 2.2.0 release 1 and release 2
CDS 2.3.0 release 2

Fixed versions:

CDH 6.2.0, 6.3.0
CDS 2.1.0 release 3
CDS 2.2.0 release 3
CDS 2.3.0 release 3

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-337-1: Spark’s stage retry logic could result in duplicate data

Spark’s stage retry logic could result in missing data

Apache Spark’s retry logic may allow a task from a failed stage attempt to clean up data from its corresponding task in a successful stage retry attempt.

Products affected: CDS Powered By Apache Spark

Affected versions:

CDS 2.2.0 release 1, release 2
CDS 2.3.0 release 1, release 2

Fixed versions:

CDH 6.2.0, 6.3.0
CDS 2.2.0 release 3
CDS 2.3.0 release 3

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-337-2: Spark’s stage retry logic could result in missing data

Shuffle+Repartition on a DataFrame could lead to incorrect answers

When a repartition follows a shuffle, the assignment of rows to partitions is nondeterministic. If Spark has to recompute a partition, for example, due to an executor failure, the retry can consume a different set of input rows than the original computation. As a result, some rows can be dropped, and others can be duplicated.
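
The effect can be reproduced in miniature without Spark. The sketch below is plain Python, not Spark code: it assumes a round-robin assignment of rows to partitions and hand-picks two different arrival orders to stand in for the nondeterministic shuffle output. One output partition is kept from the first attempt and the other is recomputed from the retry.

```python
def round_robin(rows, num_partitions):
    """Assign rows to partitions in arrival order (simplified model)."""
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        partitions[index % num_partitions].append(row)
    return partitions


# The same four rows arrive in a different order on each attempt,
# as can happen when the upstream shuffle is not deterministic.
first_attempt = round_robin(["a", "b", "c", "d"], 2)  # [['a', 'c'], ['b', 'd']]
retry_attempt = round_robin(["b", "a", "d", "c"], 2)  # [['b', 'd'], ['a', 'c']]

# Partition 0 survived from the first attempt; partition 1 was lost
# (for example, to an executor failure) and recomputed in the retry.
final_rows = first_attempt[0] + retry_attempt[1]

duplicated = sorted({r for r in final_rows if final_rows.count(r) > 1})
dropped = sorted(set("abcd") - set(final_rows))
print(final_rows, duplicated, dropped)
```

In this run rows a and c appear twice while b and d are missing, which is the dropped-and-duplicated pattern described above.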

Products affected: CDS Powered By Apache Spark

Affected versions:

CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1
CDS 2.1.0 release 1, release 2
CDS 2.2.0 release 1, release 2

Fixed versions:

CDH 6.2.0, 6.3.0
CDS 2.1.0 release 3
CDS 2.2.0 release 3
CDS 2.3.0 release 3

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-337-3: Shuffle+Repartition on a DataFrame could lead to incorrect answers

Shuffle+Repartition on an RDD could lead to incorrect answers

When a repartition follows a shuffle, the assignment of records to partitions is nondeterministic. If Spark has to recompute a partition, for example, due to an executor failure, the retry can consume a different set of input records than the original computation. As a result, some records can be dropped, and others can be duplicated.

Products affected: CDS Powered By Apache Spark

Affected versions:

CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1
CDS 2.1.0 release 1, release 2, release 3
CDS 2.2.0 release 1, release 2, release 3
CDS 2.3.0 release 1, release 2, release 3

Fixed versions:

CDH 6.2.0, 6.3.0
CDS 2.1.0 release 4
CDS 2.2.0 release 4
CDS 2.3.0 release 4

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-337-4: Shuffle+Repartition on an RDD could lead to incorrect answers

Inconsistent rows returned from queries in Kudu

Due to KUDU-2463, upon restarting Kudu, inconsistent rows may be returned from tables that have not recently been written to, resulting in any of the following:

multiple rows for the same key being returned
deleted data being returned
inconsistent results consistently being returned for the same query

If this happens, resolve the conflicts by writing to the affected Kudu partitions in one of two ways:

re-deleting the known and deleted data
upserting the most up-to-date version of affected rows.

Products affected: Apache Kudu

Affected version:

CDH 5.12.2, 5.13.3, 5.14.4, 5.15.1, 5.16.1
CDH 6.0.1, 6.1.0, 6.1.1

Fixed version:

CDH 5.16.2
CDH 6.2.0

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-353: Inconsistent rows returned from queries in Kudu

Timestamp type-casted to varchar in a binary predicate can produce incorrect result

In an Impala query, a timestamp can be cast to a varchar of smaller length to convert the timestamp value to a date string. However, if such a cast is used in a binary comparison against a string literal, the query can produce incorrect results because of a bug in the expression rewriting code. The following is an example of this:

> select * from (select cast('2018-12-11 09:59:37' as timestamp) as ts) tbl where cast(ts as varchar(10)) = '2018-12-11';

The output will have 0 rows.
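
For comparison, the intended semantics of the predicate can be sketched in Python. This only models the truncating cast (the variable names are illustrative and nothing here calls Impala): the timestamp is rendered as a string, cut to 10 characters, and compared with the literal.

```python
from datetime import datetime

# Same literal values as in the query above.
ts = datetime.strptime("2018-12-11 09:59:37", "%Y-%m-%d %H:%M:%S")

# Model of cast(ts as varchar(10)): render the timestamp, keep 10 characters.
as_varchar_10 = ts.strftime("%Y-%m-%d %H:%M:%S")[:10]

# The predicate holds, so a correct evaluation returns the row.
matches = as_varchar_10 == "2018-12-11"
print(as_varchar_10, matches)
```

Because the truncated value equals the literal, a correct result contains one row, not zero.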

Affected version:

CDH 5.15.0, 5.15.1, 5.15.2, 5.16.0, 5.16.1
CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1

Fixed versions:

CDH 5.16.2
CDH 6.2.0

For the latest update on this issue see the corresponding Knowledge article: TSB 2019-358: Timestamp type-casted to varchar in a binary predicate can produce incorrect result

XSS Cloudera Manager

Malicious Impala queries can result in Cross Site Scripting (XSS) when viewed in Cloudera Manager.

Products affected: Apache Impala

Releases affected:

Cloudera Manager 5.13.x, 5.14.x, 5.15.1, 5.15.2, 5.16.1
Cloudera Manager 6.0.0, 6.0.1, 6.1.0

Users affected: All Cloudera Manager Users

Date/time of detection: November 2018

Severity (Low/Medium/High): High

Impact: When a malicious user generates a piece of JavaScript in the impala-shell and then goes to the Queries tab of the Impala service in Cloudera Manager, that piece of JavaScript code gets evaluated, resulting in an XSS.

CVE: CVE-2019-14449

Immediate action required: There is no workaround. Upgrade to the latest available maintenance release.

Addressed in release/refresh/patch:

Cloudera Manager 5.16.2
Cloudera Manager 6.0.2, 6.1.1, 6.2.0, 6.3.0

CVE-2018-1296 Permissive Apache Hadoop HDFS listXAttr Authorization Exposes Extended Attribute Key/Value Pairs

HDFS exposes extended attribute key/value pairs during listXAttrs, verifying only path-level search access to the directory rather than path-level read permission to the referent.

Products affected: Apache HDFS

Releases affected:

CDH 5.4.0 - 5.15.1, 5.16.0
CDH 6.0.0, 6.0.1, 6.1.0

Users affected: Users who store sensitive data in extended attributes, such as users of HDFS encryption.

Date/time of detection: December 12, 2017

Detected by: Rushabh Shah, Yahoo! Inc., Hadoop committer

Severity (Low/Medium/High): Medium

Impact: HDFS exposes extended attribute key/value pairs during listXAttrs, verifying only path-level search access to the directory rather than path-level read permission to the referent. This affects features that store sensitive data in extended attributes.

CVE: CVE-2018-1296

Immediate action required:

Upgrade: Update to a version of CDH containing the fix.
Workaround: If a file contains sensitive data in extended attributes, users and admins need to change the permission to prevent others from listing the directory that contains the file.
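
One way to apply the workaround is with the `hdfs dfs -chmod` command. The Python sketch below only builds and prints such a command. The directory `/data/secure` is a hypothetical path, and removing all access for "other" users (`o-rwx`) is an assumption; whether group access must also be restricted depends on your environment.

```python
import shlex
from typing import List


def restrict_listing_command(directory: str) -> List[str]:
    """Build an hdfs command that removes all 'other' permissions on a directory.

    Hypothetical helper: it builds the argument list and does not execute it.
    """
    return ["hdfs", "dfs", "-chmod", "o-rwx", directory]


command = restrict_listing_command("/data/secure")  # hypothetical path
print(shlex.join(command))
```

Review the printed command and run it as a user who is allowed to change permissions on the directory.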

Addressed in release/refresh/patch:

CDH 5.15.2, 5.16.1
CDH 6.1.1, 6.2.0

Kafka JMX Tool Cannot Connect to JMX

The Kafka JMX tool cannot connect to the JMX agent of the Kafka Broker or MirrorMaker if the specified address of the JMX remote connector is bound to 127.0.0.1.

Workaround:

In Cloudera Manager go to Kafka > Instances and select the affected broker.
Find the Additional Broker Java Options and Additional MirrorMaker Java Options properties and add the following Java option to the configuration:
```
-Djava.rmi.server.hostname=127.0.0.1
```
Note: Configuring the Additional MirrorMaker Java Options property is only required if you are using JMX with MirrorMaker.
Restart the affected brokers.

Affected Versions: CDH 6.0.0 and higher

Fixed Versions: CDH 6.2.0

Cloudera Issue: OPSAPS-48695

Kafka Broker Fails to Start Due to Slow Sentry and HMS startup

This issue is encountered on cluster startup and is caused by misalignment between Kafka, Sentry, and HMS. The slow startup of HMS slows down Sentry startup which consequently makes the Kafka connection to Sentry time out. Ultimately, the Kafka broker will be unable to start.

Workaround: Manually increase the number of remote procedure call retries between Sentry and Kafka through the Sentry Client Advanced Configuration Snippet (Safety Valve) for sentry-site.xml property.

Go to Sentry > Configuration and find the Sentry Client Advanced Configuration Snippet (Safety Valve) for sentry-site.xml property.
Click on the add button.
Enter the following data:
- Name: sentry.service.client.rpc.retry-total
- Value: 20
Enter a Reason for change, and then click Save Changes to commit the changes.
Return to the Home page by clicking the Cloudera Manager logo.
Click the restart stale services icon next to the Sentry service to invoke the cluster restart wizard.
Click Restart Stale Services.
Click Restart Now.
Click Finish.
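
If you enter the safety valve value as XML rather than through the name and value fields, the entry is an ordinary Hadoop-style `<property>` element. The Python sketch below generates that element; the helper name is hypothetical, and only the property name and value come from the steps above.

```python
import xml.etree.ElementTree as ET


def safety_valve_property(name: str, value) -> str:
    """Render one Hadoop-style <property> element as a string (hypothetical helper)."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


snippet = safety_valve_property("sentry.service.client.rpc.retry-total", 20)
print(snippet)
```

The value 20 is the one given in the workaround; adjust it only if startup still times out.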

Affected Versions: CDH 6.1.0 and higher

Fixed Versions: CDH 6.2.0

Cloudera Issue: CDH-74713

Hadoop LdapGroupsMapping does not support LDAPS for self-signed LDAP server

Hadoop LdapGroupsMapping does not work with LDAP over SSL (LDAPS) if the LDAP server certificate is self-signed. This use case is currently not supported even if Hadoop User Group Mapping LDAP TLS/SSL Enabled, Hadoop User Group Mapping LDAP TLS/SSL Truststore, and Hadoop User Group Mapping LDAP TLS/SSL Truststore Password are filled properly.

Affected Versions: CDH 5.x and 6.0.x versions

Fixed Versions: CDH 6.1.0

Apache Issue: HADOOP-12862

Cloudera Issue: CDH-37926

Upstream Issues Fixed

The following upstream issues are fixed in CDH 6.2.0:

Apache Accumulo
Apache Avro
Apache Crunch
Flume
Hadoop
HBase
Hive
Hue
Impala
Kafka
Kudu
Oozie
Parquet
Apache Pig
Cloudera Search
Sentry
Spark
Sqoop
Zookeeper

Apache Accumulo

There are no notable fixed issues in this release.

Apache Avro

There are no notable fixed issues in this release.

Apache Crunch

There are no notable fixed issues in this release.

Apache Flume

The following issues are fixed in CDH 6.2.0:

FLUME-2050 - Upgrade to Log4j 2.10.0
FLUME-2071 - Flume Context doesn't support float or double configuration values.
FLUME-2464 - Remove hadoop and hbase profiles.
FLUME-2653 - Allow hdfs sink inUseSuffix to be empty
FLUME-2698 - Upgrade Jetty Version
FLUME-2786, FLUME-3056, FLUME-3117 - Application enters a deadlock when stopped while handleConfigurationEvent
FLUME-2799 - Kafka Source - Add message offset to headers
FLUME-2894 - Flume components should stop in the correct order
FLUME-2976 - Exception when JMS source tries to connect to a Weblogic server without authentication
FLUME-2988 - Kafka Sink metrics missing eventDrainAttemptCount
FLUME-2989 - Added 2 KafkaChannel metrics
FLUME-3046 - Kafka Sink and Source Configuration Improvements
FLUME-3087 - Change log level from WARN to INFO
FLUME-3101 - Add maxBatchCount config property to Taildir Source.
FLUME-3115 - Update netty library
FLUME-3133 - Add client IP / hostname headers to Syslog sources.
FLUME-3142 - Adding HBase2 sink
FLUME-3158 - Upgrade surefire version and config
FLUME-3183 - Maven: generate SHA-512 checksum during deploy
FLUME-3186 - Make asyncHbaseClient config parameters available from Flume config
FLUME-3194 - Upgrade derby to the latest version
FLUME-3201 - Fix SyslogUtil to handle RFC3164 format in December correctly
FLUME-3223 - Flume HDFS Sink should retry close prior recover lease
FLUME-3228 - Incorrect parameter name in timestamp interceptor docs
FLUME-3243 - hdfs.callTimeout default increased and deprecated
FLUME-3246 - Validate flume configuration to prevent larger source batchsize than
FLUME-3253 - Update jackson-databind dependency to the latest version
FLUME-3270 - Close JMS resources in JMSMessageConsumer constructor in
FLUME-3281 - Update to Kafka 2.0
FLUME-3282 - Use slf4j in every component
FLUME-3294 - Fix polling logic in TaildirSource
FLUME-3296 - Revert log4j 2 upgrade on 1.9 branch
FLUME-3298 - Make hadoop-common optional in hadoop-credential-store-config-filter
FLUME-3299 - Fix log4j scopes in pom files
FLUME-3302 - Fix issues discovered during the release
FLUME-3314 - Fixed NPE in Kafka source/channel during offset migration

Apache Hadoop

The following issues are fixed in CDH 6.2.0:

HADOOP-9567 - Provide auto-renewal for keytab based logins.
HADOOP-11100 - Support to configure ftpClient.setControlKeepAliveTimeout.
HADOOP-14314 - The OpenSolaris taxonomy link is dead in InterfaceClassification.md.
HADOOP-14970 - MiniHadoopClusterManager does not respect the lack of format option.
HADOOP-15214 - Make Hadoop compatible with Guava 21.0.
HADOOP-15813 - Enable a more reliable SSL connection reuse.
HADOOP-15823 - ABFS: Stop requiring client ID and tenant ID for MSI.
HADOOP-15832 - Upgrade BouncyCastle to 1.60.
HADOOP-15860 - ABFS: Throw IllegalArgumentException when a directory or a file name ends with a period.

HDFS

The following issues are fixed in CDH 6.2.0:

HDFS-12498 - JournalNodeSyncer is not started in a federated HA cluster.
HDFS-12579 - JournalNodeSyncer should use fromUrl field of EditLogManifestResponse to construct the servlet Url.
HDFS-12716 - The 'dfs.datanode.failed.volumes.tolerated' property to support minimum number of volumes that should be available.
HDFS-12886 - Ignore minReplication for block recovery.
HDFS-12946 - Add a tool to check the rack configuration against EC policies.
HDFS-13023 - JournalNodeSyncer does not work on a secure cluster.
HDFS-13626 - Fix incorrect username when the setOwner operation is denied.
HDFS-13744 - OIV tool should better handle control characters present in file or directory names.
HDFS-13761 - Add toString method to the AclFeature class.
HDFS-13818 - Extend OIV to detect FSImage corruption.
HDFS-13996 - Make HttpFS ACLs RegEx-configurable.
HDFS-14008 - NameNode should log the snapshotdiff report.
HDFS-14015 - Improve error handling in hdfsThreadDestructor in the native thread local storage.
HDFS-14027 - DFSStripedOutputStream should implement both the hsync methods.
HDFS-14028 - The HDFS OIV temporary directory deletes a folder.
HDFS-14053 - Provide ability for NameNode to re-replicate based on topology changes.
HDFS-14061 - Check if the cluster topology supports the EC policy before setting, enabling, or adding it.
HDFS-14125 - Use a parameterized log format in ECTopologyVerifier.
HDFS-14140 - JournalNodeSyncer authentication is failing in a secure cluster.
HDFS-14188 - Make the hdfs ec -verifyClusterSetup command accept an EC policy as a parameter.
HDFS-14231 - DataXceiver#run() should not log exceptions caused by InvalidTokenException as an error.

MapReduce 2

The following issues are fixed in CDH 6.2.0:

MAPREDUCE-4669 - MRAM web UI does not work with HTTPS.
MAPREDUCE-7125 - JobResourceUploader creates LocalFileSystem when it's not necessary.

YARN

The following issues are fixed in CDH 6.2.0:

YARN-7396 - NPE when accessing container logs due to null dirsHandler
YARN-8582 - Document YARN support for HTTPS in AM Web server.
YARN-8865 - RMStateStore contains large number of expired RMDelegationToken
YARN-8899 - Fixed minicluster dependency on yarn-server-web-proxy.
YARN-8908 - Fix errors in yarn-default.xml related to GPU/FPGA.
YARN-9087 - Improve logging for initialization of Resource plugins.
YARN-9095 - Removed Unused field from Resource: NUM_MANDATORY_RESOURCES
YARN-9213 - RM Web UI v1 does not show custom resource allocations for containers page
YARN-9318 - Resources#multiplyAndRoundUp does not consider Resource Types
YARN-9322 - Store metrics for custom resource types into FSQueueMetrics and query them in FairSchedulerQueueInfo
YARN-9323 - FSLeafQueue#computeMaxAMResource does not override zero values for custom resources

Apache HBase

The following issues are fixed in CDH 6.2.0:

HBASE-17356 - Add replica get support
HBASE-18735 - Provide an option to kill a MiniHBaseCluster without waiting on shutdown
HBASE-19695 - Handle disabled table for async client
HBASE-19722 - Meta query statistics metrics source
HBASE-20220 - [RSGroup] Check if table exists in the cluster before moving it to the specified regionserver group
HBASE-20604 - ProtobufLogReader#readNext can incorrectly loop to the same position in the stream until the WAL is rolled
HBASE-20917 - MetaTableMetrics#stop references uninitialized requestsMap for non-meta region
HBASE-21178 - [BC break] : Get and Scan operation with a custom converter_class not working
HBASE-21215 - Figure how to invoke hbck2; make it easy to find
HBASE-21247 - Custom Meta WAL Provider doesn't default to custom WAL Provider whose configuration value is outside the enums in Providers
HBASE-21281 - Upgrade bouncycastle to latest
HBASE-21282 - Upgrade to latest jetty 9.2 and 9.3 versions
HBASE-21297 - ModifyTableProcedure can throw TNDE instead of IOE in case of REGION_REPLICATION change
HBASE-21300 - Fix the wrong reference file path when restoring snapshots for tables with MOB columns
HBASE-21314 - The implementation of BitSetNode is not efficient
HBASE-21321 - HBASE-21278 to branch-2.1 and branch-2.0
HBASE-21322 - Add a scheduleServerCrashProcedure() API to HbckService
HBASE-21336 - Simplify the implementation of WALProcedureMap
HBASE-21338 - Warn if balancer is an ill-fit for cluster size
HBASE-21342 - FileSystem in use may get closed by other bulk load call in secure bulkLoad
HBASE-21345 - [hbck2] Allow version check to proceed even though master is 'initializing'.
HBASE-21349 - Do not run CatalogJanitor or Normalizer when cluster is shutting down
HBASE-21354 - Procedure may be deleted improperly during master restarts resulting in 'Corrupt'
HBASE-21355 - HStore's storeSize is calculated repeatedly which causing the confusing region split
HBASE-21356 - bulkLoadHFile API should ensure that rs has the source hfile's write permissions
HBASE-21363 - Rewrite the buildingHoldCleanupTracker method in WALProcedureStore
HBASE-21364 - Procedure holds the lock should put to front of the queue after restart
HBASE-21371 - Hbase unable to compile against Hadoop trunk (3.3.0-SNAPSHOT) due to license error
HBASE-21372 - Set hbase.assignment.maximum.attempts to Long.MAX
HBASE-21375 - Revisit the lock and queue implementation in MasterProcedureScheduler
HBASE-21377 - Add debug log for procedure stack id related operations
HBASE-21384 - Procedure with holdlock=false should not be restored lock when restarts
HBASE-21385 - HTable.delete request use rpc call directly instead of AsyncProcess
HBASE-21387 - Race condition surrounding in progress snapshot handling in snapshot cache leads to loss of snapshot files
HBASE-21388 - No need to instantiate MemStoreLAB for master which not carry table
HBASE-21391 - RefreshPeerProcedure should also wait master initialized before executing
HBASE-21395 - Abort split/merge procedure if there is a table procedure of the same table going on
HBASE-21401 - Sanity check when constructing the KeyValue
HBASE-21407 - Resolve NPE in backup Master UI
HBASE-21410 - A helper page that help find all problematic regions and procedures
HBASE-21413 - Empty meta log doesn't get split when restart whole cluster
HBASE-21421 - Do not kill RS if reportOnlineRegions fails
HBASE-21423 - Procedures for meta table/region should be able to execute in separate workers
HBASE-21437 - Bypassed procedure throw IllegalArgumentException when its state is WAITING_TIMEOUT
HBASE-21439 - RegionLoads aren't being used in RegionLoad cost functions
HBASE-21440 - Assign procedure on the crashed server is not properly interrupted
HBASE-21445 - CopyTable by bulkload will write hfile into yarn's HDFS
HBASE-21468 - separate workers for meta table is not working
HBASE-21473 - RowIndexSeekerV1 may return cell with extra two \x00\x00 bytes which has no tags
HBASE-21480 - Taking snapshot when RS crashes prevent we bring the regions online
HBASE-21485 - Add more debug logs for remote procedure execution
HBASE-21490 - WALProcedure may remove proc wal files still with active procedures
HBASE-21492 - CellCodec Written To WAL Before It's Verified
HBASE-21498 - Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache
HBASE-21511 - Remove in progress snapshot check in SnapshotFileCache#getUnreferencedFiles
HBASE-21524 - Fix logging in ConnectionImplementation.isTableAvailable()
HBASE-21545 - NEW_VERSION_BEHAVIOR breaks Get/Scan with specified columns
HBASE-21551 - Memory leak when use scan with STREAM at server side
HBASE-21554 - Show replication endpoint classname for replication peer on master web UI
HBASE-21567 - Allow overriding configs starting up the shell
HBASE-21568 - Use CacheConfig.DISABLED where we don't expect to have blockcache running
HBASE-21570 - Add write buffer periodic flush support for AsyncBufferedMutator
HBASE-21580 - Support getting Hbck instance from AsyncConnection
HBASE-21582 - If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time
HBASE-21590 - Optimize trySkipToNextColumn in StoreScanner a bit.
HBASE-21592 - quota.addGetResult(r) throw NPE
HBASE-21610 - numOpenConnections metric is set to -1 when zero server channel exist
HBASE-21620 - Problem in scan query when using more than one column prefix filter in some cases
HBASE-21629 - draining_servers.rb is broken
HBASE-21630 - [shell] Define ENDKEY == STOPROW
HBASE-21631 - list_quotas should print human readable values for LIMIT
HBASE-21639 - maxHeapUsage value not read properly from config during EntryBuffers initialization
HBASE-21645 - Perform sanity check and disallow table creation/modification with region replication < 1
HBASE-21662 - Add append_peer_exclude_namespaces and remove_peer_exclude_namespaces shell commands
HBASE-21663 - Add replica scan support
HBASE-21682 - Support getting from specific replica
HBASE-21694 - Add append_peer_exclude_tableCFs and remove_peer_exclude_tableCFs shell commands
HBASE-21704 - The implementation of DistributedHBaseCluster.getServerHoldingRegion is incorrect
HBASE-21705 - Should treat meta table specially for some methods in AsyncAdmin
HBASE-21712 - Make submit-patch.py python3 compatible
HBASE-21732 - Should call toUpperCase before using Enum.valueOf in some methods for ColumnFamilyDescriptor
HBASE-21738 - Remove all the CLSM#size operation in our memstore because it's quite time consuming.
HBASE-21746 - Fix two concern cases in RegionMover
HBASE-21843 - RegionGroupingProvider breaks the meta wal file name pattern which may cause data loss for meta region
HBASE-21862 - IPCUtil.wrapException should keep the original exception types for all the connection exceptions
HBASE-21915 - Make FileLinkInputStream implement CanUnbuffer
HBASE-21960 - Ensure RESTServletContainer used by RESTServer

Apache Hive

The following issues are fixed in CDH 6.2.0:

Code Changes Might Be Required

The following fixes might require code changes for the CDH 6.2.0 release of Apache Hive:

Code Changes Should Not Be Required

The following fixes should not require code changes, but they contain improvements that might enhance your deployment:

HIVE-15884 - Optimize not between for vectorization
HIVE-16839 - Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently
HIVE-18238 - Driver execution may not have configuration changing side-effects
HIVE-18652 - Expose remoteBytesReadToDisk via HoS
HIVE-19564 - Vectorization: Fix NULL / Wrong Results issues in Arithmetic
HIVE-20306 - Implement projection spec for fetching only requested fields from partitions
HIVE-20307 - Add support for filterspec to the getPartitions with projection API
HIVE-20330 - HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
HIVE-20331 - Query with union all, lateral view and Join fails with "cannot find parent in the child operator"
HIVE-20484 - Disable Block Cache By Default With HBase SerDe
HIVE-20535 - Add new configuration to set the size of the global compile lock
HIVE-20661 - Dynamic partitions loading calls add partition for every partition 1-by-1
HIVE-20722 - Switch HS2 CompileLock to use fair locks
HIVE-20737 - Local SparkContext is shared between user sessions and should be closed only when there is no active
HIVE-20776 - HMS filterHooks on server-side in addition to client-side
HIVE-20796 - jdbc URL can contain sensitive information that should not be logged
HIVE-20818 - Views created with a WHERE subquery will regard views referenced in the subquery as direct input
HIVE-20843 - Properly detect RELY constraint in primary keys and foreign keys
HIVE-20914 - MRScratchDir permission denied when "hive.server2.enable.doAs", "hive.exec.submitviachild" are set to "true" and impersonated/proxy user is used
HIVE-20924 - Property 'hive.driver.parallel.compilation.global.limit' should be immutable at runtime
HIVE-20992 - Split the config "hive.metastore.dbaccess.ssl.properties" into more meaningful configs
HIVE-21015 - HCatLoader can't provide statistics for tables not in default DB
HIVE-21028 - get_table_meta should use a fetch plan to avoid race conditions ending up in NucleusObjectNotFoundException
HIVE-21030 - Add credential store env properties redaction in JobConf
HIVE-21035 - Race condition in SparkUtilities#getSparkSession
HIVE-21044 - Add SLF4J reporter to the metastore metrics systems
HIVE-21035 - Add HMS total api count stats and connection pool stats to metrics
HIVE-21077 - Database and Catalogs should have creation time
HIVE-21083 - Remove the requirement to specify the truststore location when TLS to the database is turned on
HIVE-21116 - HADOOP_CREDSTORE_PASSWORD is not populated under yarn.app.mapreduce.am.admin.user.env
HIVE-21320 - get_fields() and get_tables_by_type() are not protected by HMS server access control

Hue

The following issues are fixed in CDH 6.2.0:

HUE-7128 - [core] Apply config ENABLE_DOWNLOAD to search dashboard download
HUE-7258 - [jb] Add config check for Spark history server URL
HUE-7919 - oozie error 'NoneType' object has no attribute 'is_superuser'
HUE-8140 - [editor] Stabilize multi-statement execution
HUE-8330 - [core] Multi cluster support of namespaces and compute
HUE-8564 - Avro viewer for File Browser
HUE-8577 - [autocomplete] Update Hive and Impala autocompleter to the latest version
HUE-8584 - [useradmin] Exposing errors for Add Sync Ldap Group
HUE-8585 - [useradmin] Exposing errors for Add Sync Ldap Users
HUE-8587 - Enable queries in Job Browser to work with Smart Connection Pool
HUE-8598 - [autocomplete] Improve autocomplete for CREATE statements
HUE-8605 - [metadata] Only show the Table Privilege tab when Sentry is enabled
HUE-8610 - [tb] The sample call from the Table Browser fails for computes other than default
HUE-8616 - [cluster] getNamespaces for impala returns namespace with hive compute
HUE-8617 - [frontend] Support multi cluster in invalidate metadata
HUE-8638 - [importer] Add autocompletion to query editor in second step of importer
HUE-8641 - [frontend] Trigger a namespace refresh when the context catalog is cleared
HUE-8645 - [assist] Improve namespace listing after cluster creation
HUE-8648 - [importer] Sqoop-configured RDBMS fails
HUE-8649 - [frontend] Add a performance graph component
HUE-8651 - [editor] Add a dedicated execution analysis tab in the editor
HUE-8657 - [frontend] Improve create and configure cluster forms
HUE-8659 - [importer] Fix js exception with the field editor
HUE-8661 - [assist] Enable scrollbars in context popover view sql
HUE-8664 - [importer] Fixed Flume source import properties initialization
HUE-8665 - [editor] Add basic execution analysis for Impala
HUE-8666 - [autocomplete] Fix timing issue with "... ? from table" completion
HUE-8667 - [autocomplete] Fix issue where order by and group by suggestions aren't displayed properly
HUE-8668 - [editor] Add table names to syntax checker suggestions
HUE-8670 - [cluster] Adding auto resize option to the update cluster API
HUE-8679 - [jb] Support query interface in multi cluster node
HUE-8680 - [core] Fill in Impalad WEBUI username passwords automatically if needed
HUE-8681 - [assist] Include unopened topics in the language ref filter
HUE-8682 - [backend] Change PAM lib to python-pam-1.8.4
HUE-8685 - [importer] DB importer always shows DB already exists
HUE-8688 - Update Chinese language code to enable localization
HUE-8690 - Fix Hue allows unsigned SAML assertions
HUE-8691 - [useradmin] Add/sync group does not add users if the objectClass posixGroup already exists in the group LDAP entry
HUE-8692 - [useradmin] Group sync fails if all group members are not found
HUE-8693 - [useradmin] Security app only displays 100 users in the impersonate list
HUE-8694 - [frontend] Fix scroll in the database drop-down menu
HUE-8695 - [importer] Do not show the command but submit when clicking on submit button

Apache Impala

The following issues are fixed in CDH 6.2.0:

IMPALA-341 - Remote profiles are no longer ignored by the coordinator for the queries with the LIMIT clause.
IMPALA-941 - Impala supports fully qualified table names that start with a number.
IMPALA-1048 - The query execution summary now includes the total time taken and memory consumed by the data sink at the root of each query fragment.
IMPALA-3323 - Fixed the issue where valid impala-shell options, such as --ldap_password_cmd, were unrecognized when the --config_file option was specified.
IMPALA-5397 - If a query has a dedicated coordinator, its end time is now set when the query releases its admission control resources. With no dedicated coordinator, the end time is set on un-registration.
IMPALA-5474 - Fixed an issue where adding a trivial subquery to a query with an error turns the error into a warning.
IMPALA-6521 - When set, experimental flags are now shown in /varz in web UI and log files.
IMPALA-6900 - INVALIDATE METADATA operation is no longer ignored when HMS is empty.
IMPALA-7446 - Impala enables buffer pool garbage collection when near process memory limit to prevent queries from spilling to disk earlier than necessary.
IMPALA-7659 - In COMPUTE STATS, Impala counts the number of NULL values in a table.
IMPALA-7857 - Logs more information about StateStore failure detection.
IMPALA-7928 - To increase the efficiency of the HDFS file handle cache, remote reads for a particular file are scheduled to a consistent set of executor nodes.
IMPALA-7929 - Fixed an issue where Impala queries on tables created via Hive and mapped to HBase failed with an internal exception because the qualifier of the HBase key column is null in the mapped table. Impala now allows a NULL qualifier.
IMPALA-7960 - Impala now returns a correct result when comparing TIMESTAMP to a string literal in a binary predicate where the TIMESTAMP is cast to a VARCHAR of smaller length.
IMPALA-7961 - Fixed an issue where queries running with the SYNC_DDL query option can fail when the Catalog Server is under a heavy load with concurrent catalog operations of long-running DDLs.
IMPALA-8026 - Impala query profile now reports correct row counts for all nested loop join modes.
IMPALA-8061 - Impala correctly initializes S3_ACCESS_VALIDATED variable to zero when TARGET_FILESYSTEM=3.
IMPALA-8154 - Disabled the Kerberos auth_to_local setting to prevent connection issues between impalads.
IMPALA-8188 - Impala now correctly detects an NVME device name and handles it.
IMPALA-8245 - Added hostname to the timeout error message to enable the user to easily identify the host which has reached a bad connection state with the HDFS NameNode.
IMPALA-8254 - Fixed an issue where COMPUTE STATS failed if COMPRESSION_CODEC is set.
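
To illustrate the comparison that IMPALA-7960 concerns, the following is a minimal Python sketch of the intended semantics, not Impala's implementation. It assumes that casting a TIMESTAMP to VARCHAR(n) yields the first n characters of its "YYYY-MM-DD HH:MM:SS" text form; the function names are hypothetical.

```python
from datetime import datetime


def cast_timestamp_to_varchar(ts: datetime, length: int) -> str:
    """Model CAST(ts AS VARCHAR(length)): render the timestamp as text,
    then keep only the first `length` characters (assumed semantics)."""
    return ts.strftime("%Y-%m-%d %H:%M:%S")[:length]


def timestamp_equals_literal(ts: datetime, length: int, literal: str) -> bool:
    """Model the binary predicate CAST(ts AS VARCHAR(length)) = literal.
    The cast must be applied before comparing; skipping it changes the answer."""
    return cast_timestamp_to_varchar(ts, length) == literal


ts = datetime(2018, 12, 11, 12, 34, 56)
# With the cast applied, only the date part takes part in the comparison.
print(timestamp_equals_literal(ts, 10, "2018-12-11"))
# Comparing the full text form against the same literal does not match.
print(ts.strftime("%Y-%m-%d %H:%M:%S") == "2018-12-11")
```

The two print calls differ only in whether the truncating cast is applied, which is the distinction a query result depends on in this kind of predicate.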

Apache Kafka

The following issues are fixed in CDH 6.2.0:

KAFKA-3514 - Stream timestamp computation needs some further thoughts.
KAFKA-4932 - Add support for UUID serialization and deserialization
KAFKA-5690 - Add support to list ACLs for a given principal
KAFKA-5975 - No response when deleting topics and delete.topic.enable=false
KAFKA-6082 - Fence zookeeper updates with controller epoch zkVersion
KAFKA-6123 - Give client MetricsReporter auto-generated client.id
KAFKA-6195 - Resolve DNS aliases in bootstrap.server (KIP-235)
KAFKA-6684 - Support casting Connect values with bytes schema to string
KAFKA-6753 - Updating the OfflinePartitions count only when necessary
KAFKA-6835 - Enable topic unclean leader election to be enabled without controller change
KAFKA-6863 - Kafka clients should try to use multiple DNS resolved IP
KAFKA-6914 - Set parent classloader of DelegatingClassLoader same as the worker's
KAFKA-6923 - Refactor Serializer/Deserializer for KIP-336
KAFKA-6926 - Simplified some logic to eliminate some suppressions of NPath complexity checks
KAFKA-6950 - Delay response to failed client authentication to prevent potential DoS issues (KIP-306)
KAFKA-6998 - Disable Caching when max.cache.bytes are zero.
KAFKA-7080, KAFKA-7222 - Cleanup overlapping KIP changes
KAFKA-7096 - Clear buffered data for partitions that are explicitly unassigned by user
KAFKA-7117 - Support AdminClient API in AclCommand (KIP-332)
KAFKA-7134 - KafkaLog4jAppender exception handling with ignoreExceptions
KAFKA-7139 - Support option to exclude the internal topics in kafka-topics.sh
KAFKA-7196 - Remove heartbeat delayed operation for those removed consumers at the end of each rebalance
KAFKA-7211 - MM should handle TimeoutException in commitSync
KAFKA-7215 - Improve LogCleaner Error Handling
KAFKA-7223 - In-Memory Suppression Buffering
KAFKA-7240 - -total metrics in Streams are incorrect
KAFKA-7277 - Migrate Streams API to Duration instead of longMs times
KAFKA-7299 - Batch LeaderAndIsr requests for AutoLeaderRebalance
KAFKA-7311 - Reset next batch expiry time on each poll loop
KAFKA-7313 - StopReplicaRequest should attempt to remove future replica for the partition only if future replica exists
KAFKA-7324 - NPE due to lack of SASLExtensions in SASL/OAUTHBEARER
KAFKA-7326 - KStream.print() should flush on each line for PrintStream
KAFKA-7332 - Update CORRUPT_MESSAGE exception message description
KAFKA-7333 - Protocol changes for KIP-320
KAFKA-7338 - Specify AES128 default encryption type for Kerberos tests
KAFKA-7366 - Make topic configs segment.bytes and segment.ms take effect immediately
KAFKA-7379 - [streams] send.buffer.bytes should be allowed to set -1 in KafkaStreams
KAFKA-7394 - OffsetsForLeaderEpoch supports topic describe access
KAFKA-7395 - Add fencing to replication protocol (KIP-320)
KAFKA-7396 - Materialized, Serialized, Joined, Consumed and Produced with implicit Serdes
KAFKA-7399 - KIP-366, Make FunctionConversions deprecated
KAFKA-7400 - Compacted topic segments that precede the log start offse...
KAFKA-7403 - Use default timestamp if no expire timestamp set in offset commit value
KAFKA-7406 - Name join group repartition topics
KAFKA-7409 - Validate message format version before creating topics or altering configs
KAFKA-7415 - Persist leader epoch and start offset on becoming a leader
KAFKA-7428 - ConnectionStressSpec: add "action", allow multiple clients
KAFKA-7429 - Enable key/truststore update with same filename/password
KAFKA-7437 - Persist leader epoch in offset commit metadata
KAFKA-7439 - Replace EasyMock and PowerMock with Mockito in clients module
KAFKA-7441 - Allow LogCleanerManager.resumeCleaning() to be used concurrently
KAFKA-7456 - Serde Inheritance in DSL
KAFKA-7462 - Make token optional for OAuthBearerLoginModule
KAFKA-7464 - catch exceptions in "leaderEndpoint.close()" when shutting down ReplicaFetcherThread
KAFKA-7467 - NoSuchElementException is raised because controlBatch is empty
KAFKA-7475 - capture remote address on connection authentication errors, and log it
KAFKA-7476 - Fix Date-based types in SchemaProjector
KAFKA-7477 - Improve Streams close timeout semantics
KAFKA-7481 - Add upgrade/downgrade notes for 2.1.x
KAFKA-7482 - LeaderAndIsrRequest should be sent to the shutting down broker
KAFKA-7483 - Allow streams to pass headers through Serializer.
KAFKA-7496 - Handle invalid filters gracefully in KafkaAdminClient#describeAcls
KAFKA-7498 - Remove references from `common.requests` to `clients`
KAFKA-7501 - Fix producer batch double deallocation when receiving message too large error on expired batch
KAFKA-7505 - Process incoming bytes on write error to report SSL failures
KAFKA-7519 - Clear pending transaction state when expiration fails
KAFKA-7532 - Clean-up controller log when shutting down brokers
KAFKA-7534 - Error in flush calling close may prevent underlying store from closing
KAFKA-7535 - KafkaConsumer doesn't report records-lag if isolation.level is read_committed
KAFKA-7560 - PushHttpMetricsReporter should not convert metric value to double
KAFKA-7742 - Fixed removing hmac entry for a token being removed from DelegationTokenCache

Apache Kudu

The following issues are fixed in CDH 6.2.0:

The Kudu Python client now detects and reports on conflicting/incorrect initialization of the OpenSSL library to avoid glitches and undefined behavior.
KUDU-1678 - Fixed a crash caused by a race condition between altering tablet schemas and deleting tablet replicas.
KUDU-2680 - Now the kudu fs update_dirs tool can correctly remove directories in the presence of tablet tombstones.
KUDU-2195 - Now you can use the --cmeta_force_fsync flag to fsync Kudu’s consensus metadata more aggressively. Setting this to true may decrease Kudu’s performance, but will improve its durability in the face of power failures and forced shutdowns.
KUDU-2684 - Fixed an issue that would cause an excessive amount of RPC traffic from Kudu masters if the tablet servers were configured with duplicated master addresses.
KUDU-2688 - Fixed an issue that would cause the kudu cluster rebalance tool to run indefinitely in the case of tables with a replication factor of 2.
KUDU-2690 - Fixed an issue that could lead to a failure to bootstrap tablet replicas that were a part of workloads with many alter table operations.
KUDU-2710 - Fixed an issue with the Java scanner’s keepAlive that could lead to a permanent hang in the scanner.
KUDU-2706 - Fixed an issue that would cause undefined behavior upon connecting to a secure cluster concurrently from multiple C++ clients.
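
The durability trade-off behind the --cmeta_force_fsync flag (KUDU-2195) can be sketched in general terms. The Python example below is not Kudu's code. It is an illustrative, POSIX-only model of writing a small metadata file with and without forced fsync, and the function name write_metadata and the file name cmeta are hypothetical.

```python
import os
import tempfile


def write_metadata(path: str, payload: bytes, force_fsync: bool) -> None:
    """Atomically replace `path` with `payload`.

    With force_fsync=True the data and the rename are flushed to stable
    storage before returning, which costs extra I/O on every update.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            if force_fsync:
                # Push file contents from the OS cache to the device.
                os.fsync(tmp.fileno())
        # Atomic rename: readers see either the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    if force_fsync:
        # Also sync the directory entry so the rename survives power loss.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "cmeta")
write_metadata(target, b"term=1", force_fsync=True)
```

Without the fsync calls, an update can sit in the operating system's cache and be lost on a power failure; with them, each update waits for the device, which is the performance cost the release note mentions.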

Apache Oozie

The following issues are fixed in CDH 6.2.0:

OOZIE-1393 - Allow sending emails via TLS
OOZIE-2211 - Remove OozieCLI#validateCommandV41
OOZIE-2339 - [fluent-job] Minimum Viable Fluent Job API
OOZIE-2352 - Unportable shebang in shell scripts
OOZIE-2494 - Cron syntax not handling DST properly
OOZIE-2684 - Bad database schema error for WF_ACTIONS table
OOZIE-2718 - Improve -dryrun for bundles
OOZIE-2791 - ShareLib installation may fail on busy Hadoop clusters
OOZIE-2826 - Upgrade joda-time to 2.9.9
OOZIE-2829 - Improve sharelib upload to accept multiple source folders
OOZIE-2937 - Remove redundant groupId from the child POMs
OOZIE-2942 - [examples] Fix Findbugs warnings
OOZIE-2949 - Escape quotes whitespaces in Sqoop <command> field
OOZIE-3109 - [log-streaming] Escape HTML-specific characters
OOZIE-3134 - Potential inconsistency between the in-memory SLA map and the Oozie database
OOZIE-3155 - [ui] Job DAG is not refreshed when a job is finished
OOZIE-3156 - Retry SSH action check when cannot connect to remote host
OOZIE-3160 - PriorityDelayQueue put()/take() can cause significant CPU load due to busy waiting
OOZIE-3178 - /bin/mkdistro.sh -Papache-release fails due to javadoc errors
OOZIE-3185 - Upgrade org.apache.derby to 10.11.1.1
OOZIE-3193 - Applications are not killed when submitted via subworkflow
OOZIE-3208 - "It should never happen" error messages should be more specific to root cause
OOZIE-3209 - XML schema error when submitting pyspark example
OOZIE-3210 - [build] Revision information is empty
OOZIE-3219 - Cannot compile with hadoop 3.1.0
OOZIE-3224 - Upgrade Jetty to 9.3
OOZIE-3227 - Eliminate duplicate dependencies when using Hadoop 3 DistributedCache
OOZIE-3229 - [client] [ui] Improved SLA filtering options
OOZIE-3233 - Remove DST shift from the coordinator job's end time
OOZIE-3235 - Upgrade ActiveMQ to 5.15.3
OOZIE-3260 - [sla] Remove stale item above max retries on JPA related errors from in-memory SLA map
OOZIE-3278 - Oozie fails to start with Hadoop 2.6.0
OOZIE-3297 - Retry logic does not handle the exception from BulkJPAExecutor properly
OOZIE-3298 - [MapReduce action] External ID is not filled properly and failing MR job is treated as SUCCEEDED
OOZIE-3303 - Oozie UI does not work after Jetty 9.3 upgrade
OOZIE-3304 - Parsing sharelib timestamps is not threadsafe
OOZIE-3307 - [core] Limit heap usage of LauncherAM
OOZIE-3309 - Runtime error during /v2/sla filtering for bundle
OOZIE-3310 - SQL error during /v2/sla filtering
OOZIE-3330 - [spark-action] Remove double quotes inside plain option values
OOZIE-3331 - [spark-action] Inconsistency while parsing quoted Spark options
OOZIE-3334 - Don't use org.apache.hadoop.hbase.security.User in HDFSCredentials
OOZIE-3340 - [fluent-job] Create error handler ACTION only if needed
OOZIE-3348 - [Hive action] Remove dependency hive-contrib
OOZIE-3354 - [core] [SSH action] SSH action gets hung
OOZIE-3369 - [core] Upgrade guru.nidi:graphviz-java to 0.7.0
OOZIE-3370 - Property filtering is not consistent across job submission
OOZIE-3389 - Getting input dependency list on the UI throws NPE
OOZIE-3390 - [Shell action] STDERR contains a bogus error message
OOZIE-3400 - [core] Fix PurgeService sub-sub-workflow checking

Apache Parquet

The following issues are fixed in CDH 6.2.0:

PARQUET-196 - parquet-tools command for row count & size
PARQUET-852 - Slowly ramp up sizes of byte in ByteBasedBitPackingEncoder
PARQUET-969 - Decimal datatype support for parquet-tools output
PARQUET-1336 - PrimitiveComparator should implement Serializable
PARQUET-1407 - Avro: Fix binary values returned from dictionary encoding
PARQUET-1421 - InternalParquetRecordWriter logs debug messages at the INFO level
PARQUET-1440 - Parquet-tools: Parse int32 or int64 decimal values to big decimals with the proper scale
PARQUET-1472 - Parquet-tools: Parse int32 or int64 decimal values to big decimals with the proper scale
PARQUET-1475 - Fix lack of cause propagation in DirectCodecFactory.ParquetCompressionCodecException
PARQUET-1510 - Fix notEq for optional columns with null values
PARQUET-1527 - [parquet-tools] cat command throws java.lang.ClassCastException

Apache Pig

There are no notable fixed issues in this release.

Cloudera Search

The following issues are fixed in CDH 6.2.0:

SOLR-2834 - Handle CharacterFilters in Solr
SOLR-8207 - Collections with underscores in name no longer cause a crash in the Cloud->Nodes UI
SOLR-8207 - Add "Nodes" view to the Admin UI "Cloud" tab, listing nodes and key metrics
SOLR-8207 - Nodes view support for shard_1_1_1 format and replica1, replica_1 format. Show core state in label if not 'active'
SOLR-12570 - OpenNLPExtractNamedEntitiesUpdateProcessor cannot support multi fields because pattern replacement doesn't work correctly
SOLR-12597 - Migrate API should fail requests that do not specify split.key parameter
SOLR-12649 - CloudSolrClient retries requests unnecessarily on exception from server
SOLR-12670 - RecoveryStrategy logs wrong wait time when retrying recovery
SOLR-12679 - MiniSolrCloudCluster.stopJettySolrRunner should remove jetty from the internal list
SOLR-12679 - MiniSolrCloudCluster.startJettySolrRunner method should not add a duplicate jetty instance to the list
SOLR-12770 - Make it possible to configure a host whitelist for distributed search
SOLR-12776 - Setting of TMP in solr.cmd causes invisibility of Solr to JDK tools

Apache Sentry

The following issues are fixed in CDH 6.2.0:

SENTRY-1797 - SentryKerberosContext should use periodic executor instead of managing periodic execution via run() method.
SENTRY-2329 - Integrate sentry with Hadoop 3.1.1
SENTRY-2372 - SentryStore should not implement grantOptionCheck
SENTRY-2428 - Skip null partitions or partitions with null sds entries
SENTRY-2437 - When granting privileges a single transaction per grant causes long delays
SENTRY-2441 - When MAuthzPathsMapping is deleted all associated MPaths should be deleted automatically.
SENTRY-2477 - When requesting for deltas check if nn seq num is 1 more than latest sequence num
SENTRY-2488 - Add privilege cache to sentry hive bindings in DefaultAccessValidator
SENTRY-2490 - When building a full perm update for each object we only build 1 privilege per role
SENTRY-2492 - Consecutive ALL grants get deleted when multiple roles have ALL grants on that object
SENTRY-2493 - Sentry store APIs for path mapping should handle empty/null paths.
SENTRY-2497 - show grant role results should handle case where URI doesn't have a defined scheme.
SENTRY-2498 - Exception while deleting paths that don't exist
SENTRY-2500 - CREATE on server does not provide HMS server side read authorization for get_all_tables(database_name)
SENTRY-2502 - Sentry NN plug-in stops fetching updates from sentry server.
SENTRY-2503 - Failed to revoke the privilege from impala-shell if the privilege added from beeline cli on multi-clusters

Apache Spark

The following issues are fixed in CDH 6.2.0:

SPARK-22148 - [SPARK-15815][SCHEDULER] Acquire new executors to avoid hang because of blacklisting
SPARK-23257 - [K8S] Kerberos Support for Spark on K8S
SPARK-23781 - [CORE] Merge token renewer functionality into HadoopDelegationTokenManager.
SPARK-23831 - Revert "[SQL] Add org.apache.derby to IsolatedClientLoader"
SPARK-24434 - [K8S] pod template files
SPARK-24553 - [UI][FOLLOWUP][2.4 BACKPORT] Fix unnecessary UI redirect
SPARK-24920 - [CORE] Allow sharing Netty's memory pool allocators
SPARK-24958 - [CORE] Add memory from procfs to executor metrics.
SPARK-25003 - [PYSPARK] Use SessionExtensions in Pyspark
SPARK-25023 - Clarify Spark security documentation
SPARK-25118 - [CORE] Persist Driver Logs in Client mode to Hdfs
SPARK-25222 - [K8S] Improve container status logging
SPARK-25451 - [SPARK-26100][CORE] Aggregated metrics table doesn't show the right number of the total tasks
SPARK-25501 - [SS] Add kafka delegation token support.
SPARK-25515 - [K8S] Adds a config option to keep executor pods for debugging
SPARK-25560 - [SQL] Allow FunctionInjection in SparkExtensions
SPARK-25682 - [K8S] Package example jars in same target for dev and distro images.
SPARK-25689 - [CORE] Follow up: don't get delegation tokens when kerberos not available.
SPARK-25689 - [YARN] Make driver, not AM, manage delegation tokens.
SPARK-25730 - [K8S] Delete executor pods from kubernetes after figuring out why they died
SPARK-25745 - [K8S] Improve docker-image-tool.sh script
SPARK-25778 - WriteAheadLogBackedBlockRDD in YARN Cluster Mode Fails ...
SPARK-25786 - [CORE] If the ByteBuffer.hasArray is false, it will throw UnsupportedOperationException for Kryo
SPARK-25815 - [K8S] Support kerberos in client mode, keytab-based token renewal.
SPARK-25828 - [K8S] Bumping Kubernetes-Client version to 4.1.0
SPARK-25837 - [CORE] Fix potential slowdown in AppStatusListener when cleaning up stages
SPARK-25875 - [K8S] Merge code to set up driver command into a single step.
SPARK-25876 - [K8S] Simplify kubernetes configuration types.
SPARK-25877 - [K8S] Move all feature logic to feature classes.
SPARK-25905 - [CORE] When getting a remote block, avoid forcing a conversion to a ChunkedByteBuffer
SPARK-25922 - [K8S] Spark Driver/Executor "spark-app-selector" label mismatch
SPARK-25957 - [K8S] Make building alternate language binding docker images optional
SPARK-25960 - [K8S] Support subpath mounting with Kubernetes
SPARK-26002 - [SQL] Fix day of year calculation for Julian calendar days
SPARK-26011 - [SPARK-SUBMIT] Yarn mode pyspark app without python main resource does not honor "spark.jars.packages"
SPARK-26029 - [BUILD][2.4] Bump previousSparkVersion in MimaBuild.scala to be 2.3.0
SPARK-26094 - [CORE][STREAMING] createNonEcFile creates parent dirs.
SPARK-26109 - [WEBUI] Duration in the task summary metrics table and the task table are different
SPARK-26119 - [CORE][WEBUI] Task summary table should contain only successful tasks' metrics
SPARK-26186 - [SPARK-26184][CORE] Last updated time is not getting updated for the Inprogress application
SPARK-26194 - [K8S] Auto generate auth secret for k8s apps.
SPARK-26201 - Fix python broadcast with encryption
SPARK-26219 - [CORE][BRANCH-2.4] Executor summary should get updated for failure jobs in the history server UI
SPARK-26236 - [SS] Add kafka delegation token support documentation.
SPARK-26239 - File-based secret key loading for SASL.
SPARK-26256 - [K8S] Fix labels for pod deletion
SPARK-26267 - [SS] Retry when detecting incorrect offsets from Kafka
SPARK-26304 - [SS] Add default value to spark.kafka.sasl.kerberos.service.name parameter
SPARK-26307 - [SQL] Fix CTAS when INSERT a partitioned table using Hive serde
SPARK-26322 - [SS] Add spark.kafka.sasl.token.mechanism to ease delegation token configuration.
SPARK-26493 - [SQL] Allow multiple spark.sql.extensions
SPARK-26592 - [SS] Throw exception when kafka delegation token tried to obtain with proxy user
SPARK-26595 - [CORE] Allow credential renewal based on kerberos ticket cache.
SPARK-26694 - [CORE] Progress bar should be enabled by default for spark-shell
SPARK-26726 - Synchronize the amount of memory used by the broadcast variable to the UI display
SPARK-26745 - [SPARK-24959][SQL][BRANCH-2.4] Revert count optimization in JSON datasource by
SPARK-26753 - [CORE] Fixed custom log levels for spark-shell by using Filter instead of Threshold
SPARK-26873 - [SQL] Use a consistent timestamp to build Hadoop Job IDs.

Apache Sqoop

The following issues are fixed in CDH 6.2.0:

SQOOP-3237 - Mainframe FTP transfer option to insert custom FTP commands prior to transfer
SQOOP-3382 - Add parquet numeric support for Parquet in hdfs import
SQOOP-3396 - Add parquet numeric support for Parquet in Hive import

Apache ZooKeeper

There are no notable fixed issues in this release.
