Issues Fixed in CDH 5.10.x

The following topics describe issues fixed in CDH 5.10.x, from newest to oldest release. You can also review What's New in CDH 5.10.x or Known Issues in CDH 5.

Issues Fixed in CDH 5.10.1

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.10.1:

  • FLUME-2812 - Fix semaphore leak causing java.lang.Error: Maximum permit count exceeded in MemoryChannel.
  • FLUME-2889 - Fixes to DateTime computations.
  • FLUME-2999 - Kafka channel and sink should enable statically assigned partition per event via header.
  • FLUME-3027 - Change Kafka Channel to clear offsets map after commit.
  • FLUME-3049 - Make HDFS sink rotate more reliably in secure mode.
  • HADOOP-11619 - FTPFileSystem should override getDefaultPort.
  • HADOOP-12655 - TestHttpServer.testBindAddress bind port range is wider than expected.
  • HADOOP-13433 - Race in UGI.reloginFromKeytab.
  • HADOOP-13627 - Have an explicit KerberosAuthException for UGI to throw, text from public constants.
  • HADOOP-13805 - UGI.getCurrentUser() fails if user does not have a keytab associated.
  • HADOOP-13903 - Improvements to KMS logging to help debug authorization errors.
  • HADOOP-13911 - Remove TRUSTSTORE_PASSWORD related scripts from KMS.
  • HADOOP-13953 - Make FTPFileSystem's data connection mode and transfer mode configurable.
  • HADOOP-14003 - Make additional KMS tomcat settings configurable.
  • HADOOP-14114 - S3A can no longer handle unencoded + in URIs.
  • HDFS-11160 - VolumeScanner reports write-in-progress replicas as corrupt incorrectly.
  • HDFS-11275 - Check groupEntryIndex and throw a helpful exception on failures when removing ACL.
  • HDFS-11292 - log lastWrittenTxId etc info in logSyncAll.
  • HDFS-11306 - Print remaining edit logs from buffer if edit log can't be rolled.
  • HDFS-11363 - Need more diagnosis info when seeing Slow waitForAckedSeqno.
  • HDFS-11379 - DFSInputStream may infinite loop requesting block locations.
  • MAPREDUCE-5155 - Race condition in test case TestFetchFailure causes it to fail.
  • MAPREDUCE-6172 - TestDbClasses timeouts are too aggressive.
  • MAPREDUCE-6571 - JobEndNotification info logs are missing in AM container syslog.
  • MAPREDUCE-6817 - The format of job start time in JHS is different from those of submit and finish time.
  • YARN-2306 - Add test for leakage of reservation metrics in fair scheduler.
  • YARN-2336 - Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree.
  • YARN-3269 - yarn.nodemanager.remote-app-log-dir could not be configured to fully qualified path.
  • YARN-3933 - FairScheduler: Multiple calls to completedContainer are not safe.
  • YARN-3957 - FairScheduler NPE In FairSchedulerQueueInfo causing scheduler page to return 500.
  • YARN-4363 - In TestFairScheduler, testcase should not create FairScheduler redundantly.
  • YARN-4544 - All the log messages about rolling monitoring interval are shown with WARN level.
  • YARN-4555 - TestDefaultContainerExecutor#testContainerLaunchError fails on non-English locale environment.
  • YARN-5136 - Error in handling event type APP_ATTEMPT_REMOVED to the scheduler.
  • YARN-5308 - FairScheduler: Move continuous scheduling related tests to TestContinuousScheduling.
  • YARN-5752 - TestLocalResourcesTrackerImpl#testLocalResourceCache times out.
  • YARN-5859 - TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails.
  • YARN-5890 - FairScheduler should log information about AM-resource-usage and max-AM-share for queues.
  • YARN-5920 - Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang.
  • YARN-6151 - FS preemption does not consider child queues over fairshare if the parent is under.
  • YARN-6175 - FairScheduler: Negative vcore for resource needed to preempt.
  • HBASE-12949 - Scanner can be stuck in infinite loop if the HFile is corrupted.
  • HBASE-15125 - HBaseFsck's adoptHdfsOrphan function creates region with wrong end key boundary.
  • HBASE-15328 - sanity check the redirect used to send master info requests to the embedded regionserver.
  • HBASE-15378 - Scanner cannot handle heartbeat message with no results.
  • HBASE-15587 - FSTableDescriptors.getDescriptor() logs stack trace erroneously.
  • HBASE-15931 - Add log for long-running tasks in AsyncProcess. HBASE-16289: AsyncProcess stuck messages need to print region/server.
  • HBASE-15955 - Disable action in CatalogJanitor#setEnabled should wait for active cleanup scan to finish.
  • HBASE-16032 - Possible memory leak in StoreScanner.
  • HBASE-16062 - Improper error handling in WAL Reader/Writer creation.
  • HBASE-16237 - Blocks for hbase:meta table are not cached in L1 cache.
  • HBASE-16238 - It's useless to catch SESSIONEXPIRED exception and retry in RecoverableZooKeeper.
  • HBASE-16266 - Do not throw ScannerTimeoutException when catch UnknownScannerException.
  • HBASE-16304 - HRegion#RegionScannerImpl#handleFileNotFoundException may lead to deadlock when trying to obtain write lock on updatesLock.
  • HBASE-16429 - FSHLog: deadlock if rollWriter called when ring buffer filled with appends.
  • HBASE-16460 - Can't rebuild the BucketAllocator's data structures when BucketCache uses FileIOEngine.
  • HBASE-16604 - Scanner retries on IOException can cause the scans to miss data.
  • HBASE-16649 - Truncate table with splits preserved can cause both data loss and truncated data appeared again.
  • HBASE-16662 - Fix open POODLE vulnerabilities.
  • HBASE-16721 - Concurrency issue in WAL unflushed seqId tracking.
  • HBASE-16807 - RegionServer will fail to report new active HMaster until HMaster/RegionServer failover.
  • HBASE-16841 - Data loss in MOB files after cloning a snapshot and deleting that snapshot.
  • HBASE-16960 - RegionServer hang when aborting.
  • HBASE-17020 - keylen in midkey() is not computed correctly.
  • HBASE-17023 - Region left unassigned due to AM and SSH each thinking others would do the assignment work.
  • HBASE-17044 - Fix merge failed before creating merged region leaves meta inconsistent.
  • HBASE-17069 - RegionServer writes invalid META entries for split daughters in some circumstances.
  • HBASE-17206 - FSHLog may roll a new writer successfully with unflushed entries.
  • HBASE-17241 - Avoid compacting already compacted mob files with _del files.
  • HBASE-17265 - Region left unassigned in master failover when region failed to open.
  • HBASE-17275 - Assign timeout may cause region to be unassigned forever.
  • HBASE-17328 - Properly dispose of looped replication peers.
  • HBASE-17381 - ReplicationSourceWorkerThread can die due to unhandled exceptions.
  • HBASE-17409 - Limit jsonp callback name to prevent xss.
  • HBASE-17452 - Failed taking snapshot - region Manifest proto-message too large.
  • HBASE-17522 - Handle JVM throwing runtime exceptions when we ask for details on heap usage the same as a correctly returned 'undefined'.
  • HBASE-17558 - ZK dumping jsp should escape HTML.
  • HBASE-17561 - table status page should escape values that may contain arbitrary characters.
  • HBASE-17675 - ReplicationEndpoint should choose new sinks if a SaslException occurs.
  • HIVE-7723 - Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity.
  • HIVE-11594 - Analyze Table for column names with embedded spaces.
  • HIVE-11849 - NPE in HiveHBaseTableSnapshotInputFormat in query with just count(*).
  • HIVE-12349 - part of the patch which solves IS NULL queries can trigger NPE for timestamp and date columns.
  • HIVE-12465 - Hive might produce wrong results when (outer) joins are merged.
  • HIVE-12780 - Fix the output of the history command in Beeline. HIVE-12789: Fix output twice in the history command of Beeline.
  • HIVE-12976 - MetaStoreDirectSql doesn't batch IN lists in all cases.
  • HIVE-13149 - Remove some unnecessary HMS connections from HS2.
  • HIVE-13240 - GroupByOperator: Drop the hash aggregates when closing operator.
  • HIVE-13864 - Beeline ignores the command that follows a semicolon and comment.
  • HIVE-13866 - flatten callstack for directSQL errors.
  • HIVE-13895 - HoS start-up overhead in yarn-client mode.
  • HIVE-14693 - Some partitions will be left out when partition number is the multiple of the option hive.msck.repair.batch.size.
  • HIVE-14764 - Enabling "hive.metastore.metrics.enabled" throws OOM in HiveMetastore.
  • HIVE-14820 - RPC server for spark inside HS2 is not getting server address properly.
  • HIVE-15338 - Wrong result from non-vectorized DATEDIFF with scalar parameter of type DATE/TIMESTAMP.
  • HIVE-15346 - "values temp table" should not be an input.
  • HIVE-15359 - skip.footer.line.count doesn't work properly for certain situations.
  • HIVE-15410 - WebHCat supports get/set table property with its name containing period and hyphen.
  • HIVE-15485 - Addendum to Investigate the DoAs failure in HoS.
  • HIVE-15485 - Investigate the DoAs failure in HoS.
  • HIVE-15517 - NOT (x <=> y) returns NULL if x or y is NULL.
  • HIVE-15551 - memory leak in directsql for mysql+bonecp specific initialization.
  • HIVE-15735 - In some cases, view objects inside a view do not have parents.
  • HIVE-15782 - query on parquet table returns incorrect result when hive.optimize.index.filter is set to true.
  • HIVE-15872 - The PERCENTILE_APPROX UDAF does not work with empty set.
  • HIVE-16019 - Query fails when group by/order by on same column with uppercase name.
  • HIVE-16047 - Shouldn't try to get KeyProvider unless encryption is enabled.
  • HUE-4969 - [core] Rename ini properties for sasl buffer to be standard.
  • HUE-5310 - [search] Use Doc2 modal in search_controller.
  • HUE-5408 - [oozie] Support old docs while saving shared workflow.
  • HUE-5476 - [core] Fix TTL is_idle middleware check.
  • HUE-5482 - [home] Handle multiple home/trash directories by merging them into one.
  • HUE-5533 - [home] Improve home page load time.
  • HUE-5552 - [editor] Japanese improvement for new SQL Editor #466.
  • HUE-5602 - [jb] Add start time filter in jobs page.
  • HUE-5604 - [core] Update localization for de, es, fr, ja, ko, zh.
  • HUE-5670 - [doc2] Prevent exception when doc2 object is not linked to doc1.
  • HUE-5717 - [backend] Some operating systems incorrectly detect the JavaScript MIME type as text/x-js instead of application/javascript.
  • HUE-5722 - [core] Avoid query redaction when string is None.
  • HUE-5742 - [core] Allow user to provide schema name for database via ini.
  • HUE-5756 - [doc2] Workaround for improving the query history search time.
  • HUE-5758 - [oozie] Fix parsing nodes from XML definition.
  • HUE-5769 - [oozie] Remove mandatory inclusion of Kill row in the workflow dashboard graph.
  • HUE-5823 - [editor] Cancel running doc search requests when the query has changed.
  • HUE-5850 - [sentry] Prevent creating roles with empty names.
  • HUE-5958 - [pig] Fix unicode errors when handling exceptions.
  • HUE-5962 - [hiveserver2] Update HiveServerClient user object when opening session.
  • IMPALA-2605 - Omit the sort and mini stress tests.
  • IMPALA-4055 - Speed up to_date() with custom implementation.
  • IMPALA-4263 - Fix wrong omission of agg/analytic hash exchanges.
  • IMPALA-4282 - Remove max length check for type strings.
  • IMPALA-4449 - Revisit table locking pattern in the catalog.
  • IMPALA-4675 - Case-insensitive matching of Parquet fields.
  • IMPALA-4702 - Fix command line help for webserver_private_key_file.
  • IMPALA-4705, IMPALA-4779, IMPALA-4780 - Fix some Expr bugs with codegen.
  • IMPALA-4742 - Change "{}".format() to "{0}".format() for Py 2.6.
  • IMPALA-4749 - hit DCHECK in sorter with scratch limit.
  • IMPALA-4767 - Workaround for HIVE-15653 to preserve table stats.
  • IMPALA-4808 - old hash join can reference invalid memory.
  • IMPALA-4828 - Alter Kudu schema outside Impala may crash on read.
  • IMPALA-4854 - Fix incremental stats with complex types.
  • IMPALA-4916 - Fix maintenance of set of item sets in DisjointSet.
  • IMPALA-4981 - Re-enable spilling with MT_DOP.
  • IMPALA-4995 - Fix integer overflow in TopNNode::PrepareForOutput.
  • IMPALA-4997 - Fix overflows in Sorter::TupleIterator.
  • OOZIE-2243 - Kill Command does not kill the child job for java action.
  • OOZIE-2519 - Oozie HA with SSL info is slightly incorrect.
  • OOZIE-2584 - Eliminate Thread.sleep() calls in TestMemoryLocks.
  • OOZIE-2742 - Unable to kill applications based on tag.
  • OOZIE-2748 - NPE in LauncherMapper.printArgs().
  • OOZIE-2757 - Malformed XML in Spark action doc page.
  • OOZIE-2777 - Config-default.xml longer than 64k results in java.io.UTFDataFormatException.
  • OOZIE-2787 - Oozie distributes application jar twice making the spark job fail.
  • OOZIE-2802 - Spark action failure on Spark 2.1.0 due to duplicate sharelibs.
  • SENTRY-1508 - MetastorePlugin.java does not handle properly initialization failure.
  • SENTRY-1520 - Provide mechanism for triggering HMS full snapshot.
  • SENTRY-1564 - Improve error detection and reporting in MetastoreCacheInitializer.java.
  • SENTRY-1605 - SENTRY-1508 needs to be fixed because of Kerberos initialization issue.
  • SOLR-9284 - The HDFS BlockDirectoryCache should not let its keysToRelease or names maps grow indefinitely.
  • SOLR-9330 - Fix AlreadyClosedException on admin/mbeans?stats=true.
  • SOLR-9699, SOLR-4668 - fix exception from core status in parallel with core reload.
  • SOLR-9819 - Upgrade Apache commons-fileupload to 1.3.2, fixing a security vulnerability.
  • SOLR-9859 - Backport: replication.properties cannot be updated after being written, and neither replication.properties nor index.properties is durable in the face of a crash. Don't log error on NoSuchFileException.
  • SOLR-9901, SOLR-9899 - Backport: Implement move in HdfsDirectoryFactory. SOLR-9899: StandardDirectoryFactory should use optimizations for all FilterDirectorys, not just NRTCachingDirectory.
  • SOLR-10031 - Validation of filename params in ReplicationHandler.
  • SOLR-10114, SOLR-9941 - Reordered delete-by-query can delete or omit child documents.
  • SOLR-10119 - TestReplicationHandler assertion fixes part of.
  • SOLR-10121, SOLR-10116 - BlockCache corruption with high concurrency.
  • SPARK-12241 - [YARN] Improve failure reporting in Yarn client obtainTokenForHBase().
  • SPARK-12523 - [YARN] Support long-running of the Spark On HBase and hive meta store.
  • SPARK-18750 - [YARN] Follow up: move test to correct directory in 2.1 branch.
  • SPARK-18750 - [YARN] Avoid using "mapValues" when allocating containers.
  • SQOOP-2349 - Add command line option for setting transaction isolation levels for metadata queries.
  • SQOOP-2896 - Sqoop exec job fails with SQLException Access denied for user.
  • SQOOP-2909 - Oracle related ImportTest fails after SQOOP-2737.
  • SQOOP-2911 - Fix failing HCatalogExportTest caused by SQOOP-2863.
  • SQOOP-2950 - Sqoop trunk has consistent UT failures - need fixing.
  • SQOOP-3053 - Create a cmd line argument for sqoop.throwOnError and use it through SqoopOptions.
  • SQOOP-3055 - Fixing MySQL tests failing due to ignored test inputs/configuration.
  • SQOOP-3057 - Fixing 3rd party Oracle tests failing due to invalid case of column names.
  • SQOOP-3068 - Enhance error (tool.ImportTool: Encountered IOException running import job: java.io.IOException: Expected schema) to suggest workaround.
  • SQOOP-3071 - Fix OracleManager to apply localTimeZone correctly in case of Date objects too.
  • SQOOP-3072 - Reenable escaping in ImportTest#testProductWithWhiteSpaceImport for proper execution.
  • SQOOP-3081 - use OracleEscapeUtils.escapeIdentifier in OracleUpsertOutputFormat instead of inline appending quotes.
  • SQOOP-3124 - Fix ordering in column list query of PostgreSQL connector to reflect the logical order instead of adhoc ordering.

Issues Fixed in CDH 5.10.0

CDH 5.10.0 fixes the following issues.

Hadoop Common

Have Hadoop use commons-daemon version from root pom

Apache Commons Daemon 1.0.3 has been removed in favor of Apache Commons Daemon 1.0.13.

HBase

HBase shell commands correctly return missing namespace name

In prior releases, passing a non-existent namespace to the GRANT and REVOKE commands in the HBase shell returned an error message that displayed the user name rather than the name of the namespace. In this release, the error message correctly displays the missing namespace name. This change results from an underlying change to error handling in the HBase shell: the boolean namespace_exists? method now returns false when the namespace is not found, rather than throwing a NamespaceNotFoundException.

HDFS

fsck should also report decommissioning replicas

Bug: HDFS-7933

The output of HDFS fsck now also contains information about decommissioning replicas.

du reports false used space after appending to snapshotted files and deleting them

Bug: HDFS-10797

Disk usage summaries previously incorrectly counted files twice if they had been renamed (including files moved to Trash) since the last snapshot. Summaries now include current data plus snapshot data that is no longer under the directory either due to deletion or being moved outside of the directory.

ACL inheritance conflicts with umaskmode

Bug: HDFS-6962

Previously, HDFS ACLs applied the client umask to the permissions when inheriting default ACLs from the parent directory. This differs from the POSIX ACL specification. HDFS can now ignore the client umask to comply with the POSIX ACL specification.

Because this is a backward-incompatible change, this behavior is disabled by default. To enable it, set dfs.namenode.posix.acl.inheritance.enabled to true in hdfs-site.xml.
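
As a minimal sketch, the property named above could be added inside the <configuration> element of hdfs-site.xml as shown here. The comment text is illustrative only and is not taken from the shipped defaults.

```xml
<!-- Illustrative fragment: opt in to POSIX-compliant ACL inheritance,
     so the client umask is ignored when default ACLs are inherited. -->
<property>
  <name>dfs.namenode.posix.acl.inheritance.enabled</name>
  <value>true</value>
</property>
```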

Hue

Security and Administration

  • HUE-4541 - Support Kerberos mutual authentication across HUE
  • HUE-4969 - Support HiveServer2 + SASL (hive.server2.thrift.sasl.qop="auth-conf")
  • HUE-4372 - Turn off HSTS header in Hue Load Balancer (server only)
  • HUE-3079 - Display YARN jobs and logs from JHS when not in RM
  • Support Oracle DB connectivity for CDH package installations

SQL Editor and General UX

  • HUE-5070 - Stabilize Import documents
  • HUE-4032 - Improve Sample Popup performance
  • HUE-4073 - Optimize SQL Assist rendering for large scrolls
  • HUE-4039 - Refine Autocompleter results
  • HUE-4726 - Improve viewing Parquet formatted data

Upstream Issues Fixed

The following upstream issues are fixed in CDH 5.10.0:

  • AVRO-1584 - Java: Escape characters not allowed in JSON in toString
  • AVRO-1642 - Java: Do not generate invalid all-args constructor
  • AVRO-1684 - Add time types to the specific compiler
  • AVRO-1799 - Fix GenericRecord#toString ByteBuffer bug
  • AVRO-1847 - IDL compiler should use BigDecimal to represent decimal logical type
  • AVRO-1869 - Java: Fix Decimal conversion from ByteBuffer
  • AVRO-1877 - Restore correct javaUnbox in Specific Compiler
  • AVRO-1895 - Fix deepCopy() to get correct logical type conversion
  • AVRO-1895 - Java: Fix GenericData#deepCopy() to support logical types
  • CRUNCH-616 - Replace (possibly copyrighted) Maugham text with Dickens
  • FLUME-2171 - Add Interceptor to remove headers from event
  • FLUME-2997 - Fix flaky test in SpillableMemoryChannel
  • FLUME-3002 - Fix tests in TestBucketWriter
  • FLUME-3003 - Fix flaky testSourceCounter in TestSyslogUdpSource
  • FLUME-3025 - Expose FileChannel.open on JMX
  • FLUME-3031 - Change sequence source to reset its counter for event body on channel exception
  • HADOOP-3733 - "s3x:" URLs break when Secret Key contains a slash, even if encoded
  • HADOOP-7930 - Kerberos relogin interval in UserGroupInformation should be configurable
  • HADOOP-10062 - Race condition in MetricsSystemImpl#publishMetricsNow that causes incorrect results
  • HADOOP-11381 - Fix findbugs warnings in hadoop-distcp, hadoop-aws, hadoop-azure, and hadoop-openstack
  • HADOOP-11412 - POMs mention "The Apache Software License" rather than "Apache License"
  • HADOOP-11520 - Clean incomplete multi-part uploads in S3A tests
  • HADOOP-11720 - [JDK8] Fix javadoc errors caused by incorrect or illegal tags in hadoop-tools
  • HADOOP-11780 - Prevent IPC reader thread death
  • HADOOP-11922 - Misspelling of threshold in log4j.properties for tests in hadoop-tools
  • HADOOP-12169 - ListStatus on empty dir in S3A lists itself instead of returning an empty list
  • HADOOP-12292 - Make use of DeleteObjects optional
  • HADOOP-12325 - RPC Metrics : Add the ability track and log slow RPCs
  • HADOOP-12444 - Support lazy seek in S3AInputStream
  • HADOOP-12597 - In kms-site.xml configuration hadoop.security.keystore.JavaKeyStoreProvider.password should be updated with new name
  • HADOOP-12696 - Add tests for S3FileSystem Contract
  • HADOOP-12801 - Suppress obsolete S3FileSystem tests
  • HADOOP-12807 - S3AFileSystem should read AWS credentials from environment variables
  • HADOOP-12846 - Credential Provider Recursive Dependencies
  • HADOOP-12851 - S3AFileSystem Uptake of ProviderUtils.excludeIncompatibleCredentialProviders
  • HADOOP-12891 - S3AFileSystem should configure Multipart Copy threshold and chunk size
  • HADOOP-12994 - Specify PositionedReadable, add contract tests, fix problems
  • HADOOP-13028 - Add low level counter metrics for S3A; use in read performance tests
  • HADOOP-13056 - Print expected values when rejecting a server's determined principal
  • HADOOP-13065 - Add a new interface for retrieving FS and FC Statistics
  • HADOOP-13113 - Enable parallel test execution for hadoop-aws
  • HADOOP-13116 - Jets3tNativeS3FileSystemContractTest does not run
  • HADOOP-13122 - Customize User-Agent header sent in HTTP requests by S3A
  • HADOOP-13130 - s3a failures can surface as RTEs, not IOEs
  • HADOOP-13131 - Add tests to verify that S3A supports SSE-S3 encryption
  • HADOOP-13139 - Branch-2: S3a to use thread pool that blocks clients
  • HADOOP-13145 - In DistCp, prevent unnecessary getFileStatus call when not preserving metadata
  • HADOOP-13158 - S3AFileSystem#toString might throw NullPointerException due to null cannedACL
  • HADOOP-13162 - Consider reducing number of getFileStatus calls in S3AFileSystem.mkdirs
  • HADOOP-13171 - Add StorageStatistics to S3A; instrument some more operations
  • HADOOP-13183 - S3A proxy tests fail after httpclient/httpcore upgrade
  • HADOOP-13188 - S3A file-create should throw error rather than overwrite directories
  • HADOOP-13201 - Print the directory paths when ViewFs denies the rename operation on internal dirs
  • HADOOP-13203 - S3A: Support fadvise "random" mode for high performance readPositioned() reads
  • HADOOP-13208 - S3A listFiles(recursive=true) to do a bulk listObjects instead of walking the pseudo-tree of directories
  • HADOOP-13212 - Provide an option to set the socket buffers in S3AFileSystem
  • HADOOP-13237 - s3a initialization against public bucket fails if caller lacks any credentials
  • HADOOP-13239 - Deprecate s3:// in branch-2
  • HADOOP-13241 - Document s3a better
  • HADOOP-13252 - Tune S3A provider plugin mechanism
  • HADOOP-13280 - FileSystemStorageStatistics#getLong("readOps") should return readOps + largeReadOps
  • HADOOP-13283 - Support reset operation for new global storage statistics and per FS storage stats
  • HADOOP-13284 - FileSystemStorageStatistics must not attempt to read non-existent rack-aware read stats in branch-2.8
  • HADOOP-13287 - TestS3ACredentials#testInstantiateFromURL fails if AWS secret key contains +
  • HADOOP-13288 - Guard null stats key in FileSystemStorageStatistics
  • HADOOP-13291 - Probing stats in DFSOpsCountStatistics/S3AStorageStatistics should be correctly implemented
  • HADOOP-13305 - Define common statistics names across schemes
  • HADOOP-13324 - s3a tests don't authenticate with S3 frankfurt (or other V4 auth only endpoints)
  • HADOOP-13368 - DFSOpsCountStatistics$OpType#fromSymbol and s3a.Statistic#fromSymbol should be O(1) operation
  • HADOOP-13387 - Users always get told off for using S3, even when not using it
  • HADOOP-13389 - TestS3ATemporaryCredentials.testSTS error when using IAM credentials
  • HADOOP-13396 - Allow pluggable audit loggers in KMS
  • HADOOP-13405 - Doc for fs.s3a.acl.default indicates incorrect values
  • HADOOP-13406 - S3AFileSystem: Consider reusing filestatus in delete() and mkdirs()
  • HADOOP-13442 - Optimize UGI group lookups
  • HADOOP-13447 - Refactor S3AFileSystem to support introduction of separate metadata repository and tests
  • HADOOP-13512 - ReloadingX509TrustManager should keep reloading in case of exception.
  • HADOOP-13541 - Explicitly declare the Joda time version S3A depends on
  • HADOOP-13590 - Retry until TGT expires even if the UGI renewal thread encountered exception
  • HADOOP-13601 - Fix a log message typo in AbstractDelegationTokenSecretManager
  • HADOOP-13641 - Update UGI#spawnAutoRenewalThreadForUserCreds to reduce indentation
  • HADOOP-13684 - Snappy may complain Hadoop is built without snappy if libhadoop is not found
  • HADOOP-13698 - Document caveat for KeyShell when underlying KeyProvider does not delete a key
  • HADOOP-13838 - KMSTokenRenewer should close providers
  • HADOOP-13864 - KMS should not require truststore password
  • HDFS-742 - A down DataNode makes Balancer hang on repeatedly asking NameNode its partial block list
  • HDFS-2390 - dfsadmin -setBalancerBandwidth does not validate -ve value
  • HDFS-4396 - Add START_MSG/SHUTDOWN_MSG for ZKFC
  • HDFS-6565 - Use jackson instead of jetty json in hdfs-client
  • HDFS-7224 - Allow reuse of NN connections via webhdfs
  • HDFS-7384 - getfacl command and getAclStatus output should be in sync
  • HDFS-7411 - Change decommission logic to throttle by blocks rather than nodes in each interval
  • HDFS-7537 - Add "UNDER MIN REPL'D BLOCKS" count to fsck
  • HDFS-7933 - fsck should also report decommissioning replicas
  • HDFS-8037 - CheckAccess in WebHDFS silently accepts malformed FsActions parameters
  • HDFS-8039 - Fix TestDebugAdmin#testRecoverLease and testVerfiyBlockChecksumCommand on Windows
  • HDFS-8405 - Fix a typo in NamenodeFsck
  • HDFS-8542 - WebHDFS getHomeDirectory behavior does not match specification
  • HDFS-8721 - Add a metric for number of encryption zones
  • HDFS-8826 - In Balancer, add an option to specify the source node list so that balancer only selects blocks to move from those nodes
  • HDFS-8923 - Add -source flag to balancer usage message
  • HDFS-8986 - Add option to -du to calculate directory space usage excluding snapshots
  • HDFS-9005 - Provide support for upgrade domain script
  • HDFS-9019 - Adding informative message to sticky bit permission denied exception
  • HDFS-9063 - Correctly handle snapshot path for getContentSummary
  • HDFS-9214 - Add missing license header
  • HDFS-9214 - Support reconfiguring dfs.datanode.balance.max.concurrent.moves without DN restart
  • HDFS-9223 - Code cleanup for DatanodeDescriptor and HeartbeatManager
  • HDFS-9257 - Improve error message for "Absolute path required" in INode.java to contain the rejected path
  • HDFS-9279 - Decommissioned capacity should not be considered for configured/used capacity
  • HDFS-9389 - Add maintenance states to AdminStates
  • HDFS-9392 - Admins support for maintenance state
  • HDFS-9444 - Add utility to find set of available ephemeral ports to ServerSocketUtil
  • HDFS-9500 - Fix software version counts for DataNodes during rolling upgrade
  • HDFS-9724 - Degraded performance in WebHDFS listing as it does not reuse ObjectMapper
  • HDFS-9745 - TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts
  • HDFS-9790 - HDFS Balancer should exit with a proper message if upgrade is not finalized
  • HDFS-9839 - Reduce verbosity of processReport logging
  • HDFS-9885 - Correct the distcp counters name while displaying counters
  • HDFS-9926 - MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils
  • HDFS-9934 - ReverseXML oiv processor should bail out if the XML file's layoutVersion doesn't match oiv's
  • HDFS-9951 - Use string constants for XML tags in OfflineImageReconstructor
  • HDFS-10276 - HDFS should not expose path info that user has no permission to see
  • HDFS-10277 - PositionedReadable test testReadFullyZeroByteFile failing in HDFS
  • HDFS-10291 - TestShortCircuitLocalRead failing
  • HDFS-10415 - TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called
  • HDFS-10423 - Increase default value of httpfs maxHttpHeaderSize
  • HDFS-10505 - OIV's ReverseXML processor should support ACLs
  • HDFS-10549 - Correctly revoke file leases when closing files
  • HDFS-10553 - DiskBalancer: Rename Tools/DiskBalancer class to Tools/DiskBalancerCLI
  • HDFS-10599 - DiskBalancer: Execute CLI via Shell
  • HDFS-10627 - Volume Scanner marks a block as "suspect" even if the exception is network-related
  • HDFS-10628 - Log HDFS Balancer exit message to its own log
  • HDFS-10655 - Fix path related byte array conversion bugs
  • HDFS-10656 - Optimize conversion of byte arrays back to path string
  • HDFS-10674 - Optimize creating a full path from an inode
  • HDFS-10694 - processReport() should print blockReportId in each log message
  • HDFS-10738 - MR1: Fix TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration test failure
  • HDFS-10747 - o.a.h.hdfs.tools.DebugAdmin usage message is misleading
  • HDFS-10756 - Expose getTrashRoot to HTTPFS and WebHDFS
  • HDFS-10763 - Open files can leak permanently due to inconsistent lease update
  • HDFS-10784 - Implement WebHdfsFileSystem#listStatusIterator
  • HDFS-10797 - Disk usage summary of snapshots causes renamed blocks to get counted twice
  • HDFS-10807 - Doc about upgrading to a version of HDFS with snapshots may be confusing
  • HDFS-10809 - getNumEncryptionZones causes NPE in branch-2.7
  • HDFS-10823 - Implement HttpFSFileSystem#listStatusIterator
  • HDFS-10832 - Propagate ACL bit and isEncrypted bit in HttpFS FileStatus permissions
  • HDFS-10837 - Standardize serialization of WebHDFS DirectoryListing
  • HDFS-10870 - Wrong dfs.namenode.acls.enabled default in HdfsPermissionsGuide.apt.vm
  • HDFS-10875 - Optimize du -x to cache intermediate result
  • HDFS-10876 - Dispatcher#dispatch should log IOException stacktrace
  • HDFS-10878 - TestDFSClientRetries#testIdempotentAllocateBlockAndClose throws ConcurrentModificationException
  • HDFS-10883 - `getTrashRoot`'s behavior is not consistent in DFS after enabling EZ
  • HDFS-10915 - Fix time measurement bug in TestDatanodeRestart
  • HDFS-10918 - Add a tool to get FileEncryptionInfo from CLI
  • HDFS-10960 - TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error verification after volume remove
  • HDFS-11009 - Add a tool to reconstruct block meta file from CLI
  • HDFS-11015 - Enforce timeout in balancer
  • HDFS-11053 - Unnecessary superuser check in versionRequest()
  • HDFS-11069 - Tighten the authorization of datanode RPC
  • HDFS-11080 - Update HttpFS to use ConfigRedactor
  • HDFS-11120 - TestEncryptionZones should waitActive
  • HDFS-11229 - HDFS-11056 failed to close meta file
  • MAPREDUCE-6497 - Fix wrong value of JOB_FINISHED event in JobHistoryEventHandler
  • MAPREDUCE-6541 - Exclude scheduled reducer memory when calculating available mapper slots from headroom to avoid deadlock
  • MAPREDUCE-6579 - JobStatus#getFailureInfo should not output diagnostic information when the job is running
  • MAPREDUCE-6740 - Enforce mapreduce.task.timeout to be at least mapreduce.task.progress-report.interval
  • MAPREDUCE-6750 - Fix TestHSAdminServer#testRefreshSuperUserGroups
  • MAPREDUCE-6763 - Shuffle server listen queue is too small
  • MAPREDUCE-6764 - Teragen LOG initialization bug
  • MAPREDUCE-6765 - MR should not schedule container requests in cases where reducer or mapper containers demand resource larger than the maximum supported
  • MAPREDUCE-6776 - yarn.app.mapreduce.client.job.max-retries should have a more useful default
  • MAPREDUCE-6789 - Fix TestAMWebApp failure
  • MAPREDUCE-6801 - Fix flaky TestKill.testKillJob
  • YARN-2246 - Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections
  • YARN-2913 - DOCS. Fair scheduler should have ability to set MaxResourceDefault for each queue
  • YARN-2980 - Move health check script related functionality to hadoop-common
  • YARN-3094 - Reset timer for liveness monitors after RM recovery
  • YARN-3223 - Resource update during NM graceful decommission
  • YARN-3239 - WebAppProxy does not support a final tracking url which has query fragments and params
  • YARN-3375 - NodeHealthScriptRunner.shouldRun() check is performed 3 times when starting NodeHealthScriptRunner
  • YARN-3412 - RM tests should use MockRM where possible
  • YARN-3582 - NPE in WebAppProxyServlet
  • YARN-3893 - Both RM in active state when Admin#transitionToActive failure from refreshAll()
  • YARN-4115 - Reduce loglevel of ContainerManagementProtocolProxy to Debug
  • YARN-4132 - Separate configs for nodemanager to resourcemanager connection timeout and retries
  • YARN-4201 - AMBlacklist does not work for minicluster
  • YARN-4710 - Reduce logging application reserved debug info in FSAppAttempt#assignContainer
  • YARN-4743 - FairSharePolicy breaks TimSort assumption
  • YARN-4767 - Network issues can cause persistent RM UI outage
  • YARN-4794 - Deadlock in NMClientImpl
  • YARN-4911 - Bad placement policy in FairScheduler causes the RM to crash
  • YARN-4927 - TestRMHA#testTransitionedToActiveRefreshFail fails with FairScheduler
  • YARN-5009 - NMLeveldbStateStoreService database can grow substantially leading to longer recovery times
  • YARN-5082 - Limit ContainerId increase in fair scheduler if the num of node app reserved reached the limit
  • YARN-5197 - RM leaks containers if running container disappears from node update
  • YARN-5353 - ResourceManager can leak delegation tokens when they are shared across apps
  • YARN-5453 - FairScheduler#update may skip update demand resource of child queue/app if current demand reached maxResource
  • YARN-5462 - TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown fails intermittently
  • YARN-5616 - Clean up WeightAdjuster
  • YARN-5672 - FairScheduler: Wrong queue name in log when adding application
  • YARN-5677 - RM should transition to standby when connection is lost for an extended period
  • YARN-5693 - Reduce loglevel to Debug in ContainerManagementProtocolProxy and AMRMClientImpl
  • YARN-5694 - ZKRMStateStore can prevent the transition to standby if the ZK node is unreachable
  • YARN-5736 - Addendum. Fixes segfault due to unterminated string
  • YARN-5736 - YARN container executor config does not handle white space
  • YARN-5754 - Null check missing for earliest in FifoPolicy
  • YARN-5834 - TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value.
  • YARN-5942 - "Overridden" is misspelled as "overriden" in FairScheduler.md
  • HBASE-12940 - Expose listPeerConfigs and getPeerConfig to the HBase shell
  • HBASE-15297 - Correct handling of namespace existence checks in shell
  • HBASE-15393 - Enable table replication command will fail when parent znode is not default in peer cluster
  • HBASE-15526 - Make SnapshotManager accessible through MasterServices
  • HBASE-15633 - Backport HBASE-15507 to branch-1
  • HBASE-15769 - Perform validation on cluster key for add_peer
  • HBASE-16146 - Counters are expensive
  • HBASE-16464 - archive folder grows bigger and bigger due to corrupt snapshot under tmp dir
  • HBASE-16490 - Fix race condition between SnapshotManager and SnapshotCleaner
  • HBASE-16653 - Backport HBASE-11393 to branches which support namespace
  • HBASE-17058 - Lower epsilon used for jitter verification from HBASE-15324
  • HBASE-17072 - CPU usage starts to climb up to 90-100% when using G1GC; purge ThreadLocal usage
  • HIVE-4924 - JDBC: Support query timeout for jdbc
  • HIVE-9423 - HiveServer2: Provide the user with different error messages depending on the Thrift client exception code
  • HIVE-9518 - Implement MONTHS_BETWEEN aligned with Oracle one
  • HIVE-9664 - Hive 'add jar' command should be able to download and add jars from a repository
  • HIVE-10267 - HIVE-9664 makes hive depend on ivysettings.xml : trivial breakage fix
  • HIVE-10276 - Implement date_format(timestamp, fmt) UDF
  • HIVE-10576 - Add jar command does not work with Windows OS
  • HIVE-10644 - Create SHA2 UDF
  • HIVE-11032 - Enable more tests for grouping by skewed data
  • HIVE-11538 - Add an option to skip init script while running tests
  • HIVE-11920 - ADD JAR failing with URL schemes other than file/ivy/hdfs
  • HIVE-12619 - Switching the field order within an array of structs causes the query to fail
  • HIVE-12646 - Revert "beeline and HIVE CLI do not parse ; in quote properly"
  • HIVE-12653 - The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
  • HIVE-12908 - Improve dynamic partition loading III
  • HIVE-12988 - Improve dynamic partition loading IV
  • HIVE-13033 - SPDO unnecessarily duplicates columns in key & value of mapper output
  • HIVE-13129 - CliService leaks HMS connection
  • HIVE-13539 - HiveHFileOutputFormat searching the wrong directory for HFiles
  • HIVE-13705 - Insert into table removes existing data
  • HIVE-13716 - Improve dynamic partition loading V
  • HIVE-13760 - Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running for more than the configured timeout value.
  • HIVE-13786 - Fix the unit test failure org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle
  • HIVE-13904 - Ignore case when retrieving ColumnInfo from RowResolver
  • HIVE-13911 - Load inpath fails throwing org.apache.hadoop.security.AccessControlException
  • HIVE-13933 - Add an option to turn off parallel file moves
  • HIVE-13936 - Add streaming support for row_number
  • HIVE-13960 - Session may timeout before idle timeout time for synchronous operations
  • HIVE-14011 - MessageFactory is not pluggable
  • HIVE-14100 - Adding a new logged_in_user() UDF which returns the user provided when connecting
  • HIVE-14175 - Fix creating buckets without scheme information
  • HIVE-14301 - insert overwrite fails for nonpartitioned tables in s3
  • HIVE-14358 - Add metrics for number of queries executed for each execution engine
  • HIVE-14373 - Add integration tests for hive on S3
  • HIVE-14444 - Upgrade qtest execution framework to junit4 - migrate most of them
  • HIVE-14753 - Track the number of open/closed/abandoned sessions in HS2
  • HIVE-14775 - Cleanup IOException usage in Metrics APIs
  • HIVE-14784 - Operation logs are disabled automatically if the parent directory does not exist
  • HIVE-14822 - Add support for credential provider for jobs launched from Hiveserver2
  • HIVE-14924 - MSCK REPAIR TABLE in single-threaded mode throws a NullPointerException
  • HIVE-15000 - Remove addlocaldriverjar, and addlocaldrivername from command line help
  • HIVE-15022 - Missing hs2-connection-timed-out in BeeLine.properties
  • HIVE-15114 - Added a few more test cases to validate the issue
  • HIVE-15114 - Remove extra MoveTask operators from the ConditionalTask
  • HIVE-15121 - Last MR job in Hive should be able to write to a different scratch directory
  • HIVE-15199 - INSERT INTO data on S3 is replacing the old rows with the new ones
  • HIVE-15226 - Add a different masking comment to qtests blobstore output
  • HIVE-15246 - Add a masking comment to blobstore staging paths on qtest output
  • HIVE-15266 - Edit test output of negative blobstore tests to match HIVE-15226
  • HIVE-15280 - Hive.mvFile() misses the "." char when joining the filename + extension
  • HIVE-15291 - Comparison of timestamp fails if only date part is provided
  • HIVE-15355 - Concurrency issues during parallel moveFile due to HDFSUtils.setFullFileStatus
  • HIVE-15361 - INSERT dynamic partition on S3 fails with a MoveTask failure
  • HIVE-15363 - Execute hive-blobstore tests using ProxyLocalFileSystem
  • HIVE-15367 - CTAS with LOCATION should write temp data under location directory rather than database location
  • HIVE-15385 - Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
  • IMPALA-1169 - Admission control info on the queries debug webpage
  • IMPALA-1286 - Extract common conjuncts from disjunctions.
  • IMPALA-1430 - IMPALA-4108: codegen all builtin aggregate functions
  • IMPALA-1616 - Improve the Memory Limit Exceeded error report
  • IMPALA-1654 - General partition exprs in DDL operations.
  • IMPALA-1702 - Enforce single-table consistency in query analysis
  • IMPALA-1788 - Fold constant expressions
  • IMPALA-2013 - Reintroduce steps for checking HBase health in run-hbase.sh
  • IMPALA-2057 - Better error message for incorrect avro decimal column declaration
  • IMPALA-2521 - Add clustered hint to insert statements
  • IMPALA-2523 - Make HdfsTableSink aware of clustered input
  • IMPALA-2789 - More compact mem layout with null bits at the end.
  • IMPALA-2864 - Ensure that client connections are closed after a failed Open()
  • IMPALA-2890 - Support ALTER TABLE statements for Kudu tables
  • IMPALA-2905 - Move QueryResultSet implementations into separate module
  • IMPALA-2905 - Handle coordinator fragment lifecycle like all others
  • IMPALA-2916 - Add warning to query profile if debug build
  • IMPALA-2925 - Mark test_alloc_update as xfail.
  • IMPALA-3002 - IMPALA-1473: Cardinality observability cleanup
  • IMPALA-3125 - Fix assignment of equality predicates from an outer-join On-clause
  • IMPALA-3126 - Conservative assignment of inner-join On-clause predicates
  • IMPALA-3167 - Fix assignment of WHERE conjunct through grouping agg + OJ
  • IMPALA-3200 - move bufferpool under runtime
  • IMPALA-3201 - in-memory buffer pool implementation
  • IMPALA-3201 - reservation implementation for new buffer pool
  • IMPALA-3202 - refactor scratch file management into TmpFileMgr
  • IMPALA-3202 - DiskIoMgr improvements for new buffer pool
  • IMPALA-3211 - provide toolchain build id for bootstrapping
  • IMPALA-3221 - Copyright / license audit
  • IMPALA-3229 - Don't assume that AUX exists just because of shell env
  • IMPALA-3308 - Get expr-test passing on PPC64LE
  • IMPALA-3314 - Fix Avro schema loading for partitioned tables
  • IMPALA-3342 - Add thread counters to monitor plan fragment execution
  • IMPALA-3346 - DeepCopy() Kudu rows into Impala tuples
  • IMPALA-3348 - Avoid per-slot check vector size in KuduScanner
  • IMPALA-3398 - Add docs to main Impala branch
  • IMPALA-3420 - use gold by default
  • IMPALA-3481 - Use Kudu ScanToken API for scan ranges
  • IMPALA-3491 - Use unique db in test_scanners.py and test_aggregation.py
  • IMPALA-3491 - Use unique database fixture in test_partitioning.py
  • IMPALA-3491 - Use unique database fixture in test_insert_parquet.py
  • IMPALA-3491 - Use unique database fixture in test_nested_types.py
  • IMPALA-3491 - Use unique database fixture in test_ddl.py
  • IMPALA-3552 - Make incremental stats max serialized size configurable
  • IMPALA-3567 - Part 2, IMPALA-3899: factor out PHJ builder
  • IMPALA-3567 - move ExecOption profile helpers to RuntimeProfile
  • IMPALA-3586 - Clean up union-node.h/cc to enable improvements.
  • IMPALA-3644 - Make predicate order deterministic
  • IMPALA-3671 - Add query option to limit scratch space usage
  • IMPALA-3676 - Use clang as a static analysis tool
  • IMPALA-3710 - Kudu DML should ignore conflicts, pt2
  • IMPALA-3710 - Kudu DML should ignore conflicts by default
  • IMPALA-3713 - IMPALA-4439: Fix Kudu DML shell reporting
  • IMPALA-3718 - Add test_cancellation tests for Kudu
  • IMPALA-3718 - Support subset of functional-query for Kudu
  • IMPALA-3719 - Simplify CREATE TABLE statements with Kudu tables
  • IMPALA-3724 - Support Kudu non-covering range partitions
  • IMPALA-3725 - Support Kudu UPSERT in Impala
  • IMPALA-3726 - Add support for Kudu-specific column options
  • IMPALA-3739 - Enable stress tests on Kudu
  • IMPALA-3771 - Expose kudu client timeout and set default
  • IMPALA-3786 - Replace "cloudera" with "apache"
  • IMPALA-3788 - Fix Kudu ReadMode flag checking
  • IMPALA-3788 - Add flag for Kudu read-your-writes
  • IMPALA-3788 - Support for Kudu 'read-your-writes' consistency
  • IMPALA-3808 - Add incubating DISCLAIMER from the Incubator Branding Guide
  • IMPALA-3809 - Show Kudu-specific column metadata in DESCRIBE.
  • IMPALA-3812 - Fix error message for unsupported types
  • IMPALA-3815 - clean up cross-compiled comparator
  • IMPALA-3823 - Add timer to measure Parquet footer reads
  • IMPALA-3838 - IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates
  • IMPALA-3853 - More RAT cleaning.
  • IMPALA-3853 - squeasel is MIT (and dual copyright) not Apache
  • IMPALA-3872 - allow providing PyPi mirror for python packages
  • IMPALA-3875 - Thrift threaded server hang in some cases
  • IMPALA-3884 - Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()
  • IMPALA-3902 - Scheduler improvements for running multiple fragment instances on a single backend
  • IMPALA-3905 - Add single-threaded scan node.
  • IMPALA-3912 - test_random_rpc_timeout is flaky.
  • IMPALA-3918 - Fix straggler Cloudera -> ASF license headers
  • IMPALA-3920 - TotalStorageWaitTime counter not populated for fragments with Kudu scan node
  • IMPALA-3943 - Address post-merge comments.
  • IMPALA-3971 - IMPALA-3229: Bootstrap an Impala dev environment
  • IMPALA-3973 - add position and occurrence to instr()
  • IMPALA-3980 - qgen: re-enable Hive as a target database
  • IMPALA-3983 - IMPALA-3974: Delete function jar resources after load
  • IMPALA-4000 - Restricted Sentry authorization for Kudu Tables
  • IMPALA-4006 - Fix typo in buildall.sh introduced in
  • IMPALA-4006 - dangerous rm -rf statements in scripts
  • IMPALA-4008 - Don't bake ExprContext pointers into IR code
  • IMPALA-4008 - don't bake in hash table and hash join pointers
  • IMPALA-4011 - Remove / reword messages when statestore messages are late
  • IMPALA-4020 - Handle external conflicting changes to HMS gracefully
  • IMPALA-4023 - don't attach buffered tuple streams to batches
  • IMPALA-4026 - Implement double-buffering for BlockingQueue
  • IMPALA-4028 - Improve message for improper Sentry config to make extra spaces visible
  • IMPALA-4037 - IMPALA-4038: fix locking during query cancellation
  • IMPALA-4042 - Preserve root types when substituting grouping exprs
  • IMPALA-4047 - Remove occurrences of 'CDH'/'cdh' from repo
  • IMPALA-4048 - Misc. improvements to /sessions
  • IMPALA-4054 - Remove serial test workarounds for IMPALA-2479.
  • IMPALA-4056 - Fix toSql() of DistributeParam
  • IMPALA-4058 - benchmark byteswap on misaligned memory
  • IMPALA-4074 - Configuration items duplicate in template of YARN
  • IMPALA-4080 - IMPALA-3638: Introduce ExecNode::Codegen()
  • IMPALA-4087 - TestFragmentLifecycle.test_failure_in_prepare
  • IMPALA-4091 - Fix backend unit to log in logs/be_tests.
  • IMPALA-4096 - Allow clean.sh to work from snapshots
  • IMPALA-4097 - Crash in kudu-scan-node-test
  • IMPALA-4098 - Open()/Close() partition exprs once per fragment instance.
  • IMPALA-4100 - IMPALA-4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
  • IMPALA-4101 - qgen: Hive join predicates should only contain equality functions
  • IMPALA-4102 - Remote Kudu reads should be reported
  • IMPALA-4104 - add DCHECK to ConsumeLocal() and fix tests
  • IMPALA-4110 - IMPALA-3853: npm.js uses Artistic License 2
  • IMPALA-4110 - Clean up issues found by Apache RAT
  • IMPALA-4110 - Apache RAT script on Impala tarballs
  • IMPALA-4111 - backend death tests should not produce minidumps
  • IMPALA-4116 - Remove 'cdh' from version string again
  • IMPALA-4116 - Remove 'cdh' from version string
  • IMPALA-4117 - Factor simple scheduler test code into own files
  • IMPALA-4118 - extract encryption utils from BufferedBlockMgr
  • IMPALA-4122 - qgen: fix bitrotted cluster unit tests
  • IMPALA-4123 - Fast bit unpacking
  • IMPALA-4134 - IMPALA-3704: Kudu INSERT improvements
  • IMPALA-4136 - testKudu planner test hangs if Kudu is not supported
  • IMPALA-4138 - Fix AcquireState() for batches that change capacity
  • IMPALA-4142 - qgen: Hive does not support CTEs inside sub-query blocks
  • IMPALA-4155 - Update default partition when table is altered
  • IMPALA-4160 - Remove some leftover Llama references
  • IMPALA-4160 - Remove Llama support.
  • IMPALA-4171 - Remove JAR from repo.
  • IMPALA-4180 - Synchronize accesses to RuntimeState::reader_contexts_
  • IMPALA-4187 - Switch RPC latency metrics to histograms
  • IMPALA-4188 - Leopard: support external Docker volumes
  • IMPALA-4193 - Warn when benchmarks run with sub-optimal CPU settings
  • IMPALA-4194 - Bump version to 2.8.0
  • IMPALA-4199 - Add 'SNAPSHOT' to Impala version
  • IMPALA-4204 - Remove KuduScanNodeTest
  • IMPALA-4205 - fix tmp-file-mgr-test under ASAN
  • IMPALA-4206 - Add column lineage regression test.
  • IMPALA-4207 - test infra: move Hive options from connection to cluster options
  • IMPALA-4213 - Fix Kudu predicates that need constant folding
  • IMPALA-4230 - ASF policy issues from 2.7.0 rc3.
  • IMPALA-4231 - fix codegen time regression
  • IMPALA-4232 - qgen: Hive does not support aggregates inside specific analytic clauses
  • IMPALA-4234 - Remove astyle config file, looks outdated.
  • IMPALA-4239 - fix buffer pool test failures in release build
  • IMPALA-4240 - qgen: Add "ParseException line missing )" to Known Errors for Hive
  • IMPALA-4241 - remove spurious child queries event
  • IMPALA-4253 - impala-server.backends.client-cache.total-clients shows negative value
  • IMPALA-4258 - Remove duplicated and unused test macros
  • IMPALA-4259 - build Impala without any test cluster setup
  • IMPALA-4260 - Alter table add column drops all the column stats
  • IMPALA-4266 - Java udf returning string can give incorrect results
  • IMPALA-4269 - Codegen merging exchange node
  • IMPALA-4270 - Gracefully fail unsupported queries with mt_dop > 0.
  • IMPALA-4274 - hang in buffered-block-mgr-test
  • IMPALA-4277 - remove unneeded LegacyTCLIService
  • IMPALA-4277 - remove references for unsupported s3/s3n connectors
  • IMPALA-4277 - allow overriding of Hive/Hadoop versions/locations
  • IMPALA-4278 - Don't abort Catalog startup quickly if HMS is not present
  • IMPALA-4283 - Ensure Kudu-specific lineage and audit behavior
  • IMPALA-4285 - IMPALA-4286: Fixes for Parquet scanner with MT_DOP > 0.
  • IMPALA-4287 - EE tests fail to run when KUDU_IS_SUPPORTED=false
  • IMPALA-4289 - Mark agg slots of NDV() functions as non-nullable
  • IMPALA-4291 - Reduce LLVM module's preparation time
  • IMPALA-4294 - Make check-schema-diff.sh executable from anywhere.
  • IMPALA-4295 - XFAIL wildcard SSL test
  • IMPALA-4299 - add buildall.sh option to start test cluster
  • IMPALA-4300 - Speed up BloomFilter::Or with SIMD
  • IMPALA-4302 - IMPALA-2379: constant expr arg fixes
  • IMPALA-4303 - Do not reset() qualifier of union operands.
  • IMPALA-4309 - Introduce Expr rewrite phase and supporting classes
  • IMPALA-4310 - Make push_to_asf.py respect --apache_remote
  • IMPALA-4314 - Standardize on MT-related data structures
  • IMPALA-4325 - StmtRewrite lost parentheses of CompoundPredicate
  • IMPALA-4330 - Fix JSON syntax in generate_metrics.py
  • IMPALA-4335 - Don't send 0-row batches to clients
  • IMPALA-4338 - test infra data migrator: include tables' primary keys in PostgreSQL
  • IMPALA-4339 - ensure coredumps end up in IMPALA_HOME
  • IMPALA-4340 - explain how to install postgresql-9.5 or higher
  • IMPALA-4343 - IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
  • IMPALA-4348 - IMPALA-4333: Improve coordinator fragment cancellation
  • IMPALA-4350 - Crash with vlog level 2 in hash join node
  • IMPALA-4352 - test infra: store Impala/Kudu primary keys in object model
  • IMPALA-4357 - Fix DROP TABLE to pass analysis if the table fails to load
  • IMPALA-4362 - Misc. fixes for PFE counters
  • IMPALA-4363 - Add Parquet timestamp validation
  • IMPALA-4365 - Enabling end-to-end tests on a remote cluster
  • IMPALA-4369 - Avoid DCHECK in Parquet scanner with MT_DOP > 0
  • IMPALA-4371 - Incorrect DCHECK-s in hdfs-parquet-table-writer
  • IMPALA-4372 - 'Describe formatted' returns types in upper case
  • IMPALA-4374 - Use new syntax for creating TPC-DS/H tables in Kudu stress test
  • IMPALA-4377 - Fix Java UDF-arg buffer use-after-free in UdfExecutorTest.
  • IMPALA-4379 - Fix and test Kudu table type checking, follow up
  • IMPALA-4379 - Fix and test Kudu table type checking
  • IMPALA-4380 - Remove 'cloudera' from hostnames in bin/generate_minidump_collection_testdata.py
  • IMPALA-4381 - Incorrect AVX version of BloomFilter::Or
  • IMPALA-4383 - Ensure plan fragment report thread is always started
  • IMPALA-4384 - NPE when cols list has trailing comma
  • IMPALA-4388 - Fix query option reset in tests
  • IMPALA-4391 - fix dropped statuses in scanners
  • IMPALA-4392 - restore PeakMemoryUsage to DataSink profiles
  • IMPALA-4397 - addendum: remove stray semicolon
  • IMPALA-4397 - IMPALA-3259: reduce codegen time and memory
  • IMPALA-4403 - Implement SHOW RANGE PARTITIONS for Kudu tables
  • IMPALA-4406 - Add cryptography export control notice
  • IMPALA-4408 - Omit null bytes for Kudu scans with no nullable slots.
  • IMPALA-4409 - respect lock order in QueryExecState::CancelInternal()
  • IMPALA-4410 - Safer tear-down of RuntimeState
  • IMPALA-4411 - Kudu inserts violate lock ordering and could deadlock
  • IMPALA-4412 - Per operator timing in profile summary is incorrect when mt_dop > 0
  • IMPALA-4415 - Fix unassigned scan range of size 1
  • IMPALA-4421 - Send custom cluster & process failure test results to logs/
  • IMPALA-4427 - leopard: make DOCKER_IMAGE_NAME required
  • IMPALA-4432 - Handle internal codegen disabling properly
  • IMPALA-4433 - Always generate testdata using the same time zone setting
  • IMPALA-4434 - In Python, ''.split('\n') is [''], which has length 1
  • IMPALA-4435 - Fix in-predicate-benchmark linking by moving templates
  • IMPALA-4436 - StringValue::StringCompare() should match strncmp()
  • IMPALA-4437 - fix crash in disk-io-mgr
  • IMPALA-4437 - hit DCHECK in buffered-block-mgr-test
  • IMPALA-4438 - Serialize test_failpoints.py to reduce memory pressure
  • IMPALA-4440 - lineage timestamps can go backwards across daylight savings transitions
  • IMPALA-4441 - Divide-by-zero in RuntimeProfile::SummaryStatsCounter::SetStats
  • IMPALA-4442 - Fix FE ParserTests UnsatisfiedLinkError
  • IMPALA-4444 - Transfer row group resources to row batch on scan failure
  • IMPALA-4446 - expr-test fails under ASAN
  • IMPALA-4447 - Rein in overly broad sed that dirties the tree
  • IMPALA-4450 - qgen: use string concatenation operator for postgres queries
  • IMPALA-4452 - Always call AggFnEvaluator::Open() before AggFnEvaluator::Init()
  • IMPALA-4454 - test_kudu.TestShowCreateTable flaky
  • IMPALA-4455 - MemPoolTest.TryAllocateAligned failure: sizeof v. alignof
  • IMPALA-4458 - Fix resource cleanup of cancelled mt scan nodes.
  • IMPALA-4461 - Make sure data gets loaded for wide hbase tables.
  • IMPALA-4465 - Don't hold process wide lock while serializing Runtime Profile in GetRuntimeProfileStr()
  • IMPALA-4466 - Improve Kudu CRUD test coverage
  • IMPALA-4470 - Avoid creating a NumericLiteral from NaN/infinity/-0
  • IMPALA-4476 - Use unique_database to stop races in test_udfs.py
  • IMPALA-4477 - Bump Kudu version to latest master
  • IMPALA-4477 - Upgrade Kudu version to latest master
  • IMPALA-4478 - Initial Kudu client mem tracking for sink
  • IMPALA-4479 - Use correct isSet() thrift function when evaluating constant bool exprs
  • IMPALA-4480 - zero_length_region_ must be as aligned as max_align_t
  • IMPALA-4488 - HS2 GetOperationStatus() should keep session alive
  • IMPALA-4490 - Only generate runtime filters for hash join nodes.
  • IMPALA-4493 - fix string-compare-test when using clang
  • IMPALA-4494 - Fix crash in SimpleScheduler
  • IMPALA-4497 - Fix Kudu client crash w/ SASL initialization
  • IMPALA-4498 - crash in to_utc_timestamp/from_utc_timestamp
  • IMPALA-4502 - test_partition_ddl_predicates breaks on non-HDFS filesystems
  • IMPALA-4504 - fix races in PlanFragmentExecutor regarding status reporting
  • IMPALA-4509 - Initialise Sasl-specific mutex
  • IMPALA-4510 - Selectively filter args for metric verification tests
  • IMPALA-4511 - Add missing total_time_counter() to PFE::Exec()
  • IMPALA-4512 - Add a script that builds Impala on stock Ubuntu 14.04
  • IMPALA-4514 - Fix broken exhaustive builds caused by non-nullable columns
  • IMPALA-4516 - Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries
  • IMPALA-4518 - CopyStringVal() doesn't copy null string
  • IMPALA-4519 - increase timeout in TestFragmentLifecycle
  • IMPALA-4522 - Bound Kudu client threads to avoid stress crash
  • IMPALA-4523 - Correct max VARCHAR size to 65535 (2^16 - 1).
  • IMPALA-4525 - follow-on: cleanup error handling
  • IMPALA-4525 - fix crash when codegen mem limit exceeded
  • IMPALA-4527 - Columns in Kudu tables created from Impala default to "NULL"
  • IMPALA-4529 - speed up parsing of identifiers
  • IMPALA-4532 - Fix use-after-free in ProcessBuildInputAsync()
  • IMPALA-4535 - Remove 'auto' from parameter list
  • IMPALA-4539 - fix bug when scratch batch references I/O buffers
  • IMPALA-4540 - Function call in DCHECK crashes scheduler
  • IMPALA-4541 - fix test dimensions for test_codegen_mem_limit
  • IMPALA-4542 - Fix use-after-free in some BE tests
  • IMPALA-4550 - Fix CastExpr analysis for substituted slots
  • IMPALA-4553 - ntpd must be synchronized for kudu to start.
  • IMPALA-4554 - fix projection of nested collections with mt_dop > 0
  • IMPALA-4557 - Fix flakiness with FLAGS_stress_free_pool_alloc
  • IMPALA-4561 - Replace DISTRIBUTE BY with PARTITION BY in CREATE TABLE
  • IMPALA-4562 - Fix for crash on kerberized clusters w/o Kudu support
  • IMPALA-4564 - IMPALA-4565: mt_dop fixes for old aggs and joins
  • IMPALA-4566 - Kudu client glog contention can cause timeouts
  • IMPALA-4567 - Fix test_kudu_alter_table exhaustive failures
  • IMPALA-4570 - shell tarball breaks with certain setuptools versions
  • IMPALA-4571 - Push IN predicates to Kudu
  • IMPALA-4572 - Run COMPUTE STATS on Parquet tables with MT_DOP=4
  • IMPALA-4574 - Do not treat UUID() like a constant expr
  • IMPALA-4577 - Adjust maximum size of row batch queue with MT_DOP
  • IMPALA-4578 - Pick up bound predicates for Kudu scan nodes.
  • IMPALA-4579 - SHOW CREATE VIEW fails for view containing a subquery
  • IMPALA-4580 - Fix crash with FETCH_FIRST when #rows < result cache size
  • IMPALA-4584 - Make alter table operations on Kudu tables synchronous
  • IMPALA-4585 - Allow the $DATABASE template in the CATCH section
  • IMPALA-4586 - don't constant fold in backend
  • IMPALA-4592 - Improve error msg for non-deterministic predicates
  • IMPALA-4594 - WriteSlot and CodegenWriteSlot handle escaped NULL slots differently
  • IMPALA-4595 - Ignore discarded functions after linking
  • IMPALA-4608 - Fix fragment completion times for INSERTs
  • IMPALA-4609 - prefix thread counters in fragment profile
  • IMPALA-4613 - Make sure timers are finished before sending report profile
  • IMPALA-4614 - Set eval cost of timestamp literals.
  • IMPALA-4619 - Allow NULL as default value in Kudu tables
  • IMPALA-4628 - Disable broken kudu test to unblock GVOs
  • IMPALA-4630 - remove debug webpage easter egg
  • IMPALA-4633 - Change broken gflag default for Kudu client mem
  • IMPALA-4636 - Correct Suse Linux distro string
  • IMPALA-4636 - Add support for SLES12 for Kudu integration
  • IMPALA-4638 - Run queries with MT_DOP through admission control
  • IMPALA-4642 - Fix TestFragmentLifecycle failures; kudu test must wait
  • IMPALA-4654 - KuduScanner must return when ReachedLimit()
  • IMPALA-4659 - fuzz test fixes
  • IMPALA-4739 - ExprRewriter fails on HAVING clauses
  • IMPALA-4765 - Avoid using several loading threads on one table
  • IMPALA-4768 - Improve logging of table loading
  • IMPALA-4262 - LZO-scanner fails when reading large index files from S3
  • IMPALA-4277 - Merge "build against hadoop components in different location" into cdh5-trunk
  • IMPALA-4277 - build against hadoop components in different location
  • IMPALA-4322 - test_scanners_fuzz.py hits a DCHECK
  • IMPALA-4391 - fix dropped status in scanners
  • OOZIE-2194 - oozie job -kill doesn't work with spark action
  • OOZIE-2225 - Fix test failures for
  • OOZIE-2225 - Add wild card filter for gathering jobs
  • OOZIE-2273 - MiniOozie does not work outside of Oozie
  • OOZIE-2430 - Amend: Add root logger for hive,sqoop action
  • OOZIE-2430 - Add root logger for hive,sqoop action
  • OOZIE-2471 - Show child job url tab for distcp
  • OOZIE-2503 - Show ChildJobURLs for spark action
  • OOZIE-2517 - Add support for startCreatedTime and endCreatedTime filters for coord and bundles
  • OOZIE-2520 - SortBy filter for ordering the jobs query results
  • OOZIE-2547 - Add mapreduce.job.cache.files to spark action
  • OOZIE-2552 - Update ActiveMQ version for security and other fixes
  • OOZIE-2563 - Pass spark-defaults.conf to spark action
  • OOZIE-2569 - Adding yarn-site, core-site, hdfs-site and mapred-site into spark launcher
  • OOZIE-2606 - Set spark.yarn.jars to fix Spark 2.0 with Oozie
  • OOZIE-2621 - Use hive-exec-<version>-core instead of hive-exec in oozie-core
  • OOZIE-2658 - --driver-class-path can overwrite the classpath in SparkMain
  • OOZIE-2678 - Oozie job -kill doesn't work with tez jobs
  • OOZIE-2705 - Oozie Spark action ignores spark.executor.extraJavaOptions and spark.driver.extraJavaOptions
  • OOZIE-2731 - Set yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage to a higher value in tests
  • PARQUET-358 - Add support for Avro's logical types API
  • PARQUET-415 - Fix ByteBuffer Binary serialization
  • PIG-4052 - TestJobControlSleep, TestInvokerSpeed are unreliable
  • PIG-5025 - Fix flaky test failures in TestLoad.java
  • SENTRY-1260 - Improve error handling when ArrayIndexOutOfBoundsException in PathsUpdate.parsePath causes MetastoreCacheInitializer initialization to fail
  • SENTRY-1270 - Improve error handling - Database with malformed URI causes NPE in HMS plugin during DDL
  • SENTRY-1489 - Categorize end-to-end (e2e) tests into slow and regular tests (adapting tests for timeout)
  • SENTRY-1497 - Create a Sentry scale test tool to add various objects and privileges into Sentry and HMS
  • SPARK-1239 - Improve fetching of map output statuses
  • SPARK-6005 - [TESTS] Fix flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery
  • SPARK-8425 - Add blacklist mechanism for task scheduling
  • SPARK-10722 - RDDBlockId not found in driver-heartbeater
  • SPARK-11301 - [SQL] Fix case sensitivity for filter on partitioned col…
  • SPARK-11327 - [MESOS] spark-dispatcher doesn't pass along some spark properties
  • SPARK-11507 - [MLLIB] Add compact in Matrices fromBreeze
  • SPARK-11515 - [ML] QuantileDiscretizer should take random seed
  • SPARK-11624 - [SPARK-11972][SQL] fix commands that need hive to exec
  • SPARK-11823 - Ignores HiveThriftBinaryServerSuite's test jdbc cancel
  • SPARK-11823 - [SQL] Fix flaky JDBC cancellation test in HiveThriftBinaryServerSuite
  • SPARK-12006 - GaussianMixture.train crashes if an initial model is not None
  • SPARK-12316 - Wait a minute to avoid cycle calling
  • SPARK-12447 - [YARN] Only update the states when executor is successfully launched
  • SPARK-12655 - [GRAPHX] GraphX does not unpersist RDDs
  • SPARK-12672 - Revert "[STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url."
  • SPARK-12672 - [STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url
  • SPARK-12712 - Fix failure in ./dev/test-dependencies when run against empty .m2 cache
  • SPARK-12874 - [ML] ML StringIndexer does not protect itself from column name duplication
  • SPARK-13023 - [PROJECT INFRA][FOLLOWUP][BRANCH-1.6] Unable to check `root` module ending up failure of Python tests
  • SPARK-13023 - [PROJECT INFRA][BRANCH-1.6] Fix handling of root module in modules_to_test()
  • SPARK-13112 - [CORE] Make sure RegisterExecutorResponse arrive before LaunchTask
  • SPARK-13207 - [SQL][BRANCH-1.6] Make partitioning discovery ignore _SUCCESS files
  • SPARK-13327 - [SPARKR] colnames()<- allows invalid column names
  • SPARK-13410 - [SQL] Support unionAll for DataFrames with UDT columns
  • SPARK-13439 - [MESOS] Document that spark.mesos.uris is comma-separated
  • SPARK-13444 - [MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames
  • SPARK-13444 - Revert "[MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames"
  • SPARK-13444 - [MLLIB] QuantileDiscretizer chooses bad splits on large DataFrames
  • SPARK-13454 - [SQL] Allow users to drop a table with a name starting with an underscore
  • SPARK-13465 - Add a task failure listener to TaskContext
  • SPARK-13473 - [SQL] Don't push predicate through project with nondeterministic field
  • SPARK-13474 - [PROJECT INFRA] Update packaging scripts to push artifacts to home.apache.org
  • SPARK-13475 - [TESTS][SQL] HiveCompatibilitySuite should still run in PR builder even if a PR only changes sql/core
  • SPARK-13482 - [MINOR][CONFIGURATION] Make consistency of the configuration named in TransportConf
  • SPARK-13519 - [CORE] Driver should tell Executor to stop itself when cleaning executor's state
  • SPARK-13522 - [CORE] Fix the exit log place for heartbeat
  • SPARK-13522 - [CORE] Executor should kill itself when it's unable to heartbeat to driver more than N times
  • SPARK-13566 - [CORE] Avoid deadlock between BlockManager and Executor Thread
  • SPARK-13599 - [BUILD] remove transitive groovy dependencies from spark-hive and spark-hiveserver
  • SPARK-13601 - [TESTS] use 1 partition in tests to avoid race conditions
  • SPARK-13601 - Call failure callbacks before writer.close()
  • SPARK-13631 - [CORE] Thread-safe getLocationsWithLargestOutputs
  • SPARK-13642 - [YARN][1.6-BACKPORT] Properly handle signal kill in ApplicationMaster
  • SPARK-13648 - Add Hive Cli to classes for isolated classloader
  • SPARK-13697 - [PYSPARK] Fix the missing module name of TransformFunctionSerializer.loads
  • SPARK-13705 - [DOCS] UpdateStateByKey Operation documentation incorrectly refers to StatefulNetworkWordCount
  • SPARK-13711 - [CORE] Don't call SparkUncaughtExceptionHandler in AppClient as it's in driver
  • SPARK-13755 - Escape quotes in SQL plan visualization node labels
  • SPARK-13760 - Revert "[SQL] Fix BigDecimal constructor for FloatType"
  • SPARK-13760 - [SQL] Fix BigDecimal constructor for FloatType
  • SPARK-13772 - [SQL] Fix data type mismatch for decimal
  • SPARK-13803 - restore the changes in SPARK-3411
  • SPARK-13806 - [SQL] fix rounding mode of negative float/double
  • SPARK-13845 - [CORE][BACKPORT-1.6] Using onBlockUpdated to replace onTaskEnd avoiding driver OOM
  • SPARK-13901 - [CORE] correct the logDebug information when jumping to the next locality level
  • SPARK-13958 - Executor OOM due to unbounded growth of pointer array in…
  • SPARK-14006 - [SPARKR] Fix SparkR lint-r test errors in branch-1.6
  • SPARK-14058 - [PYTHON] Incorrect docstring in Window.order
  • SPARK-14074 - [SPARKR] Specify commit sha1 ID when using install_github to install intr package
  • SPARK-14107 - [PYSPARK][ML] Add seed as named argument to GBTs in pyspark
  • SPARK-14138 - [SQL] Fix generated SpecificColumnarIterator code can exceed JVM size limit for cached DataFrames
  • SPARK-14149 - Log exceptions in tryOrIOException
  • SPARK-14159 - [ML] StringIndexerModel sets output column metadata incorrectly
  • SPARK-14187 - [MLLIB] Fix incorrect use of binarySearch in SparseMatrix
  • SPARK-14204 - [SQL] Register driverClass rather than user-specified class
  • SPARK-14219 - [GRAPHX] Fix `pickRandomVertex` not to fall into infinit…
  • SPARK-14232 - [WEBUI] Fix event timeline display issue when an executor is removed with a multiple line reason
  • SPARK-14243 - [CORE][BACKPORT-1.6] update task metrics when removing blocks
  • SPARK-14261 - [SQL] Memory leak in Spark Thrift Server
  • SPARK-14298 - [ML][MLLIB] LDA should support disable checkpoint
  • SPARK-14322 - [MLLIB] Use treeAggregate instead of reduce in OnlineLDAOptimizer
  • SPARK-14357 - [CORE] Properly handle the root cause being a commit denied exception
  • SPARK-14368 - [PYSPARK] Support python.spark.worker.memory with upper-case unit
  • SPARK-14454 - [1.6] Better exception handling while marking tasks as failed
  • SPARK-14468 - Always enable OutputCommitCoordinator
  • SPARK-14495 - [SQL][1.6] fix resolution failure of having clause with distinct aggregate function
  • SPARK-14544 - [SQL] improve performance of SQL UI tab
  • SPARK-14563 - [ML] use a random table name instead of __THIS__ in SQLTransformer
  • SPARK-14618 - [ML][DOC] Updated RegressionEvaluator.metricName param doc
  • SPARK-14665 - [ML][PYTHON] Fixed bug with StopWordsRemover default stopwords
  • SPARK-14671 - [ML] Pipeline setStages should handle subclasses of PipelineStage
  • SPARK-14757 - [SQL] Fix nullability bug in EqualNullSafe codegen
  • SPARK-14787 - [SQL] Upgrade Joda-Time library from 2.9 to 2.9.3
  • SPARK-14897 - [CORE] Upgrade Jetty to latest version of 8
  • SPARK-14965 - [SQL] Indicate an exception is thrown for a missing struct field
  • SPARK-15062 - [SQL] Backport fix list type infer serializer issue
  • SPARK-15091 - [SPARKR] Fix warnings and a failure in SparkR test cases with testthat version 1.0.1
  • SPARK-15209 - Fix display of job descriptions with single quotes in web UI timeline
  • SPARK-15223 - [DOCS] fix wrongly named config reference
  • SPARK-15260 - Atomically resize memory pools
  • SPARK-15262 - Synchronize block manager / scheduler executor state
  • SPARK-15395 - Revert "[CORE] Use getHostString to create RpcAddress (backport for 1.6)"
  • SPARK-15395 - [CORE] Use getHostString to create RpcAddress
  • SPARK-15528 - [SQL] Fix race condition in NumberConverter
  • SPARK-15541 - Casting ConcurrentHashMap to ConcurrentMap
  • SPARK-15601 - [CORE] CircularBuffer's toString() to print only the contents written if buffer isn't full
  • SPARK-15606 - [CORE] Use non-blocking removeExecutor call to avoid deadlocks
  • SPARK-15613 - [SQL] Fix incorrect days to millis conversion due to Daylight Saving Time
  • SPARK-15613 - Revert "[SQL] Fix incorrect days to millis conversion due to Daylight Saving Time"
  • SPARK-15613 - [SQL] Fix incorrect days to millis conversion due to Daylight Saving Time
  • SPARK-15723 - Fixed local-timezone-brittle test where short-timezone form "EST" is …
  • SPARK-15736 - [CORE][BRANCH-1.6] Gracefully handle loss of DiskStore files
  • SPARK-15761 - [MLLIB][PYSPARK] Load ipython when default python is Python3
  • SPARK-15827 - [BUILD] Publish Spark's forked sbt-pom-reader to Maven Central
  • SPARK-15891 - Make YARN logs less noisy
  • SPARK-15891 - [YARN] Clean up some logging in the YARN AM.
  • SPARK-15892 - [ML] Backport correctly merging AFTAggregators to branch 1.6
  • SPARK-15892 - Revert "[ML] Incorrectly merged AFTAggregator with zero total count"
  • SPARK-15892 - [ML] Incorrectly merged AFTAggregator with zero total count
  • SPARK-15915 - [SQL] Logical plans should use subqueries eliminated plan when override sameResult.
  • SPARK-15975 - Fix improper Popen retcode handling in dev/run-tests
  • SPARK-16035 - [PYSPARK] Fix SparseVector parser assertion for end parenthesis
  • SPARK-16044 - [SQL] Backport input_file_name() for data source based on NewHadoopRDD to branch 1.6
  • SPARK-16077 - [PYSPARK] catch the exception from pickle.whichmodule()
  • SPARK-16078 - [SQL] Backport: from_utc_timestamp/to_utc_timestamp should not depend on local timezone
  • SPARK-16086 - [SQL] fix Python UDF without arguments
  • SPARK-16148 - [SCHEDULER] Allow for underscores in TaskLocation in the Executor ID
  • SPARK-16173 - [SQL] Can't join describe() of DataFrame in Scala 2.10
  • SPARK-16182 - [CORE] Utils.scala -- terminateProcess() should call Process.destroyForcibly() if and only if Process.destroy() fails
  • SPARK-16193 - [TESTS] Address flaky ExternalAppendOnlyMapSuite spilling tests
  • SPARK-16214 - [EXAMPLES] fix the denominator of SparkPi
  • SPARK-16230 - [CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor
  • SPARK-16257 - [BUILD] Update spark_ec2.py to support Spark 1.6.2 and 1.6.3.
  • SPARK-16313 - [SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing
  • SPARK-16329 - [SQL][BACKPORT-1.6] Star Expansion over Table Containing No Column
  • SPARK-16353 - [BUILD][DOC] Missing javadoc options for java unidoc
  • SPARK-16372 - Revert "[MLLIB] Retag RDD to tallSkinnyQR of RowMatrix"
  • SPARK-16372 - [MLLIB] Retag RDD to tallSkinnyQR of RowMatrix
  • SPARK-16375 - [WEB UI] Fixed misassigned var: numCompletedTasks was assigned to numSkippedTasks
  • SPARK-16385 - [CORE] Catch correct exception when calling method via reflection
  • SPARK-16409 - [SQL] regexp_extract with optional groups causes NPE
  • SPARK-16414 - [YARN] Fix bugs for "Can not get user config when calling SparkHadoopUtil.get.conf on yarn cluster mode"
  • SPARK-16440 - [MLLIB] Destroy broadcasted variables even on driver
  • SPARK-16440 - [MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for long runs
  • SPARK-16488 - Fix codegen variable namespace collision in pmod and partitionBy
  • SPARK-16489 - [SQL] Guard against variable reuse mistakes in expression code generation
  • SPARK-16514 - [SQL] Fix various regex codegen bugs
  • SPARK-16533 - [CORE] resolve deadlocking in driver when executors die
  • SPARK-16656 - [SQL][BRANCH-1.6] Try to make CreateTableAsSelectSuite more stable
  • SPARK-16664 - [SQL] Fix persist call on Data frames with more than 200…
  • SPARK-16664 - Revert "[SQL] Fix persist call on Data frames with more than 200…"
  • SPARK-16664 - [SQL] Fix persist call on Data frames with more than 200…
  • SPARK-16751 - [HOTFIX] Also update hadoop-1 deps file to reflect derby 10.12.1.1 security fix
  • SPARK-16751 - Upgrade derby to 10.12.1.1
  • SPARK-16796 - [WEB UI] Visible passwords on Spark environment page
  • SPARK-16831 - Revert "[PYTHON] Fixed bug in CrossValidator.avgMetrics"
  • SPARK-16831 - [PYTHON] Fixed bug in CrossValidator.avgMetrics
  • SPARK-16873 - [CORE] Fix SpillReader NPE when spillFile has no data
  • SPARK-16925 - Master should call schedule() after all executor exit events, not only failures
  • SPARK-16930 - [YARN] Fix a couple of races in cluster app initialization.
  • SPARK-16939 - [SQL] Fix build error by using `Tuple1` explicitly in StringFunctionsSuite
  • SPARK-16956 - Make ApplicationState.MAX_NUM_RETRY configurable
  • SPARK-17003 - [BUILD][BRANCH-1.6] release-build.sh is missing hive-thriftserver for scala 2.11
  • SPARK-17027 - Revert "[ML] Avoid integer overflow in PolynomialExpansion.getPolySize"
  • SPARK-17027 - [ML] Avoid integer overflow in PolynomialExpansion.getPolySize
  • SPARK-17038 - [STREAMING] fix metrics retrieval source of 'lastReceivedBatch'
  • SPARK-17102 - [SQL] bypass UserDefinedGenerator for json format check
  • SPARK-17245 - [SQL][BRANCH-1.6] Do not rely on Hive's session state to retrieve HiveConf
  • SPARK-17316 - [CORE] Fix the 'ask' type parameter in 'removeExecutor'
  • SPARK-17316 - [CORE] Make CoarseGrainedSchedulerBackend.removeExecutor non-blocking
  • SPARK-17356 - [SQL][1.6] Fix out of memory issue when generating JSON for TreeNode
  • SPARK-17418 - Prevent kinesis-asl-assembly artifacts from being published
  • SPARK-17465 - [SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
  • SPARK-17485 - Prevent failed remote reads of cached blocks from failing entire job
  • SPARK-17531 - [BACKPORT] Don't initialize Hive Listeners for the Execution Client
  • SPARK-17547 - Ensure temp shuffle data file is cleaned up after error
  • SPARK-17549 - [SQL] Only collect table size stat in driver for cached relation.
  • SPARK-17617 - [SQL] Remainder(%) expression.eval returns incorrect result on double value
  • SPARK-17618 - Fix invalid comparisons between UnsafeRow and other row formats
  • SPARK-17623 - [CORE] Clarify type of TaskEndReason with a failed task.
  • SPARK-17648 - [CORE] TaskScheduler really needs offers to be an IndexedSeq
  • SPARK-17649 - [CORE] Log how many Spark events got dropped in AsynchronousListenerBus
  • SPARK-17675 - [CORE] Expand Blacklist for TaskSets
  • SPARK-17721 - [MLLIB][BACKPORT] Fix for multiplying transposed SparseMatrix with SparseVector
  • SPARK-17850 - [CORE] Add a flag to ignore corrupt files
  • SPARK-17884 - [SQL] To resolve Null pointer exception when casting from empty string to interval type
  • SPARK-18117 - [CORE] Add test for TaskSetBlacklist
  • SPARK-18535 - Redact sensitive information from Spark logs and UI
  • SPARK-18546 - [CORE] Fix merging shuffle spills when using encryption
  • SPARK-18547 - [CORE] Propagate I/O encryption key when executors register
  • SQOOP-2983 - OraOop export has degraded performance with wide tables
  • SQOOP-3013 - Configuration "tmpjars" is not checked for empty strings before passing to MR
  • SQOOP-3028 - Include stack trace in the logging of exceptions in ExportTool
  • SQOOP-3034 - HBase import should fail fast if using anything other than as-textfile
  • SQOOP-3066 - Introduce an option + env variable to enable/disable SQOOP-2737 feature
  • SQOOP-3069 - Get OracleExportTest#testUpsertTestExport in line with SQOOP-3066
  • ZOOKEEPER-1576 - Zookeeper cluster - failed to connect to cluster if one of the provided IPs causes java.net.UnknownHostException
  • ZOOKEEPER-1917 - Apache ZooKeeper logs cleartext admin passwords (addresses CVE-2014-0085)
  • ZOOKEEPER-2402 - Document client side properties