What's New in CDH 5.9.x
What's New in CDH 5.9.2
This is a maintenance release that fixes some important issues. For details, see Issues Fixed in CDH 5.9.2.
What's New in CDH 5.9.1
This is a maintenance release that fixes some important issues. For details, see Issues Fixed in CDH 5.9.1.
What's New in CDH 5.9.0
- CDH 5.9 allows you to use temporary credentials to log in to Amazon S3. You can obtain temporary credentials from Amazon's Security Token Service (STS).
- A tool, org.apache.hadoop.hbase.replication.regionserver.DumpReplicationQueues, has been added to dump existing replication peers, configurations, and queues when using HBase replication. The tool includes two flags:
  - --distributed - Polls each replication server for information about the replication queues being processed on that server. By default, this flag is not enabled, and the information about the replication queues and configuration is obtained from ZooKeeper.
  - --hdfs - When --distributed is used, this flag attempts to calculate the total size of the WAL files used by the replication queues. Because multiple peers can be configured, this value can be overestimated.
For more information, see Class DumpReplicationQueues.
- Metrics have been added that expose the amount of replayed work occurring in the HBase replication system. For more information on these metrics, see Replication Metrics in the Apache HBase Reference Guide.
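As a rough illustration of how the DumpReplicationQueues flags described above combine, the following Python sketch builds the command line for the tool. It assumes the class is launched through the standard `hbase` script; the helper name `dump_queues_command` and the decision to reject `--hdfs` without `--distributed` are choices made for this sketch, not behavior of the tool itself.

```python
import shlex

# Fully qualified class name of the tool, as given in the release note.
DUMP_TOOL = "org.apache.hadoop.hbase.replication.regionserver.DumpReplicationQueues"


def dump_queues_command(distributed=False, hdfs=False, launcher="hbase"):
    """Return the argument list for running DumpReplicationQueues.

    `launcher` is an assumption: the `hbase` script is the usual way to run
    an HBase class by name, but your installation may differ.
    """
    if hdfs and not distributed:
        # The note describes --hdfs only in combination with --distributed,
        # so this sketch refuses the combination that would have no effect.
        raise ValueError("--hdfs requires --distributed")
    argv = [launcher, DUMP_TOOL]
    if distributed:
        argv.append("--distributed")
    if hdfs:
        argv.append("--hdfs")
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(dump_queues_command(distributed=True, hdfs=True)))
```

With no flags, the resulting command reads queue information from ZooKeeper, which matches the default described above.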
HUE-4039: Improves SQL Autocompleter. The new Autocompleter deeply understands Hive and Impala SQL dialects and provides smart suggestions based on your statement structure and cursor position. See how to manually Enable and Disable Autocompleter.
HUE-3877: Adds support for Amazon RDS. You can now deploy Hue against an Amazon RDS database instance with MySQL, PostgreSQL, and Oracle engines.
Rebase of Hue on upstream Hue 3.11.
Apache Impala (incubating)
[IMPALA-3206] Speedup for queries against DECIMAL columns in Avro tables. The code that parses DECIMAL values from Avro now uses native code generation.
[IMPALA-3674] Improved efficiency in LLVM code generation can reduce codegen time, especially for short queries.
[IMPALA-2979] Improvements to scheduling on worker nodes, enabled by the REPLICA_PREFERENCE query option. See REPLICA_PREFERENCE Query Option (CDH 5.9 or higher only) for details.
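As a small sketch of how the REPLICA_PREFERENCE option might be set for a session, the Python helper below builds the corresponding SET statement. The mnemonic values listed (CACHE_LOCAL, DISK_LOCAL, REMOTE) are recalled from the Impala documentation and should be checked against the REPLICA_PREFERENCE page for your release; the helper name is hypothetical.

```python
# Assumed mnemonic values for REPLICA_PREFERENCE; verify them against the
# Impala documentation for your release before relying on this list.
ALLOWED_PREFERENCES = ("CACHE_LOCAL", "DISK_LOCAL", "REMOTE")


def replica_preference_statement(value):
    """Return a SET statement that selects the given replica preference."""
    normalized = value.strip().upper()
    if normalized not in ALLOWED_PREFERENCES:
        raise ValueError(f"unsupported REPLICA_PREFERENCE value: {value!r}")
    return f"SET REPLICA_PREFERENCE={normalized};"


if __name__ == "__main__":
    # The statement can be issued in impala-shell or through any Impala client
    # before running the queries it should affect.
    print(replica_preference_statement("disk_local"))
```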
Improvements to the Impala web user interface:
[IMPALA-3499] Scalability improvements to the catalog server. Impala handles internal communication more efficiently for tables with large numbers of columns and partitions, where the size of the metadata exceeds 2 GiB.
[IMPALA-3677] You can send a SIGUSR1 signal to any Impala-related daemon to write a Breakpad minidump. For advanced troubleshooting, you can now produce a minidump without triggering a crash. See Breakpad Minidumps for Impala (CDH 5.8 or higher only) for details about the Breakpad minidump feature.
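The signal can be sent with the standard `kill` utility or from a script. The following minimal Python sketch shows the scripted form on a Unix host; the function name `request_minidump` is hypothetical, and the process ID must be looked up separately (for example with `pgrep`).

```python
import os
import signal


def request_minidump(pid):
    """Send SIGUSR1 to the process with the given PID.

    Per the note above, an Impala daemon that receives this signal writes a
    Breakpad minidump without crashing. The caller needs permission to signal
    the target process (typically the same user or root).
    """
    os.kill(pid, signal.SIGUSR1)
```

Because SIGUSR1 terminates a process that has no handler installed for it, confirm that the PID belongs to an Impala daemon before sending the signal.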
[IMPALA-3687] The schema reconciliation rules for Avro tables have changed slightly for CHAR and VARCHAR columns. Now, if the definition of such a column is changed in the Avro schema file, the column retains its CHAR or VARCHAR type as specified in the SQL definition, but the column name and comment from the Avro schema file take precedence. See Creating Avro Tables for details about column definitions in Avro tables.
[IMPALA-3575] Some network operations now have additional timeout and retry settings. The extra configuration helps avoid query failures caused by transient network problems, prevents hangs when a sender or receiver fails in the middle of a network transmission, and makes cancellation requests more reliable despite network issues.
Oozie adds a new database tool for migration and upgrade from Apache Derby (or any other supported database). For more information, see How to Use the New Apache Oozie Migration Tool.
- Sentry adds support for securing data on Amazon RDS. As a result, Sentry can now secure URIs with an RDS schema.
- SENTRY-1233 - Logging improvements for SentryConfigToolSolr.
- SENTRY-1119 - Allow data engines to obtain the ActionFactory directly from the configuration, instead of having hardcoded component-specific classes. This will allow external data engines to integrate with Sentry easily.
- SENTRY-1229 - Added a basic configurable cache to SentryGenericProviderBackend.
- You can now set up AWS credentials for Spark with the Hadoop credential provider, to avoid exposing the AWS secret key in configuration files.
- The mainframe import module extension has been added to support data sets on tape.
- The Solr watchdog is now configured to use the fully qualified domain name (FQDN) of the host on which the Solr process is running (instead of 127.0.0.1). You can override this configuration by setting the SOLR_HOSTNAME environment variable to the appropriate value before starting the Solr server.
- Cloudera Search adds support for index snapshots. For more information on how to back up, migrate, or restore your indexed data, see Backing Up and Restoring Cloudera Search.
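For the Spark item above on AWS credentials, the following Python sketch outlines one way the pieces might fit together: `hadoop credential create` stores each S3A key in a credential store, and Spark is pointed at that store through a `spark.hadoop.`-prefixed property. The provider path and helper names are hypothetical, and the use of the `fs.s3a.access.key` and `fs.s3a.secret.key` aliases assumes the S3A connector; adjust both to your environment.

```python
# Aliases under which the S3A connector looks up the AWS keys (assumed here).
S3A_ALIASES = ("fs.s3a.access.key", "fs.s3a.secret.key")


def credential_create_commands(provider_path):
    """Return one `hadoop credential create` command per S3A key alias.

    No -value argument is passed, so the hadoop CLI prompts for each secret
    instead of exposing it on the command line.
    """
    return [
        ["hadoop", "credential", "create", alias, "-provider", provider_path]
        for alias in S3A_ALIASES
    ]


def spark_conf_args(provider_path):
    """Return the spark-submit arguments that point Spark at the store."""
    prop = "spark.hadoop.hadoop.security.credential.provider.path"
    return ["--conf", f"{prop}={provider_path}"]


if __name__ == "__main__":
    # Hypothetical credential store location, used for illustration only.
    path = "jceks://hdfs/user/alice/aws.jceks"
    for command in credential_create_commands(path):
        print(" ".join(command))
    print(" ".join(["spark-submit"] + spark_conf_args(path) + ["..."]))
```

Keeping the keys in a credential store means that job configuration files and submitted command lines contain only the store location, not the secret itself.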