New Features and Changes in Cloudera Manager 6.1.0

Accumulo

Accumulo installations now use the Hadoop Credential Provider to handle sensitive properties, such as the instance secret and the trace user password.

Agents

The Cloudera Manager Admin Console now displays a message if the agent is heartbeating with an invalid CM_GUID.

API endpoints for Roles

For API documentation and the new Swagger-based API client, roles can be accessed from Roles Resource instead of Services Resource. This does not change the roles endpoint and does not impact those accessing the Cloudera Manager API endpoints directly using tools like curl.

Audit Events

Cloudera Manager logs events in the Audits database table when the API is accessed either from the Cloudera Manager Admin Console or from any other client. When the API is accessed at a rapid rate, the Audits database table grows rapidly, negatively impacting Cloudera Manager performance.

Cloudera Manager now coalesces similar audit events that occur during a configurable period into a single audit entry in the Audits database table. This can prevent the Audits table from being filled at a rapid rate. This feature can be configured by setting arguments to CMF_JAVA_OPTS in cloudera-scm-server.properties:
  • com.cloudera.cmf.persist.cmAuditTrackerConfig.timeToLiveMs: Period during which similar audit entries are coalesced into one. Default is 10000 milliseconds. Setting this value to 0 disables this feature entirely.
  • com.cloudera.cmf.persist.cmEventCoalescer.maxTrackedEvents: Maximum number of events that can be candidates for coalescing in a certain period. Default is 1024. If this limit is reached, then the oldest event is removed.
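
The interaction between the two settings can be pictured with a small model. The following Python sketch is an illustration only, not Cloudera Manager's implementation: the class name AuditCoalescer, the method should_write, and the notion of an event "key" are all hypothetical, and the only values taken from the text above are the defaults (10000 ms and 1024) and the two behaviors described (0 disables coalescing, and the oldest tracked event is dropped when the limit is reached).

```python
from collections import OrderedDict


class AuditCoalescer:
    """Illustrative model: coalesce similar events seen within a time window."""

    def __init__(self, time_to_live_ms=10000, max_tracked_events=1024):
        self.time_to_live_ms = time_to_live_ms
        self.max_tracked_events = max_tracked_events
        # Maps an event key to the time (ms) at which its audit entry was written.
        self._tracked = OrderedDict()

    def should_write(self, key, now_ms):
        """Return True if a new audit entry should be written for this event."""
        if self.time_to_live_ms == 0:
            return True  # coalescing is disabled entirely
        written_at = self._tracked.get(key)
        if written_at is not None and now_ms - written_at < self.time_to_live_ms:
            return False  # a similar event was already recorded in this window
        # Start a new window for this key and make it the newest tracked event.
        self._tracked.pop(key, None)
        self._tracked[key] = now_ms
        if len(self._tracked) > self.max_tracked_events:
            self._tracked.popitem(last=False)  # drop the oldest tracked event
        return True
```

In this model, a client that calls the same API endpoint every second produces one audit entry per 10-second window instead of ten.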

Auto-TLS

Certificate Handling

The certmanager can now use the following option to automatically skip invalid certificates and import the rest of the bundle: --skip-invalid-ca-certs. Previously, if one or more of the certificates in a bundle were invalid, then the entire setup operation failed.

Randomization of Sequential Certificate Authority Serial Numbers

Previously, certificates generated by Auto-TLS always started at serial number 0. Now, certificates will start from a random serial number. This affects only new deployments using Auto-TLS. Existing deployments using Auto-TLS are unaffected.

Supported Services

Auto-TLS now supports the following services: Flume, Java Keystore KMS, KeyTrustee server, KeyTrustee KMS, Thales HSM KMS, and Luna HSM KMS. When adding these services while Auto-TLS is enabled, TLS configuration will be added automatically.

Backup and Disaster Recovery (BDR)

Insecure Cluster to Secure Cluster Replication

You can now use BDR to replicate data from an insecure cluster that does not use Kerberos authentication to a secure cluster that uses Kerberos. The reverse is not supported: BDR cannot replicate from a secure cluster to an insecure cluster.

To perform the replication, the destination cluster must be managed by Cloudera Manager 6.1.0 or higher. The source cluster must run Cloudera Manager 5.14.0 or higher in order to replicate to Cloudera Manager 6.

For more information, see Replicating from Insecure to Secure Clusters for Hive or Replicating from Insecure to Secure Clusters for HDFS.

Invalidate Metadata

BDR now issues the Invalidate Metadata command per Impala service after replication. This ensures that if a cluster has multiple Impala services, only the target Impala service's metadata cache is invalidated and needs to be refreshed. Limiting the invalidation this way matters because a metadata refresh can impact performance.

Kudu

BDR now ignores Hive tables backed by Kudu during replication. The change does not affect functionality since BDR does not support replicating Kudu tables. This change was made to guard against data loss due to how the Hive Metastore, Impala, and Kudu interact.

Log Retention

Previously, Cloudera Manager retained logs for BDR Replication jobs indefinitely. Now, Cloudera Manager retains BDR logs for 90 days by default. You can change the number of days that Cloudera Manager retains logs, or disable log retention completely, with the Backup and Disaster Log Retention property of the HDFS Service.

Faster Incremental Replication using HDFS Snapshot-diff Report

This feature compares two HDFS snapshots to reduce the number of files scanned during the copy-listing phase of replication to only those files that have known changes between runs. This can speed up replication dramatically when a large number of files are unchanged between replications.

This feature relies on the immutable snapshot feature of HDFS. It existed in prior releases of CDH and is now on by default in 6.1. You can also configure replication jobs to abort on snapshot diff failure when you create or edit a replication schedule. A snapshot diff failure can happen if files that are in the scope of replication have been added, changed, or deleted on the destination cluster, which is generally unsupported by BDR. However, BDR will fall back to an exhaustive comparison of files in this case, and you can use various options for conflict resolution, such as "delete policy".
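
To illustrate why a snapshot diff shrinks the copy listing, the following Python sketch models each snapshot as a dictionary from file path to a content fingerprint. This is a simplified, hypothetical model: the function names snapshot_diff and files_to_copy are invented for the example, and real HDFS snapshot diff reports are produced by HDFS itself rather than by comparing dictionaries.

```python
def snapshot_diff(old_snapshot, new_snapshot):
    """Compare two snapshots, each a dict of path -> content fingerprint.

    Returns three sorted lists: created, modified, and deleted paths.
    """
    created = sorted(p for p in new_snapshot if p not in old_snapshot)
    deleted = sorted(p for p in old_snapshot if p not in new_snapshot)
    modified = sorted(
        p for p in new_snapshot
        if p in old_snapshot and new_snapshot[p] != old_snapshot[p]
    )
    return created, modified, deleted


def files_to_copy(old_snapshot, new_snapshot):
    """Only created or modified files need to be considered for copying."""
    created, modified, _deleted = snapshot_diff(old_snapshot, new_snapshot)
    return sorted(created + modified)
```

With a million unchanged files and a handful of changed ones, the copy listing in this model contains only the handful, which is the effect the feature aims for.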

See the following pages for guidelines on using snapshot diff-based replication: Hive Guidelines and HDFS Guidelines.

PostgreSQL 10 Support

Added support for PostgreSQL version 10 for databases Cloudera Manager uses to store configuration, monitoring, and reporting data and for managed services that require a database.

Cloudera Express License Enforcement

A Cloudera Express license is valid only when fewer than 100 hosts are used across an organization.

Note the following:
  • Cloudera Manager will not allow you to add hosts to a CDH 6.x cluster if the total number of hosts across all CDH 6.x clusters will exceed 100.
  • Cloudera Manager will not allow you to upgrade any cluster to CDH 6.x if the total number of managed CDH 6.x cluster hosts will exceed 100. If an upgrade from Cloudera Manager 6.0 to 6.1 fails due to this limitation, you must downgrade Cloudera Manager to version 6.0, remove some hosts so that the number of hosts is fewer than 100, then retry the upgrade.

Affected Versions: Cloudera Manager 6.1 and higher

Diagnostic Bundles

The diagnostic bundle has been improved in the following ways:

  • The dmesg host command output collected as part of diagnostic bundles now includes formatted timestamps, if the host operating system supports it.
  • Diagnostic bundles now capture information about all network interfaces on each host, regardless of name.

HBase CDH 5 to CDH 6 upgrade checks for hbase prefix_tree_encoding

Added an upgrade check for upgrades from CDH 5 to CDH 6 that checks whether HBase tables are using PREFIX_TREE_ENCODING and warns the user.

Cloudera Bug: OPSAPS-44701

HDFS

You can now configure nfs.export.point as part of the HDFS configurations for NFSGateway.

Hive

Size of Hive query locks in ZooKeeper

When taking locks on a table, Hive creates a ZooKeeper object for each such lock, which contains the full query string. This query string is only used to display locks with the SHOW LOCKS EXTENDED command. It has no impact on the actual locking process.

However, this often created huge memory pressure on the ZooKeeper instance. For example, for a query string of 1 MB in size, if the locks are acquired on 10000 partitions of a table, then this requires 10 GB of memory on ZooKeeper. To alleviate this pressure, the maximum query length stored in a ZooKeeper lock object has been limited to 10000 characters by default via the hive.locks.query.string.max.length property. To reiterate, this does not affect any behavior except for how queries are displayed in the output of the SHOW LOCKS EXTENDED command. This configuration value can be increased to a maximum of 1 million, which corresponds to the data limit of a znode (1 MB).
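
The arithmetic behind this example can be checked with a short Python sketch. It assumes one byte per character and ignores any other per-znode overhead, and the function name zk_lock_query_bytes is hypothetical.

```python
def zk_lock_query_bytes(query_chars, partitions, max_length=None):
    """Approximate bytes of query text stored across all lock objects.

    Assumes one byte per character. max_length models the
    hive.locks.query.string.max.length limit; None means no truncation.
    """
    stored_chars = query_chars if max_length is None else min(query_chars, max_length)
    return stored_chars * partitions


# 1 MB query, locks on 10000 partitions, no truncation: about 10 GB.
untruncated = zk_lock_query_bytes(1_000_000, 10_000)

# Same query with the default limit of 10000 characters: about 100 MB.
truncated = zk_lock_query_bytes(1_000_000, 10_000, max_length=10_000)
```

Under these assumptions, the default limit reduces the stored query text in this example by a factor of 100.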

Hive Metastore Connection Retries

A new configuration parameter, hive.metastore.connect.retries, has been added for HiveServer2 with an increased default value.

Hue

For RedHat7 and compatible platforms, if Hue uses Postgres (including the Cloudera Manager embedded database for proof-of-concept installations), the appropriate version of psycopg2 will be automatically installed by Cloudera Manager.

Hue Logs

Cloudera Manager can now parse httpd log files, including those used by Hue. These logs are now included in diagnostic bundles and log search, and can be browsed in the Cloudera Manager UI.

Impala

New Impala configuration parameters for idle query timeout and idle session timeout

Cloudera Manager now supports configuring the Impala idle_query_timeout and idle_session_timeout parameters.

New Impala daemon configuration property for JVM heap size

A new Impala Daemon configuration parameter, Java Heap Size of Impala Daemon in Bytes, has been added to configure the JVM heap size. It defaults to 4 GB and, like all memory parameters, may require tuning.

Impalad JVM usage plots are now on the Impala Daemon's role status page

Impala Daemon's JVM Heap Usage plots are now available on the Impala Daemon's Status page on the Cloudera Manager Admin Console.

Cloudera Bug: OPSAPS-47832

Impala Metrics

Impala now exposes additional metrics about the JVM and garbage collection (GC). GC metric charts for the Impala Daemon's embedded JVM now appear on the Impala Daemon's role status page in the Cloudera Manager Admin Console.

Impala Health Checks

Added two new health checks:
  • JVM pause time
  • Maximum capacity for concurrent client connections for the Impala Daemon. You can configure this health check with the Impala Daemon Max Client Connections parameter.

Impala Chart Library

The Impala predefined charts have been updated to include more meaningful metrics and remove rarely used plots.

Impala Resource Pools

Impala resource pools now contain minimum/maximum allowed memory limit (MEM_LIMIT) values for queries submitted to a particular pool. This change also adds validations for those attributes. For more information about these attributes, see IMPALA-7349.

Intel's MKL Repository

The Intel Math Kernel Library (MKL) parcel is now included in the default parcel repositories starting in Cloudera Manager 6.1. This parcel can accelerate certain machine learning workloads. The parcel is available, but not downloaded or activated on clusters by default. Read more about it here: https://software.intel.com/en-us/articles/installing-intel-mkl-cloudera-cdh-parcel

Kafka

Kafka Data Retention Parameter

The Kafka Broker parameter Data Retention Hours (data.retention.hours) was removed from the Cloudera Manager Admin Console. Use the Data Retention Time (data.retention.ms) parameter instead.

Kafka Broker Network Threads Parameter

A new configuration property, num.network.threads, has been added to the Kafka broker configuration parameters. The default value is based on the upstream version.

Kafka Broker Performance Defaults

For CDH 6.1 and higher installations, default values for the following configuration parameters of the Kafka service have been changed based on production recommendations: num.replica.fetchers=4 and num.network.threads=8.

Kafka Metrics

The following metrics have been added for BrokerTopic:
  • kafka_fetch_message_conversions_per_sec
  • kafka_produce_message_conversions_per_sec
  • kafka_replication_bytes_in_per_sec
  • kafka_replication_bytes_out_per_sec
  • kafka_total_fetch_requests_per_sec
  • kafka_total_produce_requests_per_sec
The following metrics have been added for the Controller:
  • kafka_auto_leader_balance_rate_and_time_ms
  • kafka_controlled_shutdown_rate_and_time_ms
  • kafka_controller_change_rate_and_time_ms
  • kafka_isr_change_rate_and_time_ms
  • kafka_leader_and_isr_response_received_rate_and_time_ms
  • kafka_log_dir_change_rate_and_time_ms
  • kafka_manual_leader_balance_rate_and_time_ms
  • kafka_partition_reassignment_rate_and_time_ms
  • kafka_topic_change_rate_and_time_ms
  • kafka_topic_deletion_rate_and_time_ms
  • kafka_controller_state
  • kafka_global_partition_count
  • kafka_global_topic_count
The following metrics have been added for the ReplicaManager:
  • kafka_failed_isr_updates
  • kafka_offline_replica_count
  • kafka_under_min_isr_partition_count
The following metrics have been added for the LogCleaner:
  • kafka_logcleaner_cleaner_recopy_percent
  • kafka_logcleaner_max_buffer_utilization_percent
  • kafka_logcleaner_max_clean_time_secs
  • kafka_logcleaner_max_dirty_percent
  • kafka_logcleaner_time_since_last_run_ms
  • kafka_logcleaner_offline_log_directory_count

Kafka Shutdown and Recovery

The graceful stop timeout of the Kafka service has been increased to 120 seconds, and a new configuration property, num.recovery.threads.per.data.dir, has been added.

JBOD-related metrics

New metrics have been added that show the number of offline log directories and offline partitions in Kafka.

Improved Redaction of Kerberos Credentials

Enhanced the behavior of the Import KDC Account Manager Credentials command. If the command fails, the currently configured redaction policy is now applied to the command's error output. User names and passwords are always redacted from the output.

New cluster Metrics for Cloud storage

The amount of data read and written through S3 and Azure Data Lake storage by MapReduce jobs can now be viewed as cluster metrics, for example s3a_bytes_read and adl_bytes_written.

Cloudera Bug: OPSAPS-44748

Network Performance Inspector

The Network Performance Inspector allows you to examine the latency among the hosts managed by Cloudera Manager. You can use this tool to diagnose latency issues that can significantly affect the performance of workloads such as MapReduce jobs, Spark jobs, and Hive and Impala queries, particularly when using remote storage.

The inspector runs ping commands from each host to all other hosts, and reports the average ping time and packet loss percentage. You can use this information to identify problematic hosts or networking infrastructure issues so that you can take corrective action. You can run the inspector on demand, and it is also available when adding a new cluster. You can also run the inspector using the Cloudera Manager API.
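
As a rough illustration of the two figures the inspector reports, the following Python sketch computes an average ping time and a packet loss percentage from raw samples, then flags host pairs that exceed given thresholds. The function names, the sample format (round-trip times in milliseconds, with None for a lost packet), and the thresholds are assumptions made for this example; they do not describe the inspector's actual output format or its API.

```python
from statistics import mean


def summarize_pings(samples):
    """Return (average_ms, loss_pct) for one source/destination pair.

    samples: round-trip times in milliseconds, with None for each lost packet.
    """
    if not samples:
        raise ValueError("at least one ping sample is required")
    received = [s for s in samples if s is not None]
    loss_pct = 100.0 * (len(samples) - len(received)) / len(samples)
    avg_ms = mean(received) if received else None
    return avg_ms, loss_pct


def problem_pairs(results, max_avg_ms, max_loss_pct):
    """List (source, destination) pairs whose summary exceeds either threshold."""
    flagged = []
    for pair, samples in sorted(results.items()):
        avg_ms, loss_pct = summarize_pings(samples)
        if avg_ms is None or avg_ms > max_avg_ms or loss_pct > max_loss_pct:
            flagged.append(pair)
    return flagged
```

For instance, a pair with samples [1.0, 3.0, None, None] summarizes to an average of 2.0 ms with 50 percent packet loss.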

For more information, see Inspecting Network Performance.

OpenJDK

OpenJDK is now supported for Cloudera Manager and CDH 6.1 and higher.

For more information, see Java Requirements and Manually Migrating from Oracle JDK to OpenJDK.

Sentry

  • A new configuration property has been added to the Sentry service configuration. It enables Sentry OWNER privileges and is disabled by default.
  • A new Sentry configuration property for OWNER privileges has been added, with ALL_WITH_GRANT as the default.

YARN Fair Scheduler Properties

Two existing YARN configuration parameters are now exposed in Cloudera Manager. Fair Scheduler Dynamic Max Assign allows the ResourceManager to allocate up to half the available resources on a node during a node heartbeat, as long as the Fair Scheduler Assign Multiple Tasks setting is true. Its default value is true. Fair Scheduler Max Assign sets the number of containers allocated by the ResourceManager with each node heartbeat, as long as Fair Scheduler Assign Multiple Tasks is true and Fair Scheduler Dynamic Max Assign is false. Its default value is -1, which is equivalent to unlimited. These changes should not affect YARN behavior, because the parameters are only being surfaced in Cloudera Manager and their default values are unchanged.
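
The way the three settings combine can be summarized as a small decision function. The Python sketch below is an illustration of the rules stated above, not the Fair Scheduler's code. The function name and the returned (kind, value) tuples are hypothetical, and the first branch (a single container per heartbeat when Assign Multiple Tasks is false) is an assumption based on general YARN behavior rather than on this document.

```python
def heartbeat_assignment_limit(assign_multiple, dynamic_max_assign=True, max_assign=-1):
    """Describe the per-heartbeat allocation limit as a (kind, value) tuple."""
    if not assign_multiple:
        # Assumption: without Assign Multiple Tasks, one container per heartbeat.
        return ("containers", 1)
    if dynamic_max_assign:
        # Up to half of the node's available resources per heartbeat.
        return ("fraction_of_available_resources", 0.5)
    if max_assign == -1:
        # -1 is equivalent to unlimited; None stands for "no limit" here.
        return ("containers", None)
    return ("containers", max_assign)
```

Max Assign only takes effect in the last two branches, which is why it matters only when Dynamic Max Assign is set to false.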

System User Group Membership

The Host Inspector now displays a warning if certain Linux system users (for example, 'yarn', 'hdfs', 'hue', 'sentry') are not members of a group of the same name. This group membership is required, particularly when Kerberos authentication is enabled. For more information, see Hadoop Users (user:group) and Kerberos Principals.

TLS

You can now specify the TLS cipher suites to exclude for Hadoop with the ssl.server.exclude.cipher.list property.

ZooKeeper

The Enable Kerberos Authentication and Enable Server to Server SASL Authentication settings in ZooKeeper are now linked, because both should be turned on or off together. If either is switched on or off, the other automatically follows.

This change automates steps that were manually required to address CVE-2018-8012.