New Features and Changes in Cloudera Manager 5

The following sections describe what's new and changed in each Cloudera Manager 5 release.

What's New in Cloudera Manager 5

The following sections describe what is new in each Cloudera Manager 5 release.

What's New in Cloudera Manager 5.13.1

Cloudera Manager 5.13.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.13.1.

What's New in Cloudera Manager 5.13.0

  • Cloudera Data Science Workbench

    Cloudera Data Science Workbench is now available as an add-on service for Cloudera Manager. To this end, Cloudera Data Science Workbench is now distributed in a parcel that integrates with Cloudera Manager using a Custom Service Descriptor (CSD). You can use Cloudera Manager to install, upgrade, and monitor Cloudera Data Science Workbench. Additionally, diagnostic data bundles for Cloudera Data Science Workbench can be generated and submitted to Cloudera through Cloudera Manager.

  • Dashboard User Role

    Users assigned the Dashboard User role can perform the following actions:

    • Create, edit, or remove their own dashboards
    • Create new charts or add existing charts to their own dashboards
    • View data in Cloudera Manager
    • View service and monitoring information
  • Impala Query Profiles Time Display

    Impala query profiles downloaded from Cloudera Manager in text format now include a human-readable version for the value of each profile counter instead of the raw value. For example, for time counters before CM 5.13 only the raw nanoseconds value was shown: TotalTime: 492626971556. Now a human-readable value is shown alongside the raw value: TotalTime: 8.2m (492626971556).
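
The time display described above can be sketched as follows. This is an illustrative approximation only: the function name, the unit table, and the one-decimal rounding are assumptions, and the formatting that Impala and Cloudera Manager actually apply may differ.

```python
def format_duration_ns(raw_ns: int) -> str:
    """Render a raw nanosecond counter as '<human-readable> (<raw>)'."""
    # Units from largest to smallest, as (suffix, nanoseconds per unit).
    units = [
        ("h", 3_600_000_000_000),
        ("m", 60_000_000_000),
        ("s", 1_000_000_000),
        ("ms", 1_000_000),
        ("us", 1_000),
    ]
    for suffix, size in units:
        if raw_ns >= size:
            return f"{raw_ns / size:.1f}{suffix} ({raw_ns})"
    # Values below one microsecond are shown in nanoseconds.
    return f"{raw_ns}ns ({raw_ns})"


print("TotalTime:", format_duration_ns(492626971556))
# prints: TotalTime: 8.2m (492626971556)
```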

  • Sentry High Availability

    The Sentry Service now supports High Availability. See Sentry High Availability.

  • Licensing Management Improvements
    • License Information

      A banner now indicates when there are 60, 30, 14, or 0 days left on a license.

    • Downgrading from Cloudera Enterprise to Cloudera Express

      Previously, downgrading from Enterprise to Express required editing the Cloudera Manager database. You can now perform this task with the Downgrade button on the license page.

  • Enable Kerberos Wizard shows warning for hostnames with uppercase letters

    Because Kerberos principal names cannot include uppercase letters, the welcome screen of the Enable Kerberos wizard shows a warning if it detects any hostnames that contain uppercase characters. You can continue with the wizard regardless of this warning. The warning lists up to 10 of the detected hostnames; if more than 10 are detected, the message also states that only 10 are shown.
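
To check your own host list before you start the wizard, you could use a sketch like the following. The function name and the message wording are hypothetical; only the uppercase check and the limit of 10 listed hostnames come from the description above.

```python
def uppercase_hostname_warning(hostnames, limit=10):
    """Return a warning string, or None if no hostname has uppercase letters."""
    flagged = [h for h in hostnames if any(c.isupper() for c in h)]
    if not flagged:
        return None
    # List at most `limit` hostnames, as the wizard warning does.
    message = "Hostnames with uppercase letters: " + ", ".join(flagged[:limit])
    if len(flagged) > limit:
        message += f" (only the first {limit} of {len(flagged)} are shown)"
    return message


print(uppercase_hostname_warning(["Node1.example.com", "node2.example.com"]))
```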

  • New Validations for Hadoop User Impersonation Configuration Properties

    A new validation has been added for the various username configuration properties used by the Service Monitor to ensure that the properties are valid Linux usernames. The properties affected by this check are:
    • HDFS User to Impersonate - HDFS service
    • HBase User to Impersonate - HBase service
    • MapReduce User to Impersonate - MapReduce service
    • YARN Container Usage MapReduce Job User - YARN service

    If these properties are not valid, the respective services will not start.
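
The exact rule that Cloudera Manager applies is not stated here. As a rough pre-check, the following sketch assumes a common Linux username convention (a lowercase letter or underscore first, then lowercase letters, digits, underscores, or hyphens, up to 32 characters). The pattern and the function name are assumptions, not the product's validator.

```python
import re

# Assumed convention for Linux usernames; not taken from Cloudera Manager.
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def invalid_usernames(properties):
    """Return the properties whose value does not look like a Linux username."""
    return {
        name: value
        for name, value in properties.items()
        if not USERNAME_RE.fullmatch(value)
    }


print(invalid_usernames({
    "HDFS User to Impersonate": "hdfs",
    "HBase User to Impersonate": "9hbase",
}))
```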

  • Additions to the Cloudera Manager API
  • New Guardrails for Operating Cloudera Manager at Scale
    • New Validator for Management Roles

      A new validator warns if multiple Management roles are running on the same host when Cloudera Manager manages more than 80 hosts.

    • New Validators for Service Monitor and Host Monitor Memory Allocations

      New validators have been added for the Service Monitor and Host Monitor that give warnings if the heap and non-Java memory for these roles are below the recommended values for the cluster. The recommended values depend on the size of the cluster and on the types of services running in it.
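
As an illustration of the Management roles check described above, the following sketch reports hosts that run more than one Management role once the host count exceeds 80. The function name, the input shape, and the host names are hypothetical; only the 80-host threshold comes from the text.

```python
from collections import defaultdict


def colocated_management_roles(role_hosts, managed_host_count, threshold=80):
    """role_hosts maps a Management role name to the host it runs on."""
    if managed_host_count <= threshold:
        return {}
    by_host = defaultdict(list)
    for role, host in role_hosts.items():
        by_host[host].append(role)
    # Keep only hosts that carry two or more Management roles.
    return {host: sorted(roles) for host, roles in by_host.items() if len(roles) > 1}


roles = {"Service Monitor": "mgmt1", "Host Monitor": "mgmt1", "Events Server": "mgmt2"}
print(colocated_management_roles(roles, managed_host_count=100))
```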

  • Solr Chart Library

    Chart Library for Solr now contains example charts for every metric available for the Solr Service.

  • Improved BDR Performance

    BDR replication performance has been improved by running the first phase of replication on the source cluster (the copy-listing phase, which creates a list of files and folders to be copied). This can dramatically improve performance in scenarios where there is high latency between the source and destination clusters. This feature requires Cloudera Manager 5.13 or higher on both the source and target cluster and can be disabled by setting a feature flag with the API.

  • New Configuration Property for Descriptors

    A new property, scm.server.proxy.timeout, has been added for configuring the Descriptor fetch timeout in the Cloudera Manager Admin Console. This is useful when tuning Cloudera Manager for very large deployments. Previously, this value was configured at the service level in various Advanced Configuration Snippets.

    You can find the property by navigating to Administration > Settings.

    Cloudera Bug: OPSAPS-41578

  • CSD Health Reporting

    Added support for determining the health of a CSD service based on the health of its roles. For more information, see the Health Aggregation section in the CSD Documentation.

  • New Validator for Banned YARN Users

    A new validator has been added to ensure that the Banned System Users list is the same across all YARN NodeManagers when using Kerberos authentication. The YARN service does not start if the validation fails. To see the list of banned users, select the YARN cluster and navigate to Configuration. Search for the banned.users property.

  • Resume Rolling Upgrade

    When a Rolling Restart that runs as part of an upgrade fails because one or more hosts did not restart successfully, you can now fix the problems and resume the rolling restart. Cloudera Manager skips roles that have already restarted successfully, which speeds up retries of rolling restarts on large clusters.

  • Diagnostic Bundles Collection for Upgrade Failures

    During a cluster upgrade, if there is a failure, Cloudera Manager now allows you to send a diagnostic bundle to Cloudera support. The Upgrade Wizard opens the Send Diagnostic Data dialog box with the current cluster name and time duration pre-populated.

  • New Impala metrics for hedged reads, JVM heap usage and connection setup queue size

    New metrics have been added for hedged reads, JVM heap usage of the Catalog Server, and connection setup queue size.

  • Support for Sentry with a highly available Hive Metastore

    It is now possible to use HDFS Sentry Sync when running a Hive Metastore using high availability.

  • New placement rules for CSD Services

    A new placement rule for CSD-based services, alwaysWithAny, has been added to the Service Descriptor Language. When this rule is present, the specified role must always be placed on the same host where the roles specified in the rule are placed. The specified role no longer appears in the wizard when adding this service. Instead, one instance of this role is automatically placed on any host that has at least one of the primary roles. If more than one of the primary roles are placed on the same host, only one instance of this role is automatically placed on that host. At least two unique primary roles should be defined in the alwaysWithAny rule. The alwaysWithAny rule and the alwaysWith rule are mutually exclusive and should not be defined together for the same role. If a user assigns roles in a way that violates this placement rule, the service shows a configuration error and fails to start. See the Service Descriptor Language Reference.
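
The placement behavior described above can be modeled with a short sketch. It is not Service Descriptor Language syntax; the function name, the role names, and the host names are hypothetical and serve only to show which hosts receive an instance of the dependent role.

```python
def place_always_with_any(assignments, primary_roles):
    """Return the hosts that each get exactly one instance of the dependent role.

    assignments maps a host name to the set of role names placed on that host.
    """
    primaries = set(primary_roles)
    if len(primaries) < 2:
        raise ValueError("alwaysWithAny expects at least two unique primary roles")
    # A host qualifies if it runs at least one primary role; a host that runs
    # several primary roles still appears once, so it gets a single instance.
    return sorted(host for host, roles in assignments.items() if roles & primaries)


assignments = {
    "host1": {"MASTER", "WORKER"},
    "host2": {"WORKER"},
    "host3": {"GATEWAY"},
}
print(place_always_with_any(assignments, ["MASTER", "WORKER"]))
# prints: ['host1', 'host2']
```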

  • Navigator search and tagging are now on by default

    The Navigator search and tagging features of Hue are now enabled by default when adding a new Hue service to a cluster running CDH 5.12 or higher.

  • Add protocol, accept_count, and acceptor_thread_count parameters to LUNA_KMS and THALES_KMS CSDs

    New performance tuning parameters related to Tomcat have been exposed within the KMS services designed to work with the Luna and Thales Hardware Security Modules (HSM). These parameters take effect only when running CDH 5.12.1 and higher. The following parameters were added:
    • protocol
    • accept_count
    • acceptor_thread_count

What's New in Cloudera Manager 5.12.1

Cloudera Manager 5.12.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.12.1.

What's New in Cloudera Manager 5.12.0

  • Backup and Disaster Recovery
    • Refreshing Impala metadata during replication

      You can now use an option in the Cloudera Manager Admin Console to configure BDR to automatically refresh Impala’s metadata cache in the destination cluster during replication. Previously, this feature required an Advanced Configuration Snippet (Safety Valve). See Invalidating Impala Metadata.

    • Automatic renewal of Kerberos tickets and Delegation Tokens

      Previously, BDR replication jobs would fail on a Kerberized cluster if the job duration was longer than the renewal interval for the HDFS delegation token. With this fix, both the delegation token and Kerberos ticket are renewed until the max lifetime of token/ticket (default value is 7 days). This enables longer replications without needing to bring down the source cluster to change the ticket timeout.

    • Streamlined Kerberos Configuration
      • As part of Test Connectivity for peers, Cloudera Manager now tests for properly configured Kerberos authentication on the source and destination clusters. Test Connectivity runs automatically when you add a peer for replication, or you can manually initiate Test Connectivity from the Actions menu. This feature is available when the source and destination clusters run Cloudera Manager 5.12 or higher. See Enabling Replication Between Clusters with Kerberos Authentication.
      • If Cloudera Manager is managing the Kerberos configuration (krb5.conf) for your clusters, BDR can automatically make some required changes to your Kerberos configuration based on issues found during the Test Connectivity action.
      • The configuration process for adding peers when using Kerberized clusters is simplified if both the source and target clusters use Cloudera Manager 5.12 or later. Now, you only need to set up trust on the target cluster and not the source, reducing the complexity of enabling Hive Replication. See Enabling Replication Between Clusters with Kerberos Authentication.
    • Add a name and description to replication schedules

      When you create or edit a replication schedule, you can add a name on the General tab and add a description on the Advanced tab.

  • Hive Metastore Schema Integrity Checker

    Cloudera Manager now uses the Hive Metastore schemaTool for validating the integrity of Hive metadata. When you upgrade a cluster that contains a Hive Service to CDH 5.12 or higher using the Cloudera Manager Upgrade Wizard or command line, before upgrading the Hive metastore schema, Cloudera Manager first runs a validation check to detect any corruption. If the validation check fails, Cloudera Manager displays the error and stops the upgrade. Corruption issues should be resolved before proceeding with the upgrade.

  • Support for HSM Key Provider

    The HDFS Encryption Wizard in Cloudera Manager now supports configuration of the Hardware Security Module (HSM) Key Providers supported by CDH 5.12 for encryption key management.

  • Sending Diagnostic Bundles

    The user interface in the Cloudera Manager Admin Console for collecting and sending diagnostic bundles has been improved. Regardless of how diagnostic data collection is configured before you start, each time you create a bundle, you can now select one of the following options: Collect and Upload Diagnostic Data to Cloudera Support or Collect Diagnostics Data only. Additionally, the Cloudera Manager Admin Console better indicates the status of the bundle; for example, it shows whether the bundle was successfully sent to Cloudera.

  • Delete Kerberos Service Principals

    You can now delete MIT Kerberos or Active Directory Service Principals that were previously created by Cloudera Manager while Kerberizing a cluster using the delete_credentials API.

  • HBase Region in Transition Health Check

    Cloudera Manager now performs a health check to detect whether HBase regions have become stuck in transition during splitting and merging operations.

  • Replication factor for MapReduce job submission files

    New auto-configuration logic for MR1 and MR2's Submit Replication Factor property attempts to choose a value that is at least the value of the HDFS Replication Factor for clusters with three or more DataNodes. Additionally, for clusters with at least three DataNodes, a new configuration validator raises a configuration warning if the existing Submit Replication Factor is lower than the HDFS Replication Factor.
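
The validator condition described above amounts to a simple comparison. The following sketch is illustrative only; the function name and the message text are hypothetical.

```python
def submit_replication_warning(submit_rf, hdfs_rf, datanode_count):
    """Return a warning if the submit replication factor looks too low."""
    # The check applies only to clusters with at least three DataNodes.
    if datanode_count >= 3 and submit_rf < hdfs_rf:
        return (
            f"Submit Replication Factor ({submit_rf}) is lower than "
            f"HDFS Replication Factor ({hdfs_rf})"
        )
    return None


print(submit_replication_warning(submit_rf=2, hdfs_rf=3, datanode_count=5))
```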

  • Custom Header Color

    You can customize the header color that Cloudera Manager displays in the web UI. Select Administration > Settings. Select Other for the Category and use the drop-down menu for Custom Header Color.

  • Dynamic Resource Pools UI

    The Dynamic Resource Pools user interface now displays Access Control information about resource pools, showing whether they are freely usable, restricted to a custom set of users/groups, or inherit ACLs from their parent pool.

  • Example Impala Shell Command

    The Impala Service Status Page now includes an example Impala Shell Command.

  • Configurable S3 Endpoint

    The S3 Connector service now allows you to configure the default S3 endpoint used by HDFS clients (including Hive and Impala), ensuring all S3 data created/accessed by your cluster is (by default) stored in the AWS region of your choice. Additionally, Hue is configured to automatically use the default endpoint as the S3 Connector.

  • Solr
    • Request Rate and Index Size Charts

      The graphs on the Solr status page now include the request rates against the service and the aggregate size of the indices.

    • New Tags in Solr Logs

      The logging for Solr has been improved. Logs now include the following IDs: thread, shard, replica, and collection.

What's New in Cloudera Manager 5.11.2

Cloudera Manager 5.11.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.11.2.

  • New Tags in Solr Logs

    The logging for Solr has been improved. Logs now include the following IDs: thread, shard, replica, and collection.

What's New in Cloudera Manager 5.11.1

Cloudera Manager 5.11.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.11.1.

What's New in Cloudera Manager 5.11.0

  • Amazon S3
    • Amazon S3 Consistency with Metadata Caching (S3Guard)

      Data written to Amazon S3 buckets is subject to the "eventual consistency" guarantee provided by S3, which means that data written to S3 may not be immediately available for queries and listing operations. This can cause failures in multi-step ETL workflows, where data from a previous step is not available to the next step. To mitigate these consistency issues you can now configure metadata caching for data stored in Amazon S3 using S3Guard. Some workloads that access S3 may also see modest performance improvements with metadata caching. S3Guard requires that you provision a DynamoDB database from Amazon Web Services and configure S3Guard using the Cloudera Manager Admin Console or command-line tools. See Configuring and Managing S3Guard.

  • Operating System Support
    • SLES 12 SP2 Support

      SLES 12 SP2 is now supported in Cloudera Manager and CDH 5.11 and higher.

    • Mixed Operating system support for gateway hosts running Cloudera Data Science Workbench

      A Gateway host that is dedicated to running Cloudera Data Science Workbench can use RHEL/CentOS 7.2 even if the remaining hosts in your cluster are running any of the other supported operating systems. All hosts must run the same version of the Oracle JDK.

  • Backup and Disaster Recovery
    • Refreshing Impala metadata during replication

      You can now configure Hive/Impala replication jobs to run the INVALIDATE METADATA Impala statement in the destination cluster automatically at the end of the replication process, allowing newly replicated data to be immediately queried by Impala. See Invalidating Impala Metadata.

    • Hive Replication to Amazon S3 now supported for regions that support only Signature Version 4 signing protocol

      Replications from Hive or HBase to Amazon S3 are now supported for S3 regions that support only Amazon's Signature Version 4 signing protocol. You must add the fs.s3a.endpoint property to the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml and set its value to the endpoint of the Amazon S3 region. For example:

      <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.us-east-2.amazonaws.com</value>
      </property>

      You can access this property in Cloudera Manager at Home > Configuration > Advanced Configuration Snippets.

  • Peak Memory Usage Filter now tracked per container for YARN applications

    Peak container memory usage is now tracked for YARN applications, and a new filter attribute, Used Memory Max, has been added for monitoring YARN applications.

  • Improved Kerberos-Encryption-Type Handling by Cloudera Manager

    Cloudera Manager validates the Kerberos encryption type as it is being entered into the Cloudera Manager Admin Console, and displays an error message if the type is not a valid MIT or Microsoft Active Directory (Kerberos) encryption type. Administrators can disable the feature when necessary—for example, if new encryption types added to Kerberos are ahead of the encryption types supported by Cloudera Manager (invalid encryption types fail, regardless of warning message display).

  • Enabling SPNEGO authentication for Hue

    Enabling the Hue Authentication Backend property (for SPNEGO) now automatically adds all necessary environments and Kerberos credentials. Previously, you needed to follow this procedure: Enabling SPNEGO as an Authentication Backend for Hue.

  • New and Changed Configuration
    • Auto-configuration of HBase Thrift Authentication when Kerberos is enabled

      When Kerberos is enabled for the cluster, the value of the HBase configuration parameter HBase Thrift Authentication is automatically set to auth-conf. Clusters that already have Kerberos enabled do not have this setting changed when you upgrade Cloudera Manager; the change applies only when you enable Kerberos in the future.

    • New HDFS NameNode configuration property for deleting the trash

      A new HDFS NameNode property, Filesystem Trash Checkpoint Interval (fs.trash.checkpoint.interval), has been introduced with a default value of 1 hour. This property causes the NameNode to better respect and more accurately enforce the configured HDFS trash deletion interval set with the Filesystem Trash Interval property (fs.trash.interval).

      Without this property, the checkpoint interval matched the deletion interval, so many files in the HDFS trash were deleted only after twice the desired trash deletion interval had elapsed. If you want the older implicit behavior of retaining trash files for a longer time, consider raising the value of the Filesystem Trash Interval property. When you upgrade to this version of Cloudera Manager or change this property, all HDFS NameNode role instances are marked stale, and you must restart the HDFS NameNode role instances and their dependent services for the change to take effect.
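
The effect of the checkpoint interval on retention can be illustrated with a simplified model: a deleted file waits up to one checkpoint interval before it is included in a checkpoint, and the checkpoint is then kept for the trash interval. The function below is a sketch under that assumption, not the NameNode implementation, and the one-day trash interval is an example value.

```python
def worst_case_trash_retention(trash_interval_min, checkpoint_interval_min):
    """Approximate upper bound, in minutes, on how long a file stays in trash."""
    # Wait for the next checkpoint, then wait for that checkpoint to expire.
    return checkpoint_interval_min + trash_interval_min


one_day = 24 * 60
# Checkpoint interval equal to the trash interval: up to twice the interval.
print(worst_case_trash_retention(one_day, one_day))   # prints: 2880
# Checkpoint interval of 1 hour (the new default): one day plus one hour.
print(worst_case_trash_retention(one_day, 60))        # prints: 1500
```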

    • New Auto Logout Timeout property for Hue

      A new configuration property has been added for the Hue service. The Auto Logout Timeout property controls how long the Hue browser can remain idle before automatically logging out the user. Set the property to -1 to disable automatic logout. To configure the property, go to the Hue service, select the Configuration tab, and search for the property.

    • New performance tuning properties for Key Management Server (KMS)
      The following new properties have been added for tuning the performance of the KMS service:
      • KMS Accept Count
      • KMS Handler Protocol
      • KMS Acceptor Thread Count
  • New API endpoint for refreshing parcel information
    A new REST API endpoint has been added to refresh parcel information from both local and remote repositories. The endpoint URL is:
    /api/v16/cm/commands/refreshParcelRepos

    See the Cloudera Manager API documentation.
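
A call to this endpoint might be prepared as in the following sketch. Only the endpoint path comes from the text above; the host name, the port, the credentials, the use of POST, and HTTP basic authentication are assumptions that you should confirm against the API documentation. The sketch builds the request without sending it.

```python
import base64
import urllib.request


def build_refresh_parcel_repos_request(base_url, username, password):
    """Build (but do not send) a request to the refreshParcelRepos endpoint."""
    url = base_url.rstrip("/") + "/api/v16/cm/commands/refreshParcelRepos"
    # Assumption: the API accepts HTTP basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",  # assumption: commands are issued with POST
        headers={"Authorization": f"Basic {token}"},
    )


# Hypothetical host and credentials, for illustration only.
request = build_refresh_parcel_repos_request(
    "http://cm-host.example.com:7180", "admin", "admin"
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen from a host that can reach the Cloudera Manager Server.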

  • New Metrics and Health Tests for Service Monitor and Host Monitor metric collection

    A new metric, mgmt_aggregation_run_duration, has been added to the Service Monitor and Host Monitor metrics to indicate how much time it takes to store the metrics collected in the last minute. This metric can be used to determine if more heap or non-heap memory is needed for these roles.

    See Cloudera Manager Metrics.

    New health tests, Host Monitor Metrics Aggregation Run Duration Test and Service Monitor Metrics Aggregation Run Duration Test, have also been added to detect potential resource configuration issues with the Service Monitor and Host Monitor.

    See Cloudera Manager Health Tests.

  • New validation for YARN NodeManager log directory

    Cloudera Manager now validates whether all YARN NodeManagers are storing logs in the same distributed file system directory so that no logs are missing from Job History Server. If NodeManagers have different configuration values, there will be a configuration error after upgrading Cloudera Manager to 5.11.

  • New Dynamic Resource Pool option

    Configuration of Nested User Pools (except existingSecondaryGroup) now includes a Create pool if it does not exist checkbox to indicate whether to create a sub-pool.

What's New in Cloudera Manager 5.10.2

Cloudera Manager 5.10.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.10.2.

What's New in Cloudera Manager 5.10.1

Cloudera Manager 5.10.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.10.1.

What's New in Cloudera Manager 5.10.0

  • Backup and Disaster Recovery
    • Change of default behavior for Impala metadata setting in Hive replications

      There are now three options for configuring replication of Impala metadata for Hive replication jobs:
      • No – Impala metadata is not replicated.
      • Yes – Impala metadata is replicated.
      • Auto – Cloudera Manager decides the value for this option based on the version of CDH in your cluster.

      In Cloudera Manager 5.9, when creating a Hive replication schedule, the Replicate Impala Metadata option was not selected by default (false). In Cloudera Manager 5.10, the value defaults to Auto, so that Cloudera Manager decides what the value should be.

    • Replication of Impala Cached Column Statistics

      Table and partition-level column statistics stored in the Hive metastore and used by Impala are now replicated during Hive Replication. This is supported between a replication source with Cloudera Manager running version 5.10 or higher and a replication target running Cloudera Manager 5.10 or higher. Because this change replicates more information, the same schedule may take more time to complete if column statistics are present.

    • Performance summaries of HDFS and Hive replication jobs

      You can download full performance reports for HDFS and Hive replications from the Replication Schedules page and from the Replication History page. You can also filter the output by error, deleted, or skipped status. See Monitoring the Performance of Hive/Impala Replications and Monitoring the Performance of HDFS Replications.

    • HDFS replication performance monitoring now reports an initial sample

      HDFS performance reports now display early samples during the start of a replication job to give users earlier information about the progress of the job.

    • Scheduler Pool field for replications moved to Resources tab

      When creating replication schedules, the Scheduler Pool input field has been moved to the Resources tab.

  • Amazon S3
    • Amazon S3 Object Store

      You can configure auxiliary storage in your cluster using Amazon Simple Storage Service (S3). Client applications such as YARN, MapReduce, or Spark can access data stored in Amazon S3 using Amazon S3 URLs and credentials. See Configuring the Amazon S3 Connector.

    • Amazon S3 Server-side Encryption (SSE-S3)

      Clusters that use Amazon S3 storage can encrypt data using Amazon server-side encryption (SSE-S3). Use Cloudera Manager Admin Console to configure the cluster to use this new feature as detailed in How to Configure Encryption for Amazon S3.

    • Auto-configuration of Hue with AWS credentials (S3 access)

      Cloudera Manager now automatically configures S3 credentials for Hue in the hue.ini file.

  • Resource Management
    • maxResources per user now configurable in YARN Dynamic Resource Pools

      YARN Dynamic Resource Pools now support default capacity limits which automatically apply to all child pools of a resource pool. Using these limits, you can now effectively control the YARN resources available to any user or group in the cluster by configuring these settings on a parent pool and then using placement rules to auto-create child pools per user or group.

  • Key Management Service (KMS)
    • Java Option configuration added for KMS and Key Trustee KMS

      Additional Java Configuration Options can now be set for the KMS and Key Trustee KMS using Advanced Configuration Snippets. This allows customized Java configuration options for the KMS process.

    • Revised ACLs for KMS

      The ACL set generated by the KMS installation wizard has been updated to implement the recommended secure ACL policies for common key names. For more information, see Configuring KMS Access Control Lists Using Cloudera Manager.

  • Security
    • Redaction of stdout and stderr logs

      Redaction logic now applies to the stdout and stderr logs of a service, so that sensitive information is redacted from these files.

    • Encryption of Cloudera Manager database password

      By default the Cloudera Manager database password is stored as clear text in the /etc/cloudera-scm-server/db.properties file. You can now specify a program to execute whose standard output is the password. If com.cloudera.cmf.db.password is not found in /etc/cloudera-scm-server/db.properties, then the property com.cloudera.cmf.db.password_script, if it exists, is used. The value of this property is executed as a program, and the value returned to stdout is used as the password.
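
The lookup order described above can be sketched as follows. The two property names come from the text; the function name, the dictionary input, the argument splitting, and the stripping of trailing whitespace from the script output are assumptions made for illustration.

```python
import shlex
import subprocess


def resolve_db_password(properties):
    """Resolve the database password from parsed db.properties values."""
    password = properties.get("com.cloudera.cmf.db.password")
    if password is not None:
        return password
    script = properties.get("com.cloudera.cmf.db.password_script")
    if script is None:
        return None
    # Run the configured program and use its standard output as the password.
    result = subprocess.run(
        shlex.split(script), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()  # assumption: the trailing newline is dropped


print(resolve_db_password({"com.cloudera.cmf.db.password": "example-password"}))
```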

  • Monitoring
    • Collection of Metrics can be disabled for specified roles

      A new configuration has been introduced at the role level to disable monitoring by the Cloudera Manager Agent for each individual role. By default, monitoring is enabled for all roles. Once disabled, you must restart the role to make the change take effect. This can help in scenarios where the Cloudera Manager Service Monitor is experiencing performance issues and does not have enough resources. See Disabling Metrics for Specific Roles.

    • New chart displays number of monitored entities

      A new chart on the Cloudera Management Service status page shows the number of entities monitored by the Service Monitor and Host Monitor.

    • Default timeouts configurable for Service Monitor, Host Monitor, Reports Manager, Events Server, and Activity Monitor

      The default timeout is now configurable for Service Monitor, Host Monitor, Reports Manager, Events Server, and Activity Monitor to register themselves to a cluster. Users can change the timeout by increasing the values of the Descriptor Fetch Max Tries property (the default value is 5) and the Descriptor Fetch Tries Interval property (the default value is 2).
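
How the two properties combine is not described in detail here. The following generic retry sketch shows one plausible reading, in which the role tries up to Descriptor Fetch Max Tries times and waits Descriptor Fetch Tries Interval between tries. The function name, the unit of the interval (seconds), and the error handling are assumptions.

```python
import time


def fetch_descriptor(fetch, max_tries=5, tries_interval=2, sleep=time.sleep):
    """Call fetch() until it succeeds or max_tries attempts have failed."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception as error:
            last_error = error
            # Wait between attempts, but not after the final one.
            if attempt < max_tries:
                sleep(tries_interval)
    raise TimeoutError(f"descriptor not fetched after {max_tries} tries") from last_error
```

Under this reading, raising either value lengthens the total time a role waits before it gives up.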

    • New Health Test for Service Monitor heap usage.

      A health test has been added to monitor heap usage of the Service Monitor. This test can be useful when diagnosing out of memory errors in the Service Monitor.

  • Users can send feedback on Cloudera Manager

    You can go to Support > Feedback to send feedback to Cloudera about Cloudera Manager.

  • Suppression of warning about embedded database for Cloudera Manager.

    When Cloudera Manager is configured to use the embedded PostgreSQL database, it displays a banner warning that the embedded PostgreSQL database is not recommended for production environments. You can now suppress this banner: go to Administration > Settings > Enable Embedded Database Check.

  • New HDFS balancer related configuration options

    The HDFS balancer can now be configured to specify which hosts are included and excluded or which hosts are used as sources for transferring replicas. Additional properties for tuning the performance of the balancer can now also be configured for CDH 5.10.0 and higher.

  • Removing service dependencies

    Previously, when a user tried to delete a service that another service depends on, Cloudera Manager displayed a dialog box explaining that the dependent service should be deleted. There is now a link that takes the user to the configuration page where they can remove this dependency.

  • HBase topology files now deployed in client configurations

    When the HBase property Enable Replication To Secondary Region Replicas is enabled, the topology.py and topology.map files are deployed with the client configuration and the Topology Script File Name property is set to the deployed topology.py path.

  • The length of Impala queries retained by the Service Monitor is now configurable.

    The maximum length of the Impala queries retained by the Service Monitor is now configurable. The default limit is 10k characters.

What's New in Cloudera Manager 5.9.3

Cloudera Manager 5.9.3 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.3.

What's New in Cloudera Manager 5.9.2

Cloudera Manager 5.9.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.2.

What's New in Cloudera Manager 5.9.1

Cloudera Manager 5.9.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.1.

What's New in Cloudera Manager 5.9.0

  • Creating Virtual Machine Images

    Documentation has been added with procedures to create virtual images of Cloudera Manager and cluster hosts. See Creating Virtual Images of Cluster Hosts.

  • Security
    • External/Cloud account configuration in Cloudera Manager

      Account configuration for access to Amazon Web Services is now available through the centralized UI menu External Accounts.

    • Key Trustee Server rolling restart

      Key Trustee Server now supports rolling restart.

  • Backup and Disaster Recovery
    • You can now replicate HDFS files and Hive data to and from an Amazon S3 instance. See HDFS Replication To and From Amazon S3 and Hive/Impala Replication To and From Amazon S3.
    • There are some new tuning options to improve performance of HDFS replication. See HDFS Replication Tuning.
    • You can now download performance data about HDFS replication jobs from the Replication Schedules and Replication History pages. See Monitoring the Performance of HDFS Replications.
    • Hive replication now stores Hive UDFs in the Hive metastore. See Replication of Impala and Hive User Defined Functions (UDFs).
    • The user interface for creating replication schedules has been reorganized to present the configuration options on three tabs: General, Resources, and Advanced.
    • Uncheck Replicate Impala Metadata by default

      Previously, when you created a Hive replication schedule, the option Replicate Impala Metadata was checked (true) by default. In Cloudera Manager 5.9 and higher, the value is unchecked (false) by default.

    • YARN BDR enhancement

      YARN jobs now include the BDR schedule ID that launched the job so you can connect logs with existing schedules, if multiple schedules exist.

  • Resource Management
    • Custom Cluster Utilization Reports

      Documentation has been added to create custom Cluster Utilization reports that you can export data from. See Creating a Custom Cluster Utilization Report.

    • New settings for continuous scheduling

      For new installs, the default values of two configuration properties have changed: yarn_scheduler_fair_continuous_scheduling_enabled is set to false, and resourcemanager_fair_scheduler_assign_multiple is set to true. Existing settings are preserved when you upgrade from a lower version.

    • YARN historical reports by user show pool-user entity

      When Cloudera Manager manages multiple clusters, there is no per user tracking for historical applications and queries across clusters. Instead, Historical Applications by User and Historical Queries by User show applications and queries per user and pool. (A pool is associated with a specific cluster.)

    • Directory Usage Report export capability

      Directory usage reports can be exported as a CSV file.

  • Cloudera Manager Admin Console User Interface
    • Service colors

      A new set of colors is used to represent each kind of service.

    • Move the table sorting icon to the right

      The table sorting icon now appears consistently on the right hand side of each column.

    • Improved Configuration Diff Display

      Changes displayed in the configuration history page are much more user friendly. For a large section of changed text, Cloudera Manager generates a diff between the old and the new and displays the diff.

      When a user changes only the password, Cloudera Manager does not show the delta: both the old and the new passwords are masked out before the comparison is performed.

    • Move actions menu to the top header

      The actions menu now appears next to the entity title.

    • Move Federation and High Availability to a separate page

      The Federation and High Availability sections used to appear on the HDFS Instances page of an HDFS service. They have been moved to a new page called Federation and High Availability. There is a link from the existing Instances page to this new page.

    • Remove repeated heading below the second level navigation

      Subtitles below the second level navigation tabs are removed because they repeated the content in the tabs.

    • Move maintenance mode and badges to the title area

      Maintenance mode and staleness badges now appear next to the title of the entity.

    • Express wizard allows you to add Kafka

      Kafka is now listed in the custom services when you click the Add Cluster button.

  • Cloudera Manager API
    • Add update_user to Python API client

      Added the update_user() method to the Python API client api_client.py.

    • Expose API endpoint to add a specific path

      New API endpoints have been added that allow users to add, list, and remove Watched Directories in the HDFS service.

  • Logging
    • Include host in log file name

      Kafka log4j log files now include the host name in the format kafka-broker-${host}.log. Similarly, MirrorMaker logs now include the host name in the format kafka-mirrormaker-${host}.log. Because of the log file name change, after you upgrade, Cloudera Manager no longer recognizes your old log files in log search, though they are still present on disk.

    • Configuration changes to Cloudera Manager audit log

      Cloudera Manager now provides History and Rollback support for the Cloudera Manager settings (Administration > Settings). This helps you track the changes made by an administrator so that Cloudera Support can provide better service when certain Cloudera Manager administrative settings are modified.

  • Diagnostic Bundles
    • Show the Diagnostic Bundle Redaction Policy using the redaction config

      You can specify what information should be redacted in the diagnostic bundle in the UI using Administration > Settings > Redaction Parameters for Diagnostic Bundles.

  • Upgrade
    • Report that a simple restart was performed if rolling restart could not be performed

      Cloudera Manager now informs you when a simple restart is performed on a service instead of a rolling restart because rolling restart is not available.

  • Oozie
    • Provide dump / load functionality for Oozie DB

      The Actions menu in the Oozie service has two new commands, Dump Database and Load Database. These commands make it easier to migrate an Oozie database to another database supported by Oozie. The Dump Database command exports Oozie's database to a file (configurable by Database Dump File setting). Load Database loads the file into a database.

    • Install Oozie ShareLib permissions change

      The Install Oozie ShareLib command now assigns correct permissions to the uploaded libraries. This prevents a custom umask setting from breaking Oozie workflows.

  • Configuration Changes
    • Solr zkClientTimeout option

      Added the zkClientTimeout parameter for ZooKeeper.

    • Add JHIST compression as a configuration option

      Added a new option for setting the file format used by an ApplicationMaster when generating the .jhist file.

    • Enable heap dump by default for all daemons

      Starting in version 5.9, when you configure roles that are JVM based, the Dump Heap When Out of Memory configuration parameter defaults to true. An upgrade from a pre-5.9 version maintains your pre-5.9 settings.

    • Cloudera Manager support for client-side YARN graceful decommissioning

      Adds the ability to perform a graceful decommission on YARN NodeManager roles, whereby the NodeManager is not assigned new containers and waits for any currently running applications to finish before being decommissioned, unless a timeout occurs. You can configure the timeout using the Node Manager Graceful Decommission Timeout configuration property in the YARN service. The default behavior has not changed and continues to be a non-graceful decommission. Affects Cloudera Manager 5.9.0 and higher, and CDH 5.9.0 and higher.

    • Deploy Client Configuration command details page now shows stdout/stderr

      stdout and stderr log links are now shown in the UI when there is a failure while deploying client configurations.

    • Make EXTRA_RATIO configurable for Headlamp indexing

      Added the configuration parameter Extra Space Ratio for Indexing to Reports Manager. You can use the parameter to speed up indexing by allocating additional memory.

    • Configure HBase Indexer to wait longer for ZooKeeper to come up

      The default amount of time that HBase Indexer roles attempt to connect to ZooKeeper has been increased from 30 to 60 seconds. This default can be adjusted by setting a new Cloudera Manager configuration parameter, HBase Indexer ZooKeeper Session Timeout.

  • Embedded database mode improvements

    In version 5.9 and higher, Cloudera Manager can clearly identify whether or not a customer is using the embedded PostgreSQL database. Cloudera does not recommend the embedded database for production use, and requests that customers deploy production systems using an external database. The diagnostic bundles now contain information about whether or not a customer is using the embedded PostgreSQL database. Support can then reach out to customers accordingly.

    If Cloudera Manager is configured to use the embedded PostgreSQL database, a yellow banner appears in the UI recommending that you upgrade to a supported external database.

  • Fix CatalogServiceClient to handle TLS connections to catalogd for UDF replication

    When Impala uses TLS/SSL, Cloudera Manager now supports TLS connections to the Catalog Server. You can enable replication of Impala UDFs and metadata (in Hive replication) in Cloudera Manager 5.9 and higher.

  • Do not show steps that are unreachable (skipped)

    When running wizards from the Cloudera Manager Admin Console that add a cluster, add a service, perform an upgrade, and other tasks, steps do not display when they are not reachable or do not apply to the current configuration.

  • Improve Cloudera Manager provisioning performance on AWS

    Added support for resetting the Cloudera Manager GUID/UUID. This is accomplished by checking the UUID file.

    If Cloudera Manager finds the UUID file (/etc/cloudera-scm-server/uuid) and the UUID is different than the GUID in the cm_version table, it updates the GUID in the cm_version table with the contents of the UUID file and removes the UUID file.
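
The UUID-file check described above can be illustrated with a short Python sketch. This is only a model of the stated behavior, not Cloudera Manager's implementation: the function name reconcile_guid, the dict standing in for the cm_version table, and its "guid" key are all assumptions made for illustration.

```python
import os
import tempfile


def reconcile_guid(uuid_path, cm_version_row):
    """Model the UUID-file check described in the release note.

    cm_version_row is a dict standing in for the cm_version table row;
    the "guid" key is a hypothetical column name. Returns True when the
    stored GUID was replaced with the contents of the UUID file.
    """
    if not os.path.exists(uuid_path):
        return False  # no UUID file: keep the stored GUID

    with open(uuid_path) as f:
        file_uuid = f.read().strip()

    if file_uuid == cm_version_row["guid"]:
        return False  # values already match: nothing to update

    # Values differ: adopt the UUID from the file, then remove the file.
    cm_version_row["guid"] = file_uuid
    os.remove(uuid_path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "uuid")  # stand-in for /etc/cloudera-scm-server/uuid
        with open(path, "w") as f:
            f.write("new-guid\n")
        row = {"guid": "old-guid"}
        print(reconcile_guid(path, row), row["guid"], os.path.exists(path))
        # prints: True new-guid False
```

In this model the file is consumed only when its value differs from the stored GUID, which mirrors the wording of the note.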

What's New in Cloudera Manager 5.8.5

Cloudera Manager 5.8.5 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.5.

What's New in Cloudera Manager 5.8.4

Cloudera Manager 5.8.4 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.4.

What's New in Cloudera Manager 5.8.3

Cloudera Manager 5.8.3 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.3.

What's New in Cloudera Manager 5.8.2

Cloudera Manager 5.8.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.2.

What's New in Cloudera Manager 5.8.1

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.8.1.

What's New in Cloudera Manager 5.8.0

  • Operating Systems - Support for Debian 8.2.
  • Resource management and utilization - Added support for nesting dynamic resource pools within a named pool at runtime.
  • Backup and Disaster Recovery
    • The Replication Schedules page now has a search function for finding scheduled replications.
  • You can now specify a start and end time for the events that are included in manually-triggered diagnostic bundles. See Manually Triggering Collection and Transfer of Diagnostic Data to Cloudera.
  • Impala
    • Impala adds a new configuration option, Use HDFS Rules to Map Kerberos Principals to Short Names. Enabling this option makes Impala pick up the hadoop.security.auth_to_local configuration from the HDFS configuration and use it for Kerberos principal-to-short-name translation. This applies only to Cloudera Manager 5.8.0 and higher and CDH 5.8.0 and higher, and affects only deployments where Impala is set up to use Kerberos as the authentication mechanism. It defaults to false, to preserve the behavior from earlier CDH versions. This has no impact on upgrade.

    • Enable Impala Admission Control and Enable Dynamic Resource Pools are now enabled by default. Customized configuration values are preserved during the upgrade.

    • Impala Admission Control now supports a global method for editing the Access Control List.

  • EMC Isilon
    • Kerberos is now fully supported for replications between clusters using Isilon storage. You must configure a custom principal.
  • Security
    • Active Directory KDC
    • Redaction: In the Cloudera Manager Admin Console, Advanced Configuration Snippet parameters will now be redacted to block sensitive information such as passwords or secret keys.
    • Sentry
      • Cloudera Search adds support for storing permissions in the Sentry service. You can enable storing permissions in the Sentry service by Enabling the Sentry Service for Solr. If you have already configured Sentry's policy file-based approach, you can migrate existing authorization settings as described in Migrating from Sentry Policy Files to the Sentry Service. solrctl has been extended to support:
        • Migrating existing policy files to the Sentry service
        • Managing permissions in the Sentry service
      • Sentry supports data stored on Amazon S3 and can secure URIs with an S3 schema.
  • YARN
    • YARN Allowed System Users now includes hbase by default. This is helpful when running certain tools for HBase that need to execute MapReduce jobs.

What's New in Cloudera Manager 5.7.6

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.6.

What's New in Cloudera Manager 5.7.2

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.2.

What's New in Cloudera Manager 5.7.1

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.1.

What's New in Cloudera Manager 5.7.0

  • Operating Systems - Support for:
    • RHEL/CentOS 6.6, 6.7, 7.1, and 7.2
    • Oracle Enterprise Linux (OEL) 7.1 and 7.2
    • SUSE Linux Enterprise Server (SLES) 11 with Service Packs 2, 3, 4
    • Debian: Wheezy 7.0, 7.1, and 7.8
  • Resource management and utilization
    • Simplified and expanded resource management. The screens for YARN and Impala dynamic resource pools are now managed separately. See Dynamic Resource Pools.
    • Resource pools now support the allowPreemptionFrom, minSharePreemptionTimeout, and fairSharePreemptionTimeout attributes. See Enabling and Disabling Fair Scheduler Preemption and Configuring the Fair Scheduler.
    • Cluster utilization reports track usage of resources allocated using dynamic resource pools. See Cluster Utilization Reports.
    • The new Directory Usage Report now shows aggregated usage information, including quotas and file sizes, which are sortable. You can also perform multiple actions on filesystem objects. See Directory Usage Report.
    • Two new predicates have been added to the tsquery language: day in and hour in. These allow you to limit streams to specified days of the week and specified hours of each day, respectively. See Filtering by Day of Week or Hour of Day.
  • Extensibility - For more information, see Cloudera Manager Extensions.
    • Parcels are typed according to the OS version. The parcel extension indicates the version. The library for developing external parcels now supports an extension for RHEL 7.
    • A new environment variable, ZK_PRINCIPAL_NAME, is now defined for CSD processes when ZooKeeper is Kerberized and has a custom principal.
    • A new flag, jvmBased, is now available to CSD authors to indicate that a CSD role is JVM-based. This flag enables a set of JVM-related features in Cloudera Manager—for example, the ability to automatically generate a heap dump when an Out Of Memory error occurs.
  • API
    • Cloudera Manager now attempts to gracefully handle overlapping API calls.
    • All distributed filesystem services (such as HDFS or Isilon) installed in a cluster can now be enumerated using the API.
    • You can export the complete configuration of a CDH cluster managed by Cloudera Manager as a template, modify the template, and import the template to create a new cluster. See Creating a CDH Cluster Using a Cloudera Manager Template.
  • The advanced configuration snippet editor now allows you to edit properties as name/value pairs. This is the default; however, you can also choose to edit the snippet as XML.
  • HBase now includes metrics and charts for replication. These charts are available in the Chart Library for each RegionServer.
  • When you click the Role Log link when viewing a role, the log is opened at the current timestamp, rather than the top of the log file. This enables you to see the relevant log messages when investigating an event that occurred recently.
  • A new tsquery function called counter_delta has been added to accurately compute the difference between consecutive data points for counter metrics.
  • The distcp utility now supports setting records per chunk. Set the distcp.dynamic.recordsPerChunk property in an advanced configuration snippet to specify the number of records (paths) in each chunk. When a value is set for distcp.dynamic.recordsPerChunk, other related settings, such as the maximum number of chunks tolerable, the ideal number of chunks, and the split ratio, are ignored.
  • A warning is shown when upgrading with incompatible versions of Kafka and CDH. The Kafka client libraries bundled in CDH cannot communicate with an older Kafka server.
  • You can override the sudo commands that Cloudera Manager agent uses by redirecting the sudo commands to a script that you write to allow or disallow certain actions. See Overriding the sudo Command.
  • Hive
    • Hive on Spark is now supported.
    • The default execution engine for Hive can now be configured, which makes it easy to run all Hive jobs on Spark.
    • HiveServer2 now has a Web UI. See Using HiveServer2 Web UI in CDH.
    • Hive and HDFS replication source/target listings now work with Isilon.
    • A dialog box now displays when scheduling Hive replications, reminding you to take snapshots of the Hive warehouse directory. See Hive/Impala Replication with Snapshots.
    • The Direct SQL option is now enabled in Hive Metastore.
    • When upgrading to CDH 5.7, if Hive is configured to use YARN, all Hive on Spark parameters are automatically tuned to recommended values. If Hive on Spark was previously tuned, this is skipped.
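
The counter_delta tsquery function listed above computes the difference between consecutive data points for counter metrics. The Python sketch below illustrates that idea only. It is not how Cloudera Manager implements the function; the name consecutive_deltas and the (timestamp, value) tuple layout are assumptions, and the sketch does not attempt to handle counter resets, which the note does not describe.

```python
def consecutive_deltas(points):
    """Return (timestamp, delta) pairs for consecutive counter samples.

    points is a list of (timestamp, value) tuples sorted by timestamp.
    Each delta is the current value minus the previous value, so the
    result has one fewer entry than the input.
    """
    return [
        (t_cur, v_cur - v_prev)
        for (_, v_prev), (t_cur, v_cur) in zip(points, points[1:])
    ]


if __name__ == "__main__":
    samples = [(0, 100), (60, 130), (120, 190)]
    print(consecutive_deltas(samples))
    # prints: [(60, 30), (120, 60)]
```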

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.0.

What's New in Cloudera Manager 5.6.1

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.6.1.

What's New in Cloudera Manager 5.6.0

What's New in Cloudera Manager 5.5.4

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.5.4.

What's New in Cloudera Manager 5.5.3

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.5.3.

What's New in Cloudera Manager 5.5.2

  • New Impala flags added for web server certificate files and passwords. This adds support for the --webserver_private_key_file and --webserver_private_key_password_cmd flags for the Impala Daemon, the Impala Catalog Server, and the Impala StateStore roles.

A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.5.2.

What's New in Cloudera Manager 5.5.1

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.5.1.

What's New in Cloudera Manager 5.5.0

  • Operating Systems - Support for RHEL/CentOS 6.6 (in SELinux mode), 6.7, and 7.1, and Oracle Enterprise Linux 7.1.
  • Databases - Supports MariaDB 5.5, Oracle 12c, and PostgreSQL 9.4.
  • Selective service restart after activating parcels is supported.
  • Retrying upgrade actions is supported. If a cluster upgrade command fails while in progress, you can retry a command after fixing the cause of failure. On retry, the command restarts from the command step where it failed.
  • The command details page for running and recent commands has been redesigned for usability and scalability.
  • Instead of serially starting all services for the first time, services that are not dependent are started in parallel. This decreases the time required to start services for the first time after creating a cluster.
  • Performance has improved for service startup, client configuration deployment, and calculation of stale configurations.
  • Suppression of notifications

    Suppression can be useful if a warning does not apply to your deployment and you no longer want to see the notification. Suppressed warnings are still retained by Cloudera Manager, and you can unsuppress the warnings at any time.

  • Multi Cloudera Manager Dashboard - A special mode of Cloudera Manager that enables you to view monitoring data aggregated from multiple Cloudera Manager instances that manage one or more CDH clusters. See Monitoring Multiple CDH Deployments Using the Multi Cloudera Manager Dashboard.
  • You can decommission roles when services are completely stopped. This allows you to decommission hosts during cluster downtime.
  • You can disable collection of certain domain metrics—for example, for HBase RegionServers, Kafka Brokers, and others—through new settings in the host advanced configuration snippet. This is useful in certain support situations and should only be done under the direction of Cloudera Support.
  • You can configure which aggregate metrics are automatically generated. This advanced feature can be useful in certain situations to impact the monitoring workload, allowing unused or less-important aggregate metrics to be skipped. This may result in improved performance and the ability to handle larger monitoring workloads, or to retain data for a larger workload for longer. Cloudera recommends using this only under the direction of Cloudera Support.
  • Alert Publisher can be configured to pass alert events to a user-defined script. Use this for integrating with other alerting systems or for custom logic (for example, to send some alerts to some people and others to other people).
  • Agent minor version mismatches (5.4 to 5.5) now cause bad host health. Maintenance version mismatches (for example, 5.4.x to 5.4.y) still cause concerning host health.
  • Cloudera Manager indicates if the Java version in use is too old.
  • Cloudera Manager indicates if the supervisor component of the Agent needs to be restarted after an upgrade.
  • Full and User Administrators can view active user sessions. See Viewing User Sessions.
  • Full Administrators and Auditors can audit failed and successful logins.
  • Multiple user session logins can be disallowed.
  • You can configure external authentication so that local administrator emergency access is disabled. This means that no local accounts can log in under any circumstances, including when the external system is not functioning.
  • You can turn on authentication for the URLs for downloading client configuration zip files. Previously, authentication was never required.
  • Passwords are no longer accessible in cleartext through the Cloudera Manager UI or in the configuration files stored on disk. See Cloudera Manager and Passwords. There are some exceptions; see Known Issues and Workarounds in Cloudera Manager 5.
  • HBase
    • Use a configuration option in HBase to skip region reload during rolling restart and rolling upgrade, to increase the speed of the operations.
    • HBase rolling restart performance can be improved by increasing the number of Region Mover Threads. If the value of this property is 1, it can lower rolling restart speed. The Admin Console now displays this information and, if the value is 1, advises increasing it.
    • HBase Thrift Server and Rest Server support TLS/SSL.
  • HDFS
  • Hive
    • Hive can use TLS/SSL and Kerberos at the same time.
    • When Hive is configured to use TLS/SSL, Hue is automatically configured to use that protocol when communicating with Hive. Similarly, when Impala is configured to use TLS/SSL, Hue is automatically configured to use that protocol when communicating with Impala.
    • HiveServer2 supports a timeout value for idle sessions and operations. By default, it times out client sessions after a week and idle operations after three days. This helps alleviate problems with long-running sessions when using Hue.
    • Cloudera Manager collects and displays various operational metrics for Hive.
  • Hue
    • Hue supports a Load Balancer role using HTTPD as a load balancer.
    • You can configure certificates trusted by Hue using the TLS/SSL Truststore configuration. This replaces the REQUESTS_CA_BUNDLE advanced configuration snippet entry.
    • You can specify a password that protects the Hue private key file.
    • Cloudera Manager collects and displays various operational metrics for Hue. New health tests have been added for Hue as well.
  • Impala supports TLS/SSL internally between the StateStore and the Catalog Server roles as well as Impala Daemon.
  • Kafka
    • Kafka supports rolling restart.
    • Kafka displays additional broker metrics.
    • Kafka exposes additional commonly configured parameters.
    • Existing Kafka parameter definitions have updated descriptions, default values, and validation settings.
    • The Kafka broker instance list now shows which broker is the active controller.
  • Key Trustee
    • The Key Trustee Server CSD is included in Cloudera Manager. Manual installation of the Key Trustee Server CSD is not required.
    • A Key Administrator role in Cloudera Manager is used for configuring HDFS Data at Rest Encryption. Only a Key Administrator and a Full Administrator can make configuration changes to Java Keystore KMS, Key Trustee KMS, and Key Trustee Server. Configuring HDFS to use Data at Rest Encryption is also limited to the Key Administrator and Full Administrator roles. This allows organizations to keep Key Administrators and Cluster Administrators separate, which is a security best practice.
    • When running Key Trustee KMS in a highly available configuration, Cloudera Manager can automatically generate the load balancer URL.
  • Sentry
    • Sentry introduces column-level access control for tables in Hive and Impala. Previously, Sentry supported privilege granularity only at the table level. You can now assign the SELECT privilege on a subset of columns in a table. See Hive SQL Syntax for Use with Sentry.
    • Sentry supports Kerberos authentication for the Sentry web server. See Using the Sentry Web Server.
  • Solr
    • Solr can be configured with a load balancer in a secure environment.
    • There is a new Solr Max Connector Threads property for Solr Server in CDH 5.1.0 and higher.
    • Solr supports LDAP/AD authentication.
  • Backup and Disaster Recovery
    • The user interface for scheduling and reviewing replications and snapshots has been improved. You can now view the history of replication jobs and subtasks more easily. See Viewing Replication History.
    • When specifying an HDFS replication job, you can apply exclusion filters to exclude specific files or directories. See Configuring Replication of HDFS Data.
    • You can download or send to Cloudera Support a diagnostic bundle to troubleshoot replication jobs. Bundles include logs of the replication run. See Viewing Replication Schedules.
    • The performance of the file-listing phase of a replication job has been improved.
    • The performance of the initialization and running phase has been improved.
    • The following advanced configuration snippets for configuring replications have been added:
      • HDFS Replication Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh
      • Hive Replication Advanced Configuration Snippet (Safety Valve) for hive-site.xml
      • HDFS Replication Advanced Configuration Snippet (Safety Valve) for yarn-site.xml
      • HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml
    • Snapshot properties for HBase such as thread pool size can be configured in the HBase Client Advanced Configuration Snippet (Safety Valve) for hbase-site.xml property.
    • Hive partitions are chunked during export and import to avoid message size limitations.
    • Hive replications validate metadata on the destination Hive Metastore before copying HDFS data from the source to avoid copying errors during replication.
    • The use of snapshots to improve replications is documented. See Using Snapshots with Replication.
    • The effect of network latency on replications is documented. See Network Latency and Replication.
    • Scheduled snapshots can be disabled and re-enabled.
    • API improvements:
      • Explicit support for pausing snapshot policies
      • Failed file listing
      • Collection of diagnostic bundles for replication schedules and history
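
This release notes that a failed cluster upgrade command can be retried and that, on retry, the command restarts from the step where it failed. The following Python sketch shows one way such resume-from-failure behavior can work in general. It is an illustration under stated assumptions, not Cloudera Manager code: run_steps, the (name, callable) step format, and the start_at parameter are all hypothetical.

```python
def run_steps(steps, start_at=0):
    """Run steps in order, beginning at index start_at.

    steps is a list of (name, callable) pairs. Returns the index of the
    first step that raised an exception, or None if all steps succeeded.
    """
    for index in range(start_at, len(steps)):
        _, action = steps[index]
        try:
            action()
        except Exception:
            return index  # remember where to resume
    return None


if __name__ == "__main__":
    log = []
    state = {"fixed": False}

    def flaky():
        # Fails until the simulated root cause is fixed.
        if not state["fixed"]:
            raise RuntimeError("simulated failure")
        log.append("b")

    steps = [
        ("a", lambda: log.append("a")),
        ("b", flaky),
        ("c", lambda: log.append("c")),
    ]
    failed = run_steps(steps)          # stops at index 1
    state["fixed"] = True              # the cause of failure is fixed
    run_steps(steps, start_at=failed)  # resumes at "b"; "a" is not repeated
    print(log)
    # prints: ['a', 'b', 'c']
```

Returning the failed index, rather than raising, keeps the resume point explicit for the caller.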

What's New in Cloudera Manager 5.4.10

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.10.

What's New in Cloudera Manager 5.4.9

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.9.

What's New in Cloudera Manager 5.4.8

New ability to decommission hosts with stopped services

Adds ability to decommission roles when services are completely stopped. This allows users to decommission hosts during cluster downtime.

A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.8.

What's New in Cloudera Manager 5.4.7

New service-level advanced configuration snippets for Solr

The following new properties were added:
  • Solr Service Advanced Configuration Snippet (Safety Valve) for core-site.xml
  • Solr Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml

A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.7.

What's New in Cloudera Manager 5.4.6

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.4.6.

What's New in Cloudera Manager 5.4.5

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.5.

What's New in Cloudera Manager 5.4.3

Rollback for CDH 4 to CDH 5 Upgrades

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.3.

What's New in Cloudera Manager 5.4.1

Hue HA Improvements
  • The Cloudera Manager Express and Add Service wizards allow you to add a Hue service with multiple Hue Server roles. For Kerberized clusters, the Add Service wizard automatically adds a colocated Kerberos Ticket Renewer role for each Hue Server role instance.
  • When Kerberos is enabled, Cloudera Manager now checks to ensure each Hue Server role is colocated with a Kerberos Ticket Renewer role. If you forget to add a Kerberos Ticket Renewer role when adding a new Hue Server role, a configuration error is generated.
Cloudera Manager High Availability

A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.1.

What's New in Cloudera Manager 5.4.0

  • OS - Added support for RHEL 6.6 and CentOS 6.6.
  • Cloudera Manager prevents installing or upgrading to a CDH version that is too new for the Cloudera Manager version. When using parcels, it prevents parcel installation. When using packages, it prevents creating services.
  • Installation and add service wizards now support the Oozie database.
  • New wizard for NameNode, Failover Controller, and JournalNode role migration.
  • The Parcel page has been redesigned to improve layout, performance, and ease of use. A new per-host parcel detail view has been added.
  • Configuration
    • Configuration pages use the new layout by default. The new design is dramatically improved in terms of layout, performance, and ease of use. The existing layout is accessible via the Switch to the classic layout link.
    • New configuration actions:
      • Configuration can now be applied to all clusters as well as for a specific cluster.
      • Several new configuration views have been added to show all non-default values across all clusters and the Cloudera Management Service, as well as differences across all clusters and multiple services of the same type.
      • One-click differences in configuration settings for a specific service across multiple clusters.
  • Support
    • Include a Cloudera support ticket with YARN application support bundles.
    • Reduce the size of support bundles by specifying log data of interest to include in the bundle.
  • HDFS
    • Support for HDFS DataNode hot swap.
    • Option to include replication of extended attributes during HDFS replication. HDFS ACLs will now be replicated along with permissions.
  • Added support for Hive on Spark. For more information, see Running Apache Hive on Spark in CDH.
  • Security
    • Secure impersonation support for the Hue HBase app.
    • Redaction of sensitive data in log files and in SQL query history.
    • Support for custom Kerberos principals.
    • Added commands for regenerating Kerberos keytabs at service and host levels. These commands will clear existing keytabs from affected role instances and then trigger the Generate Credentials command to create new keytabs.
    • Kerberos support for Sqoop 2.
    • Kerberos and TLS/SSL support for Flume Thrift Source and Sink.
    • Solr TLS/SSL support.
    • Navigator Key Trustee Server can be installed and monitored by Cloudera Manager.
    • HBase Indexer integration with Sentry (File-based) for authorization.

What's New in Cloudera Manager 5.3.10

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.10.

What's New in Cloudera Manager 5.3.9

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.9.

What's New in Cloudera Manager 5.3.8

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.8.

What's New in Cloudera Manager 5.3.7

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.3.7.

What's New in Cloudera Manager 5.3.6

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.6.

What's New in Cloudera Manager 5.3.4

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.4.

What's New in Cloudera Manager 5.3.3

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.3.

What's New in Cloudera Manager 5.3.2

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.2.

What's New in Cloudera Manager 5.3.1

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.1.

What's New in Cloudera Manager 5.3.0

  • JDK 1.8 - Cloudera Manager adds support for Oracle JDK 1.8.
  • Single user mode - The Cloudera Manager Agent and all service processes can now be run as a single configured user in environments where running as root is not permitted. See Configuring Single User Mode.
  • CDH upgrade wizard enhanced - The CDH upgrade wizard now supports minor and maintenance version upgrades as well as major version upgrades.
  • Oozie Sharelib - The Oozie Sharelib can be updated without restarting the Oozie service.
  • Read-only users prevented from viewing process logs or environment - Read-only users can no longer view the environment or logs of a process. This is to prevent read-only users from seeing potentially sensitive information.
  • New icons for the KMS and Key Trustee services.
  • Data-at-rest encryption - HDFS encryption implements transparent, end-to-end encryption of data read from and written to HDFS by creating encryption zones. An encryption zone is a directory in HDFS with every file and subdirectory in it encrypted. Use one of the following services to store, manage, and access encryption zone keys:
    • KMS (File) - The Hadoop Key Management Server with a file-based Java keystore; maintains a single copy of keys, using simple password-based protection.
    • KMS (Navigator Key Trustee) - An enterprise-grade key management service that replaces the file-based Java keystore and leverages the advanced key-management capabilities of Cloudera Navigator Key Trustee. Navigator Key Trustee is designed for secure, authenticated administration and cryptographically strong storage of keys on multiple redundant servers that can be located outside the cluster.
    For more information, see HDFS Transparent Encryption.
  • The Cloudera Manager Server now reports the correct number of physical cores and hyper-threading cores if hyper-threading is enabled.
  • Client configurations - Client configurations are now managed so that they are redeployed when a machine is re-imaged.
  • Configuration
    • NameNode configuration - The decommissioning parameters dfs.namenode.replication.max-streams and dfs.namenode.replication.max-streams-hard-limit are now available.
    • Hue debug options - Two service-level configuration parameters have been added to the Hue service to enable Django debug mode and debugging of internal server error responses.

What's New in Cloudera Manager 5.2.7

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.2.7.

What's New in Cloudera Manager 5.2.6

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.2.6.

What's New in Cloudera Manager 5.2.5

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.2.5.

What's New in Cloudera Manager 5.2.4

There are no changes for Cloudera Manager 5.2.4. It was released to provide the Cloudera Navigator fix in .

What's New in Cloudera Manager 5.2.2

  • HDFS Decommissioning - The following decommissioning properties have been exposed in Cloudera Manager 5.2.2.
    • Maximum number of replication threads on a Datanode (dfs.namenode.replication.max-streams)
    • Hard limit on the number of replication threads on a Datanode (dfs.namenode.replication.max-streams-hard-limit)
  • New icons for the KMS and Key Trustee services.

What's New in Cloudera Manager 5.2.1

This release fixes the “POODLE” vulnerability and a number of other issues. See Issues Fixed in Cloudera Manager 5.2.1.
  • The YARN yarn.nodemanager.recovery.dir property can be configured.
  • A health check indicates whether the HDFS metadata upgrade has not been finalized.

What's New in Cloudera Manager 5.2.0

  • OS and database support - Adds support for Ubuntu Trusty (version 14.04) and PostgreSQL 9.3.
  • Services - the following new services have been added:
    • Isilon - supports the EMC Isilon distributed filesystem.
    • KMS - the Java keystore-based key management server.
    • Key Trustee - the enterprise-grade key management server using Cloudera Navigator Key Trustee.
    • Spark - running Spark applications on YARN. The existing Spark service has been renamed Spark (Standalone).
  • Accumulo - Kerberos authentication is now supported. If you have been using advanced configuration snippets (safety valves) to configure Kerberos with Accumulo, you may now remove those settings and have Cloudera Manager generate the principal and keytab file for you.
  • HDFS Data at Rest Encryption -
  • HBase - Support for configuring hedged reads has been added for HBase. The default configuration is to turn hedged reads off. Cloudera Manager will emit two properties, dfs.client.hedged.read.threadpool.size (default: 0) and dfs.client.hedged.read.threshold.millis (default: 500ms) to hbase-site.xml. For more information, see Hedged Reads.
  • ZooKeeper - the RMI port can be configured. The port is configured using the JDK 7 flag -Dcom.sun.management.jmxremote.rmi.port. The default value is the same as the JMX Agent port. A special value of 0 or -1 disables the setting, and a random port is used. The configuration has no effect on versions earlier than Oracle JDK 7u4.
  • Cloudera Manager Agent configuration
    • The supervisord port can now be configured in the Agent configuration supervisord_port. The change takes effect the next time supervisord is restarted (not simply when the Agent is restarted).
    • Added an Agent configuration local_filesystem_whitelist that allows configuring the list of local filesystems that should always be monitored.
  • Proxy user configuration
    • All services' proxy user configuration properties have been moved to the HDFS service. Other services running on the cluster inherit the configuration values provided in HDFS. If you have previously configured a service to have values different from those configured in HDFS, then the proxy user configuration properties will be moved to that service's Advanced Configuration Snippet (Safety Valve) for core-site.xml to retain existing behavior.

      Oozie and Solr are exceptions to this. Oozie proxy user configuration properties have been moved to Oozie Server Advanced Configuration Snippet (Safety Valve) for oozie-site.xml if they differ from HDFS. Solr proxy user configuration properties have been moved to Solr Service Environment Advanced Configuration Snippet (Safety Valve) if they differ from HDFS.

  • Resource management - YARN and Llama integrated resource management and Llama high availability wizard.
  • New and changed user roles - BDR Administrator, Cluster Administrator, Navigator Administrator, and User Administrator. The Administrator role has been renamed Full Administrator. See Cloudera Manager User Accounts.
  • Configuration UI
    • Cluster-wide configuration - you can view all modified settings and configure log directories, disk space thresholds, and port settings.
    • New configuration layout - the new layout provides an alternate way to view configuration pages. In the classic layout, pages are organized by role group and categories within the role groups. The new layout allows you to filter on configuration status, category, and scope. On each configuration page you can easily switch between the classic and new layout.
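
The hedged-read defaults described in the HBase item above can be pictured with a short sketch. It renders the two properties in the usual Hadoop property layout and checks whether hedged reads would be on. The function names are hypothetical, the value 500 is assumed to be the millisecond form of the quoted default, and the rule that a positive thread pool size turns hedged reads on is stated here as an assumption rather than taken from this release note. It is not the code Cloudera Manager uses to generate hbase-site.xml.

```python
import xml.etree.ElementTree as ET

# Defaults quoted in the release note. A thread pool size of 0 leaves hedged reads off.
HEDGED_READ_DEFAULTS = {
    "dfs.client.hedged.read.threadpool.size": "0",
    "dfs.client.hedged.read.threshold.millis": "500",
}


def render_hedged_read_properties(overrides=None):
    """Return a Hadoop-style <configuration> fragment for the two properties."""
    values = dict(HEDGED_READ_DEFAULTS)
    values.update(overrides or {})
    root = ET.Element("configuration")
    for name, value in values.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def hedged_reads_enabled(xml_text):
    """Assumption: hedged reads count as on only when the pool size is positive."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "dfs.client.hedged.read.threadpool.size":
            return int(prop.findtext("value")) > 0
    return False
```

Calling render_hedged_read_properties() with no arguments reproduces the defaults, and passing an override such as a thread pool size of 20 models turning the feature on.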

What's New in Cloudera Manager 5.1.6

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.1.6.

What's New in Cloudera Manager 5.1.5

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.5.

What's New in Cloudera Manager 5.1.4

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.4.

What's New in Cloudera Manager 5.1.3

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.3.

  • JDK Installation
    • Users who are adding or upgrading hosts can now choose not to install the JDK that ships with Cloudera Manager.

What's New in Cloudera Manager 5.1.2

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.2.

  • New SAML configuration option
    • You can now specify the binding protocol to be used for AuthNResponses sent from the IDP to Cloudera Manager. Previously, Cloudera Manager would only use HTTP-Artifact, but it is now possible to choose HTTP-Post. HTTP-Artifact remains the default binding.

What's New in Cloudera Manager 5.1.1

An issue has been fixed. See Issues Fixed in Cloudera Manager 5.1.1.

What's New in Cloudera Manager 5.1.0

  • SSL Encryption
    • Supports several new SSL-related configuration parameters for HDFS, MapReduce, YARN and HBase, which allow you to configure and enable encrypted shuffle and encrypted web UIs for these services. See Configuring TLS/SSL Encryption for CDH Services.
    • Cloudera Manager now also supports the monitoring of HDFS, MapReduce, YARN, and HBase when SSL is enabled for these services. New configuration parameters allow you to specify the location and password of the truststore used to verify certificates in HTTPS communication with CDH services and the Cloudera Manager Server.
  • Sentry Service
    • A new Sentry service that stores the authorization metadata in an underlying relational database and allows you to use Grant/Revoke statements to modify privileges. See The Sentry Service.
    • You can also configure the Sentry service to allow Pig, MapReduce, and WebHCat queries access to Sentry-secured data stored in Hive. See Configuring Pig and HCatalog for the Sentry Service.
  • Kerberos Authentication
    • Now supports a Kerberos cluster using an Active Directory KDC.
    • New wizard to enable Kerberos on an existing cluster. The wizard works with both MIT KDC and Active Directory KDC.
    • Ability to configure and deploy Kerberos client configuration (krb5.conf) on a cluster.
  • Spark Service - added the History Server role
  • Impala - added support for Llama ApplicationMaster High Availability
  • User Roles - there are two new roles, Operator and Configurator, that support fine-grained access to Cloudera Manager features. See Cloudera Manager User Accounts.
  • Monitoring
    • Updates to Oozie monitoring
    • New Hive metastore canary
  • UI - The UI has been updated to improve scalability. The Home > Status tab can be configured to display clusters in a full or summary format. There is a new Cluster page for each cluster. The Hosts and Instances pages have added faceted filters.

What's New in Cloudera Manager 5.0.7

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.7.

What's New in Cloudera Manager 5.0.6

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.6.

What's New in Cloudera Manager 5.0.5

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.5.

What's New in Cloudera Manager 5.0.2

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.0.2.

What's New in Cloudera Manager 5.0.1

A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.0.1.

  • Monitoring
    • The Java Garbage Collection Duration health test for the Service Monitor, Host Monitor, and Activity Monitor has been replaced with the new Java Pause Duration health test.

What's New in Cloudera Manager 5.0.0

  • Service and Configuration Management
    • HDFS - cache management
  • Resource Management - Impala admission control
  • Monitoring
    • Host disks overview
    • Impala best practices
    • HBase table statistics
    • HDFS cache statistics

What's New in Cloudera Manager 5.0.0 Beta 2

  • Service and Configuration Management
    • HDFS
      • HDFS NFS Gateway role
      • Supports restoration of HDFS data from a snapshot
    • YARN
      • YARN Resource Manager High Availability
      • Resource pool scheduler
    • Support for Spark service
    • Support for Accumulo service
    • Support for service extensibility
    • Support to set up Oozie server High Availability
    • Granular configuration staleness UI
    • Support for setting maximum file descriptors
  • Monitoring
    • Support for monitoring the Cloudera Search/Solr service
    • New "failed" and "killed" badges displayed for unsuccessful YARN applications
    • More attributes available for filtering displays of YARN applications and Impala queries
    • New operational reports added for HBase tables and namespaces, Impala queries, and YARN applications
    • Support for creating user-defined triggers for metrics accessible via charts/tsquery
    • Charting improvements
      • New table chart type
      • New options for displaying data and metadata from charts
      • Support for exporting data from charts to CSV or JSON files
  • Administrative Settings
    • Added a new role type with limited administrator capabilities.
    • Cloudera Manager Server and all JVMs will create a heap dump if they run out of memory.
    • Configure the location of the parcel directory and specify whether and when to remove old parcels from cluster hosts.

What's New in Cloudera Manager 5.0.0 Beta 1

  • CDH Version
    • Supports both CDH 4 and CDH 5
    • CDH 4 to CDH 5 upgrade wizard
    • Support for YARN as a production execution environment
      • MapReduce (MRv1) to YARN (MRv2) configuration import
      • YARN-based resource management for Impala 1.2
  • JDK Version - Cloudera Manager 5 supports and installs both JDK 6 and JDK 7.
  • Resource Management
    • Static and dynamic partitioning of resources: provides a wizard for configuring static partitioning of resources (cgroups) across core services (HBase, HDFS, MapReduce, Solr, YARN) and dynamic allocation of resources for YARN and Impala.
    • Pool, resource group, and queue administration for YARN and Impala.
    • Usage monitoring and trending.
  • Monitoring
    • YARN service monitoring
    • YARN (MRv2) job monitoring
    • Configurable histograms of Impala query and YARN job attributes that can be used to quickly filter query and application lists
    • Scalable back-end database for monitoring metrics
    • Charting improvements
      • New chart types: histogram and heatmap
      • New scale types: logarithmic and power
      • Updates to tsquery language: new attribute values to support YARN and new functions to support new chart types
  • Extensibility
    • Ability to manage both ISV applications and non-CDH services (for example, Accumulo and Spark)
    • Working with select ISVs as part of Beta 1
  • Single Sign-On - Support for SAML to enable single sign-on
  • Parcels
    • Dependency enforcement to ensure incompatible parcels are not used together
    • Option to not cache downloaded parcels, to save disk space
    • Improved error reporting for management operations
  • Backup and Disaster Recovery (BDR)
    • HBase and HDFS snapshots: Supports scheduling snapshots on a recurring basis.
    • Support for YARN (MRv2): Replication jobs can now run using YARN (MRv2) instead of MRv1.
    • Global replication page: All scheduled snapshots (HDFS and HBase) and replication jobs for either HDFS or Hive are shown on a single Replications page.
  • Other
    • Global Search box
    • Several usability improvements
    • Comprehensive detection of configuration changes that require service restarts, refresh and redeployment of client configurations

Incompatible Changes in Cloudera Manager 5

Incompatible Changes Introduced in Cloudera Manager 5.5.0

  • Cloudera Manager no longer supports JDK 1.6.

Incompatible Changes Introduced in Cloudera Manager 5.4.0

  • The Blacklisted Products property has been removed from the Hosts > Parcels configuration.

Incompatible Changes Introduced in Cloudera Manager 5.3.0

  • Oozie metrics - The Oozie metrics framework is now controlled by the Enable The Metrics Instrumentation Service flag, which is enabled by default. When enabled, the old 'instrumentation' REST end-point is disabled and metrics are available on the new 'metrics' REST end-point (hostname:port/v2/admin/metrics).

Incompatible Changes Introduced in Cloudera Manager 5.2.0

  • Due to various internal changes to configuration generation, all service and client configurations will be stale after upgrade. To propagate the updates, restart the cluster and redeploy client configurations.

Incompatible Changes Introduced in Cloudera Manager 5.1.0

  • The Limited Administrator role has been renamed Limited Operator. The Limited Operator role is no longer available in Cloudera Manager Express. If you upgrade a Cloudera Manager Express installation, users in the Limited Operator role will not be able to log in. A user in the Administrator role must assign the Read-Only or Administrator role to those users.

Incompatible Changes Introduced in Cloudera Manager 5.0.0

  • Cloudera Manager API
    • New upgradeCdh command, which upgrades CDH cluster versions. Use this command to upgrade clusters from CDH 4 to CDH 5. The upgradeServices command previously used to upgrade CDH cluster versions is no longer supported.
    • The hostId field now contains a unique UUID and no longer matches the hostName field. When referring to a host, both hostId and hostName are accepted. However, any API clients that were previously cross-referencing host records with external information by hostName, but were using the hostId field in the API, must be updated to use the hostName field. Clients updated in this manner will function correctly with older versions of Cloudera Manager because the hostName field has always been present.
    • The clusterName field displayed when viewing service and role references is now an internal name and may not match the external displayName field of the cluster.
  • All CDH 5 versions of Hue work only with the default system Python version of the operating system they are installed on. For example, on RHEL/CentOS 6, you need Python 2.6 to start Hue.
  • Cloudera Manager 5.0 includes a change to the value of the snmpTrapOID. Earlier releases set the value of snmpTrapOID (OID: .1.3.6.1.6.3.1.1.4.1.0) wrongly to clouderaManagerMIBNotifications (OID .1.3.6.1.4.1.38374.1.1.1). This is fixed in Cloudera Manager 5.0 with the correct value, which is clouderaManagerAlert (OID .1.3.6.1.4.1.38374.1.1.1.1). This change will break SNMP server setups that are configured to expect clouderaManagerMIBNotifications. Cloudera Manager administrators should configure their SNMP receivers to accept the corrected OID.
  • The default values for the following configurations have changed to include the JVM option -Djava.net.preferIPv4Stack=true, which sets the preferred protocol stack to IPv4 on dual-stack machines. Any values set to the old defaults will automatically be changed to the new default when upgrading to Cloudera Manager 5.
    • MapReduce client configuration:
      • hadoop-env.sh: added to HADOOP_CLIENT_OPTS
      • mapred-site.xml: added to mapred.child.java.opts
    • YARN client configuration:
      • hadoop-env.sh: added to YARN_OPTS
      • mapred-site.xml: added to yarn.app.mapreduce.am.command-opts, mapreduce.map.java.opts, and mapreduce.reduce.java.opts
    • HDFS client configuration: hadoop-env.sh: added to HADOOP_CLIENT_OPTS
    • Hive client configuration: hive-env.sh: added to HADOOP_CLIENT_OPTS
  • MapReduce health tests have been removed:
    • Job failure
    • Map backlog
    • Reduce backlog
    • Map locality
    If needed, these tests can be replaced with triggers. For example:
    • Look at all the jobs that completed in the last hour and, if more than 10% of them failed, change the health of the service to concerning:
      IF (select (jobs_failed_rate * 3600) as jobs_failed,
      ((jobs_failed_rate + jobs_completed_rate + jobs_killed_rate) * 3600)
      as all_jobs where roleType=JOBTRACKER AND serviceName=$SERVICENAME
      and last(jobs_failed_rate / (jobs_failed_rate + jobs_completed_rate +
      jobs_killed_rate)) >= 10 ending at $END_TIME duration "PT3600S")
      DO health:concerning
    • If there are more than 50% maps waiting than total slots available, health goes concerning.
      IF (select waiting_maps / map_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME
      and last(waiting_maps / map_slots) > 50)
      DO health:concerning
    • If there are more than 50% reduces waiting than total slots available, health goes concerning.
      IF (select waiting_reduces / reduce_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME
      and last(waiting_reduces / reduce_slots) > 50)
      DO health:concerning
  • HDFS checkpointing metrics have been removed:
    • end_checkpoint_num_ops
    • end_checkpoint_avg_time
    • start_checkpoint_num_ops
    • start_checkpoint_avg_time
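
The hostId change described under Cloudera Manager API above means that a client joining host records to external data should key on hostName. A minimal sketch follows. The records are made-up sample data shaped after the two field names used in this note, the inventory and the function name are hypothetical, and no real API call is shown.

```python
# Illustrative host records using the field names from the note above.
# The values are invented; real records would come from the Cloudera Manager API.
hosts = [
    {"hostId": "2c2e951c-aaf2-4780-a69f-0382181f1821", "hostName": "node1.example.com"},
    {"hostId": "7f0c5a1e-3b7d-4c55-9a57-5d1c0e6a9b42", "hostName": "node2.example.com"},
]

# External inventory keyed by host name (hypothetical data).
rack_by_hostname = {
    "node1.example.com": "rack-a",
    "node2.example.com": "rack-b",
}


def racks_for_hosts(host_records, inventory):
    """Join host records to external data by hostName rather than hostId."""
    return {h["hostName"]: inventory.get(h["hostName"]) for h in host_records}


def racks_for_hosts_by_id(host_records, inventory):
    """The pre-5.0 habit: keying on hostId, which is now a UUID and finds nothing."""
    return {h["hostId"]: inventory.get(h["hostId"]) for h in host_records}
```

With the sample data, the hostName join resolves every host, while the hostId join returns None for each entry because the inventory has no UUID keys.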

Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 2

  • Impala releases earlier than 1.2.1 are no longer supported.
  • Some of the constants identifying health tests have changed. The following existed in Cloudera Manager 4:
    • FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • FAILOVERCONTROLLER_HOST_HEALTH
    • FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • FAILOVERCONTROLLER_SCM_HEALTH
    • FAILOVERCONTROLLER_UNEXPECTED_EXITS

    They are now:

    • MAPREDUCE_FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • MAPREDUCE_FAILOVERCONTROLLER_HOST_HEALTH
    • MAPREDUCE_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • MAPREDUCE_FAILOVERCONTROLLER_SCM_HEALTH
    • MAPREDUCE_FAILOVERCONTROLLER_UNEXPECTED_EXITS

    and

    • HDFS_FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • HDFS_FAILOVERCONTROLLER_HOST_HEALTH
    • HDFS_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • HDFS_FAILOVERCONTROLLER_SCM_HEALTH
    • HDFS_FAILOVERCONTROLLER_UNEXPECTED_EXITS

    The reason for the change is to better distinguish between MapReduce and HDFS failover controller monitoring in the health system.
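
For scripts that still reference the Cloudera Manager 4 constants, the renaming listed above can be expressed as a small lookup. This is a sketch under the assumption that each old constant corresponds to both a MapReduce and an HDFS constant, as the two lists suggest. The function name is hypothetical.

```python
# The five health test constants that existed in Cloudera Manager 4.
OLD_CONSTANTS = {
    "FAILOVERCONTROLLER_FILE_DESCRIPTOR",
    "FAILOVERCONTROLLER_HOST_HEALTH",
    "FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE",
    "FAILOVERCONTROLLER_SCM_HEALTH",
    "FAILOVERCONTROLLER_UNEXPECTED_EXITS",
}


def renamed_constants(old_name):
    """Return the MapReduce and HDFS constants that replace an old constant."""
    if old_name not in OLD_CONSTANTS:
        raise ValueError(f"not a renamed health test constant: {old_name}")
    return [f"MAPREDUCE_{old_name}", f"HDFS_{old_name}"]
```

A caller has to decide which of the two returned names applies, depending on whether the failover controller being monitored belongs to MapReduce or HDFS.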

Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 1

  • Services
    • Impala - With Cloudera Manager 4.8 (released in late November 2013), only Impala 1.2.1 is supported, due to the introduction of the Impala Catalog Server. However, CDH 5.0.0 Beta 1 was released with Impala 1.2.0 (Beta). Therefore, if you upgrade from Cloudera Manager 4.8 (with Impala 1.2.1) to Cloudera Manager 5.0.0 Beta 1, and then upgrade your CDH to CDH 5.0.0 Beta 1, your version of Impala will be downgraded to Impala 1.2.0 from 1.2.1. This will result in some loss of functionality. See New Features in Impala for a list of the new features in Impala 1.2.1 that are not in Impala 1.2.0 (Beta).
    • Hive - HiveServer 2 is a mandatory role for Hive in CDH 5.
    • Hue - In CDH 5, Hue no longer has a Beeswax Server role. Hue now submits queries to HiveServer2.
    • HDFS - Cloudera Manager 5 does not support NFS-mounted shared edits directories for HDFS High Availability. It only supports the Quorum Journal method for shared edits. If you upgrade from Cloudera Manager 4 with a working CDH 4 High Availability configuration that uses NFS-mounted directories, your installation will continue to work until you disable High Availability. You will not be able to re-enable High Availability with NFS-mounted directories. Furthermore, you will not be able to upgrade to CDH 5 unless you disable High Availability, and you will need to use Quorum-based storage in order to re-enable High Availability after the upgrade.
    • YARN
      • The YARN (MRv2) configuration mapreduce.job.userlog.retain.hours has been replaced by yarn.log-aggregation.retain-seconds. Any existing value in mapreduce.job.userlog.retain.hours will be lost. However, this configuration never had any effect, so no functionality is affected.
      • The following configuration parameters were removed from YARN. These never had any effect, so no functionality is affected.
        • mapreduce.jobtracker.maxtasks.perjob
        • mapreduce.jobtracker.handler.count (non-functional duplicate of yarn.resourcemanager.resource-tracker.client.thread-count)
        • mapreduce.jobtracker.persist.jobstatus.active
        • mapreduce.jobtracker.persist.jobstatus.hours
        • mapreduce.job.jvm.numtasks
      • The following YARN configuration parameters were replaced. Only the YARN parameters were replaced. Old configurations will be lost, but they never had any effect so this does not affect functionality.
        • mapreduce.jobtracker.restart.recover replaced by yarn.resourcemanager.recovery.enabled (changed from Gateway to ResourceManager)
        • mapreduce.tasktracker.http.threads replaced by mapreduce.shuffle.max.connections
        • mapreduce.jobtracker.staging.root.dir replaced by yarn.app.mapreduce.am.staging-dir
      • Cloudera Manager 5 sets the default YARN Resource Scheduler to FairScheduler. If a cluster was previously running YARN with the FIFO scheduler, it will be changed to FairScheduler the next time YARN restarts. The FairScheduler is only supported with CDH 4.2.1 and later, and older clusters may hit failures and need to manually change the scheduler to FIFO or CapacityScheduler. See the Known Issues section of this Release Note for information on how to change the scheduler back to FIFO or CapacityScheduler.
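
Before upgrading, it can help to check an exported YARN configuration against the removed and replaced parameters listed above. The sketch below classifies the keys of a plain dictionary. The function name and the dictionary input are assumptions made for illustration, and the sketch only reports names: it does not carry values over, since the note states that old values are lost.

```python
# Parameters replaced in YARN, mapped to their replacements (from the list above).
REPLACED = {
    "mapreduce.job.userlog.retain.hours": "yarn.log-aggregation.retain-seconds",
    "mapreduce.jobtracker.restart.recover": "yarn.resourcemanager.recovery.enabled",
    "mapreduce.tasktracker.http.threads": "mapreduce.shuffle.max.connections",
    "mapreduce.jobtracker.staging.root.dir": "yarn.app.mapreduce.am.staging-dir",
}

# Parameters removed from YARN without a replacement (from the list above).
REMOVED = {
    "mapreduce.jobtracker.maxtasks.perjob",
    "mapreduce.jobtracker.handler.count",
    "mapreduce.jobtracker.persist.jobstatus.active",
    "mapreduce.jobtracker.persist.jobstatus.hours",
    "mapreduce.job.jvm.numtasks",
}


def audit_yarn_config(config):
    """Sort configuration keys into replaced, removed, and unaffected groups."""
    report = {"replaced": {}, "removed": [], "unaffected": []}
    for key in sorted(config):
        if key in REPLACED:
            report["replaced"][key] = REPLACED[key]
        elif key in REMOVED:
            report["removed"].append(key)
        else:
            report["unaffected"].append(key)
    return report
```

Any key reported under "replaced" needs its value set again on the new parameter after the upgrade, in the units the new parameter expects.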

Changed Features and Behaviors in Cloudera Manager 5

What's Changed in Cloudera Manager 5.13

  • External Accounts

    The AWS Credentials and Altus Credentials under Administration have been consolidated. You can find the pages for the credentials by selecting Administration > External Accounts.

  • RPC Wait Behavior

    When starting or restarting HDFS, Cloudera Manager now waits for any NameNode daemons that are started to begin responding to RPC requests before considering the start/restart operation to be complete. Previously, this wait was only performed in certain workflows; now, it is always done when starting HDFS.

  • Hue Load Balancer Enabled by Default

    For better Hue web site performance, the Hue Load Balancer is now enabled by default. In a secure cluster, the Hue Load Balancer (Apache httpd 2.4) requires "use_x_forwarded_host" to be set to "true". This change will cause staleness.

What's Changed in Cloudera Manager 5.12

  • Backup and Disaster Recovery
    • Replicate Impala metadata feature for CDH 5.12 or later

      Replication of “native” Impala UDFs (transient UDFs created with the old syntax prior to CDH 5.7) is no longer available for CDH clusters that run version 5.12 or later because Impala native UDFs are deprecated. If you still use Impala native UDFs and want them to be replicated, you should recreate them within the Hive Metastore using the new Create Function syntax supported since CDH 5.7.

    • Dynamic Resource Pool Scheduling Rules

      You can no longer specify a one-time scheduling rule for a dynamic resource pool. Recurring scheduling rules are still supported. To make one-time changes to the resource pool configuration, update the pools using the Cloudera Manager UI or API.

  • Add Cluster Wizard

    MapReduce1 has been removed as an option from the Add Cluster Wizard. It can still be added to a cluster after initial cluster creation or through the API.

  • Web UI
    • Icons

      The icons in the Cloudera Manager web UI are more usable and distinct.

    • Stack Traces

      By default, stack traces are no longer displayed in the web UI. To view stack traces of exceptions generated by Cloudera Manager, open the Cloudera Manager Server log under Diagnostics > Server Log.

  • Key Trustee Parcel

    The Key Trustee Parcel will no longer be released via archive.cloudera.com. The parcel will now be released via http://www.cloudera.com/downloads. Parcels already released on the archive site will continue to be available there.

  • Database names and DB usernames for the Cloudera Manager database

    You can now use only alphanumeric characters and underscores for database names and usernames with the scm_prepare_database.sh script. This restriction ensures that the database names and usernames are properly supported.

  • Cloudera Manager Agent Dependencies

    The Cloudera Manager Agent has new package dependencies. The netstat and ifconfig commands have been replaced with the ss and ip commands respectively.

    For more information, see iproute package requirement.
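
The naming restriction for scm_prepare_database.sh described above can be checked ahead of time with a short sketch. It assumes that "alphanumeric" means ASCII letters and digits. The pattern and the function name are illustrative and are not taken from the script itself, which may apply further rules.

```python
import re

# Assumption: ASCII letters, digits, and underscores only, with at least one character.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_supported_db_identifier(name):
    """Return True if a database name or username uses only the allowed characters."""
    return _ALLOWED_NAME.fullmatch(name) is not None
```

For example, a name such as scm_db1 passes, while scm-db and names containing spaces do not.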

What's Changed in Cloudera Manager 5.11

  • Change to default fencing method for HDFS High Availability

    The default value of the HDFS High Availability Fencing Methods property (dfs.ha.fencing.methods) has been changed from shell(./cloudera_manager_agent_fencer.py) to shell(true). While the old default value was reasonable for early versions of HDFS, it is no longer recommended for HAQJ-based versions of HDFS HA, as it can lead to HDFS service outages. The new default, shell(true), is the setting that Cloudera recommends. It uses the built-in HDFS fencing method, which causes the fenced NameNode to exit if it attempts a write operation when it is not supposed to be active. If auto-restart is enabled for the NameNode, Cloudera Manager will then restart it.

    Most clusters already have the HDFS High Availability Fencing Methods explicitly set to shell(true) since that is the value set by the HDFS Enable High Availability Wizard. When such clusters are upgraded to Cloudera Manager 5.11, this explicit setting of shell(true) will be removed. This does not change the effective value of the property since it takes on the new default value, which is also shell(true). The cluster will experience no change in fencing behavior.

    When clusters that have HDFS High Availability Fencing Methods set to the pre-5.11 default value are upgraded to Cloudera Manager 5.11, the effective value of the property changes to shell(true), and the fencing behavior changes accordingly. This also causes the HDFS service to become stale, requiring a restart of HDFS and dependent services. If HDFS High Availability Fencing Methods is set to a non-default value other than shell(true), no change occurs in the property's value when the cluster is upgraded to Cloudera Manager 5.11. Because the fencing method shell(./cloudera_manager_agent_fencer.py) can lead to service outages, a new configuration warning message will be displayed if it is in use after the upgrade to Cloudera Manager 5.11 is complete.
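
The three upgrade cases above can be summarized as a small decision function. This is a simplified model, not Cloudera Manager's upgrade logic: None stands for a property left at its default, an explicit value equal to the old default is treated the same way, and the warning is assumed to apply whenever a remaining custom value still contains the old fencer script. All names are hypothetical.

```python
from typing import NamedTuple, Optional

OLD_DEFAULT = "shell(./cloudera_manager_agent_fencer.py)"
NEW_DEFAULT = "shell(true)"


class UpgradeResult(NamedTuple):
    explicit_value: Optional[str]  # value stored after the upgrade (None = default)
    effective_value: str           # value HDFS actually uses after the upgrade
    stale: bool                    # True if HDFS needs a restart
    warning: bool                  # True if the old fencer is still in use


def fencing_after_upgrade(explicit_value: Optional[str]) -> UpgradeResult:
    """Model the effect of upgrading to Cloudera Manager 5.11 on dfs.ha.fencing.methods."""
    if explicit_value is None or explicit_value == OLD_DEFAULT:
        # Pre-5.11 default: the effective value changes, so the service becomes stale.
        return UpgradeResult(None, NEW_DEFAULT, True, False)
    if explicit_value == NEW_DEFAULT:
        # Explicit shell(true): the explicit setting is removed, behavior is unchanged.
        return UpgradeResult(None, NEW_DEFAULT, False, False)
    # Any other custom value is kept as-is; warn if it still uses the old fencer.
    return UpgradeResult(explicit_value, explicit_value, False, OLD_DEFAULT in explicit_value)
```

Only the first case requires a restart of HDFS and dependent services. The other two leave the running behavior as it was before the upgrade.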

What's Changed in Cloudera Manager 5.10

  • Configuration Changes
    • Llama configuration options removed

      Llama roles must be removed before upgrading to CDH 5.10.0 or higher. If your cluster has an Impala Llama role and you are using Cloudera Manager to upgrade, Cloudera Manager displays an error message and prevents the upgrade from going forward. You must first remove any existing Llama roles using the Disable YARN and Impala Integrated Resource Management command: before upgrading, go to the Impala service and select Actions > Disable YARN and Impala Integrated Resource Management. Cloudera Manager also generates a configuration error if a Llama role is added to a CDH 5.10 cluster, and prevents Impala from starting until the role is removed.

    • JAVA_HOME in agent configuration deprecated in 5.x

      Specifying the host agent environment variable CMF_AGENT_JAVA_HOME is deprecated and will not be supported in a future release. Instead, specify the Java Home Directory property in Cloudera Manager.

    • The hbase.client.scanner.timeout.period property is now configurable for clients

      The Scan API-related property hbase.client.scanner.timeout.period in the HBase service is now configurable at the Client (Gateway) level. Its value can be set to less than or equal to the RegionServer-side equivalent, HBase RegionServer Lease Period.

  • Warning if KeyTrustee Server is located on cluster machines

    The KeyTrustee Server should be installed on dedicated hosts in a separate cluster. Installing it on CDH cluster hosts is adequate for proof-of-concept deployments, but Cloudera does not recommend such installations for production environments. Cloudera Manager now displays a warning when you add a KeyTrustee Server service to an existing CDH cluster host.
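
For the hbase.client.scanner.timeout.period item above, the constraint between the client-side value and the RegionServer lease period can be checked before a configuration change is applied. The following Python sketch is illustrative only; the function and parameter names are hypothetical, and it assumes both values are expressed in milliseconds.

```python
def scanner_timeout_is_valid(client_timeout_ms, regionserver_lease_ms):
    """Check a client (Gateway) scanner timeout against the RegionServer lease period.

    Both arguments are assumed to be in milliseconds. The client-side value
    is acceptable when it is positive and no greater than the server-side value.
    """
    if client_timeout_ms <= 0 or regionserver_lease_ms <= 0:
        raise ValueError("timeouts must be positive")
    return client_timeout_ms <= regionserver_lease_ms
```

With a RegionServer lease period of 60000 ms, a client value of 30000 ms passes the check and a client value of 90000 ms does not.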

What's Changed in Cloudera Manager 5.9.0

  • You must have the BDR, Cluster, or Full Administrator role to view the following pages:
    • HDFS File Browser
    • Directory Usage Report
    • HBase Table Browser
    • Solr Collections
    • HBase Table Statistics
    Previously, you only needed the Read-Only role.

What's Changed in Cloudera Manager 5.8.0

  • The YARN service's list of Allowed System Users now includes the hbase user by default. Several essential HBase tools, such as the MOB Sweeper, the Import/Export tools, and CopyTable, need to interact with HBase as the hbase user to execute MapReduce jobs.

    Note that this change is only applicable to new Cloudera Manager deployments. Upgrading to Cloudera Manager 5.8 will not add the hbase user to the list of defaults.

What's Changed in Cloudera Manager 5.7.0

  • The Navigator Metadata Server requires 192 MiB of Java PermGen space instead of 128 MiB. The value of this internal setting used by the JDK is increased automatically when upgrading to Cloudera Manager 5.7.
  • The default value for hive.compute.query.using.stats is changed to false. The reason for the change is that certain queries such as count, max, and min return incorrect results with this optimization on.
  • By default, Hive now considers only sessions with no recent activity to be idle (hive.server2.idle.session.timeout_check_operation), and the idle timeouts for sessions and operations have been reduced (hive.server2.idle.session.timeout and hive.server2.idle.operation.timeout). This helps reduce the strain on HiveServer2 from too many open sessions.
  • Cloudera Manager no longer automatically refreshes scheduler configurations when dynamic resource pool settings are changed. You must explicitly refresh the configurations. This allows you to schedule the changes to minimize the impact on your cluster.
  • For YARN, the default number of log directories (yarn.nodemanager.log-dirs) has changed from 1 to be equal to the number of mount points, to prevent applications with a large number of logs from filling up a single disk.
  • The default for Java Heap Size of JournalNode in Bytes is now 512 MB.
  • The Sources page for HDFS and Hive replications has been removed. A list of sources is available from a drop-down menu when you schedule a replication.
  • The number of watched directories you can specify for the Disk Usage Report is now unlimited.
  • Cloudera Manager now uses a new memory allocation algorithm to allocate memory when multiple roles are installed on the same host. See Memory.
  • User sessions in the Cloudera Manager Admin Console now time out after a configurable period of inactivity. A dialog box warns the user before the automatic logout.
  • The All Recent Commands page now loads more quickly.
  • The Disk Usage reports now have links that take users to the Directory Usage Report with the correct filter applied.
  • When searching for hosts on the Hosts page, you can now filter the hosts list by entering search terms (hostname, IP address, or role) in the search box separated by commas or spaces. You can use quotes for exact matches (for example, strings that contain spaces, such as a role name) and brackets to search for ranges. Hosts that match any of the search terms are displayed.
  • Isilon is now supported as a source or destination service for HDFS replications.
  • For CDH 5.7 and higher, if CDH_PYTHON is set by a Spark plug-in, PYSPARK_PYTHON is set to CDH_PYTHON in spark-env.sh. If you install a Python runtime parcel, such as the Anaconda parcel, redeploying the Spark client configuration automatically configures Python Spark jobs that run in YARN client or YARN cluster mode.
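
The host search behavior described in the list above (terms separated by commas or spaces, quotes for exact matches, and a match on any term) can be approximated as follows. This Python sketch is not the Cloudera Manager implementation: the function names are hypothetical, treating unquoted terms as case-insensitive substring matches is an assumption, and bracket ranges are left out because their syntax is not given here.

```python
import re

# A term is either a double-quoted phrase or a run of characters
# that contains no whitespace, commas, or quotes.
_TERM = re.compile(r'"([^"]*)"|([^\s,"]+)')

def parse_terms(query):
    """Return a list of (text, exact) pairs parsed from a search string."""
    terms = []
    for quoted, bare in _TERM.findall(query):
        if quoted:
            terms.append((quoted, True))
        elif bare:
            terms.append((bare, False))
    return terms

def host_matches(fields, terms):
    """Return True if any term matches any field (hostname, IP address, or role)."""
    for text, exact in terms:
        for field in fields:
            if exact and field.lower() == text.lower():
                return True
            # Assumption: unquoted terms match as case-insensitive substrings.
            if not exact and text.lower() in field.lower():
                return True
    return False
```

For instance, `parse_terms('node1, 10.0.0.5 "HBase Master"')` yields two unquoted terms and one exact term, and a host is listed when at least one of them matches.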

What's Changed in Cloudera Manager 5.5.0

  • Removed -XX:-CMSConcurrentMTEnabled from the default JVM options. This setting makes the JVM run in single-threaded mode. It was needed for Java 1.6_31 and lower but not for Java 1.6_32 or higher. If you use Java 1.6_31 or lower, upgrade to the latest recommended version of Java 1.7.

    This change causes all roles to be stale after you upgrade to Cloudera Manager 5.5 and they are indicated as requiring restart in the Cloudera Manager Admin Console. However, as with any upgrade, this is a valid, functional Cloudera Manager state, and the cluster only needs to be restarted when you want the new configurations to take effect.

  • HADOOP_USER_CLASSPATH_FIRST is now adhered to in Hadoop client configurations. After you upgrade Cloudera Manager, services display a client configuration redeployment required icon.
  • For RHEL 7, the force_start, fast_*, clean_*, and hard_* commands on the server-scm-* services no longer work, as custom start, restart, and stop commands are not supported on systemd based distributions. These have been replaced with *_next_* operations, which do not trigger an immediate operation, but signal that the next invoked operation will be forced, fast, clean, or hard.
  • The Cloudera EULA is now shown when using the Cloudera Manager Admin Console for the first time.
  • The Home tab has been removed from the Cloudera Manager Admin Console navigation bar. You can return to the Home page Status tab by clicking the Cloudera Manager logo.
  • All the icons have been refreshed to make them cleaner and easier to read.
  • In 5.4.0, an externally assigned role was combined with a Cloudera Manager assigned role, and the user had the union of the role privileges. As a consequence, an external user could be assigned an administrator role in Cloudera Manager and would be an administrator regardless of the externally assigned role. Now only the externally assigned roles are respected. No roles can be assigned to an external user in Cloudera Manager, and any roles for an external user in Cloudera Manager are ignored.

    As a result of this change, external users with previously assigned Cloudera Manager roles will have their permissions modified depending on the LDAP group they belong to. To restore permissions for external users, configure the LDAP groups for these users: go to Administration > Settings and click Category > External Authentication to display the relevant properties.

  • Cloudera Manager and CDH components support TLS 1.0, TLS 1.1, and TLS 1.2, but not SSL 3.0. SSL remains part of the TLS/SSL name for historical reasons. For the complete list of supported versions, see CDH and Cloudera Manager Supported Transport Layer Security Versions.
  • The label on the Generate Credentials button has been changed to Generate Missing Credentials to better reflect the fact that it only creates Kerberos principals that are not present yet in Cloudera Manager.
  • Cloudera Manager now downloads binaries from https://archive.cloudera.com instead of http://archive.cloudera.com.
  • The embedded hbck feature has been removed from HBase monitoring for stability reasons.
  • Increased the default heap sizes for Hive roles. On clusters with sufficient memory, newly created Hive roles have these values:
    • HiveServer2 - 4 G heap, 512 M perm gen
    • Hive Metastore - 8 G heap, 512 M perm gen
    • Gateway - 2 G heap, 512 M perm gen
  • By default, Oozie now purges eligible completed workflows and coordinator actions for long-running coordinator jobs.
  • Oozie actions that omit the <job-tracker> and <name-node> elements (and the workflow does not define them in the <global> section) use the default values for the JobTracker, Resource Manager, and NameNode from Cloudera Manager in CDH 5.5 and higher.
  • Increased the defaults for Oozie parameters:
    • oozie.service.CallableQueueService.callable.concurrency - 10
    • oozie.service.CallableQueueService.threads - 50
  • Sqoop 2 is no longer in the default services to be created in any of the options in the installation wizard. You can choose to add it to the Custom Services option in the Installation wizard or can add it with the Add Service wizard after installation.
  • For CDH 5.5.0 and higher the default values of the YARN properties mapreduce.[map|reduce].java.opts.max.heap and mapreduce.[map|reduce].memory.mb have been changed to 0, which tells YARN to automatically select a default. This helps avoid issues where either heap or memory.mb is updated, but not the other one (memory.mb should be ~30% higher than heap to allow for JVM overhead).
  • The Host DNS Resolution Duration health test was removed. Its functionality is now covered in the Host DNS Resolution health test.
  • The default Replication Strategy is now Dynamic.
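
The relationship noted above between the MapReduce heap and memory.mb settings (memory.mb roughly 30% higher than the heap) is simple arithmetic. The Python sketch below is a worked example, not how YARN or Cloudera Manager selects defaults; the function names and the exact 30% figure used as the default overhead are illustrative assumptions.

```python
def memory_mb_for_heap(heap_mb, overhead_pct=30):
    """Container size in MB that leaves overhead_pct headroom above the heap, rounded up."""
    return (heap_mb * (100 + overhead_pct) + 99) // 100

def heap_for_memory_mb(memory_mb, overhead_pct=30):
    """Largest heap in MB that still leaves overhead_pct headroom inside the container."""
    return memory_mb * 100 // (100 + overhead_pct)
```

Under these assumptions, a 1024 MB heap corresponds to a container of 1332 MB, and a 2048 MB container leaves room for a heap of 1575 MB.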

What's Changed in Cloudera Manager 5.4.1

HDFS Read Throughput Impala query monitoring property is misleading

The hbase_bytes_read_per_second and hdfs_bytes_read_per_second Impala query properties have been renamed to hbase_scanner_average_bytes_read_per_second and hdfs_scanner_average_bytes_read_per_second to more accurately reflect that these properties return the average throughput of the query's HBase and HDFS scanner threads respectively. The previous names and descriptions gave the impression that these properties were the query's total HBase and HDFS throughput, which was not accurate.
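
Saved filters or scripts that refer to the old property names may need to be updated. The following Python sketch rewrites the old names to the new ones in an arbitrary string; the function name and the sample filter expression are hypothetical, and only the four property names come from the text above.

```python
import re

# Old Impala query property names mapped to their replacements.
RENAMED = {
    "hbase_bytes_read_per_second": "hbase_scanner_average_bytes_read_per_second",
    "hdfs_bytes_read_per_second": "hdfs_scanner_average_bytes_read_per_second",
}

_OLD_NAMES = re.compile(r"\b(" + "|".join(map(re.escape, RENAMED)) + r")\b")

def update_property_names(expression):
    """Replace old property names with the new ones, matching whole identifiers only."""
    return _OLD_NAMES.sub(lambda m: RENAMED[m.group(1)], expression)
```

Because the pattern matches whole identifiers, running the function on text that already uses the new names leaves it unchanged.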

What's Changed in Cloudera Manager 5.4.0

  • Cloudera Manager checks the specified version of CDH before an installation and upgrade to ensure that it is compatible with Cloudera Manager before proceeding. Specifically, for Cloudera Manager 5.4 that means no version of CDH newer than 5.4.x is supported (Cloudera Manager must be upgraded before upgrading to such a version of CDH). Cloudera Manager no longer shows these "too-new" versions of CDH. The 'latest' parcel repository URL will be replaced by the 'latest_supported' repository in the parcel configuration.
  • The minimum Java heap size for the Activity Monitor, Host Monitor, and Service Monitor has been changed from 50 MB to 256 MB.
  • Regenerating Kerberos principals will be denied if any roles that are using those principals are running. Stop those roles and then attempt to regenerate the principals.
  • In previous versions of Cloudera Manager, the 'version' attribute in tsquery had integer values, for example, 4 for CDH4, 5 for CDH5, and -1 for Cloudera Manager. Starting in Cloudera Manager 5.4, the values for the 'version' attribute are in release string format, for example "cdh5.0.0".
  • Hive
    • hive.exec.reducers.max default value changed from 999 to 1099
    • hive.exec.reducers.bytes.per.reducer default value changed from 1 GB to 64 MB
    • The default heap size for the Hive CLI is increased to 1 GB.
    • The property hive.log.explain.output is known to make Cloudera Manager Agents unstable in some specific circumstances, especially when Hive queries generate extremely large EXPLAIN outputs. Therefore, the property has been hidden from the Cloudera Manager configuration UI. The property can still be configured through advanced configuration snippets.
  • Impala - The Impala Daemon now supports the Impala Maximum Log Files property which specifies the total number of log files per severity level that should be retained before they are deleted. By default, after upgrading to CDH 5.4 this property is set to 10, which means that Impala Daemons will only retain up to 10 log files for each severity level. Any additional files will be deleted.
  • HBase - Moved three settings for HBase coprocessors from Main to Advanced category:
    • Service Wide > HBase Coprocessor Abort on Error: moved to Service Wide > Advanced > HBase Coprocessor Abort on Error
    • Master Default Group > HBase Coprocessor Master Classes: moved to Master Default Group > Advanced > HBase Coprocessor Master Classes
    • RegionServer Default Group > HBase Coprocessor Region Classes: moved to RegionServer Default Group > Advanced > HBase Coprocessor Region Classes
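
The CDH compatibility check described in the first item of the list above (for Cloudera Manager 5.4, no CDH version newer than 5.4.x) amounts to comparing major and minor version numbers. This Python sketch illustrates that comparison only; the function names are hypothetical, it assumes plain dotted version strings, and it tests only the "too new" condition.

```python
def major_minor(version):
    """Return the (major, minor) integers of a dotted version string such as '5.4.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)

def cdh_not_too_new(cm_version, cdh_version):
    """True if the CDH major.minor is not newer than the Cloudera Manager major.minor."""
    return major_minor(cdh_version) <= major_minor(cm_version)
```

With Cloudera Manager 5.4.0, CDH 5.4.3 passes this check, while CDH 5.5.0 does not until Cloudera Manager itself is upgraded.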

What's Changed in Cloudera Manager 5.3.2

  • Turning on the internal HBase canary (not to be confused with Cloudera Manager monitoring canary) is optional. On new clusters, it will not be enabled by default. Existing clusters will continue to run the canary until it is disabled from the HBase configuration page.

What's Changed in Cloudera Manager 5.3.0

  • Cloudera Manager upgrade - If any commands are active when you upgrade, the server fails to start after the upgrade. This includes commands that a user ran and commands that Cloudera Manager triggers automatically, either in response to a state change or on a schedule.

What's Changed in Cloudera Manager 5.2.1

  • The default value of the YARN yarn.nodemanager.recovery.dir property has changed from {hadoop.tmp.dir}/yarn-nm-recovery to /var/lib/hadoop-yarn/yarn-nm-recovery.

What's Changed in Cloudera Manager 5.2.0

  • Rolling upgrade - As a result of a recent change in the way DataNodes handle block deletions during a rolling upgrade (HDFS-5907), the Trash directory may grow unexpectedly while the upgrade is in progress. Deleted blocks are kept during upgrade in case you want to roll back. The blocks are cleaned up after you finalize the upgrade.
  • Agent -
    • The hard_stop, hard_restart, and clean_restart commands now show a warning message about the impact of using these commands instead of performing the actions. To actually perform the actions, you use the hard_stop_confirmed, hard_restart_confirmed, and clean_restart_confirmed commands.
    • The default supervisord port is changed from 9001 to 19001
  • YARN application attributes renamed: slot_millis to slots_millis and fallow_slot_millis to fallow_slots_millis

What's Changed in Cloudera Manager 5.1.0

  • UI refresh for scalability
  • Revised authorization privilege model in Sentry. See Privilege Model.

What's Changed in Cloudera Manager 5.0.0

  • MapReduce now inherits topology from HDFS NameNode. Topology configuration for MapReduce JobTracker was removed. The configuration was redundant and the two parameters should always have been set to the same value.
  • UI
    • The Clusters tab no longer has Activities, Other, and Manage Resources sections.

What's Changed in Cloudera Manager 5.0.0 Beta 2

  • Product
    • Cloudera Backup and Disaster Recovery (BDR) is now included with Cloudera Enterprise.
    • Cloudera Standard has been renamed to Cloudera Express.
  • OS and packaging
    • The name of the Cloudera Manager embedded database package has changed from cloudera-manager-server-db to cloudera-manager-server-db-2. For details, read the upgrade and install topics for your OS.
    • Support for Ubuntu 10.04 and Debian 6.0 is deprecated.
  • HDFS - Enabling High Availability automatically enables auto-failover, unlike in Cloudera Manager 4, where enabling auto-failover was a separate command.
  • HBase
    • In CDH 5 there is no HBase canary because HBase is now monitored by a watchdog process. In CDH 4, the HBase canary is still used.
    • The RegionServer default heap size has been increased to 4GB.
  • Monitoring
    • Chart "Views" and actions related to views have been renamed to "Dashboard".
    • Changes to how attribute filters are displayed in the Impala queries and YARN applications screens
    • The outdated configuration indicator on the Home, service, and role pages has a new graphic and now has a tooltip that displays whether a cluster refresh or restart is required. There is a new indicator for changes that require redeploying client configurations. You can click an indicator to go to the new Stale Configurations page to view and resolve the conditions that gave rise to the indicator.
    • To match the naming convention of tsquery metrics, multiword Impala query and YARN application attribute names have changed from camel case to using an underscore separator. For example queryType has changed to query_type. For backward compatibility, camel case names are still supported.
  • UI
    • The main navigation bar in the Cloudera Manager Admin Console has been reorganized. The Services tab has been replaced by a Clusters tab, which contains links to individual services (previously under the Services tab), the Activities and Reports sections (removed from the main bar), and a new Manage Resources section with links to the new resource pools and service pools features. The All Services page has been removed.
    • The "Safety Valve" properties have been renamed "Advanced Configuration Snippet".
    • The screen for specifying assignment of roles to hosts has been redesigned for improved scalability and usability.
  • Misc
    • The io.compression.codecs property has moved from MapReduce to HDFS.
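
The attribute renaming described in the Monitoring items above (camel case to underscore-separated names, such as queryType to query_type) can be reproduced with a short conversion. This Python sketch is illustrative and its function name is hypothetical; it covers simple camel case names like the example in the text, and how names containing consecutive capitals were actually renamed is not covered here.

```python
import re

def to_underscore(name):
    """Convert a camel case attribute name to lowercase with underscore separators."""
    # Insert an underscore before each capital letter that follows a lowercase letter or digit.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()
```

Names that already use underscores pass through unchanged, so the conversion is safe to apply to a mixed list of old and new attribute names.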

What's Changed in Cloudera Manager 5.0.0 Beta 1

  • When CDH 5 is installed, YARN is installed by default, rather than MapReduce, and is the default execution environment. MapReduce is deprecated in CDH 5 but is fully supported for backward compatibility through CDH 5. In CDH 4, MapReduce is still the default.
  • The setting for yarn.scheduler.maximum-allocation-mb has been increased to a default of 64GB.
  • The minimum heap size for the Solr service has been increased to 200MB (from 50MB previously) to enable it to better handle collection creation.