This is the documentation for Cloudera Manager 5.0.2.
Documentation for other versions is available at Cloudera Documentation.

New Features and Changes in Cloudera Manager 5

The following sections describe what’s new and changed in each Cloudera Manager 5 release.

What's New in Cloudera Manager 5

What's New in Cloudera Manager 5.0.2

An issue has been fixed. See Fixed Issues in Cloudera Manager 5.0.2 for details.

What's New in Cloudera Manager 5.0.1

A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.1 for details.

What's New in Cloudera Manager 5.0.0

  • Service and Configuration Management
    • HDFS - cache management
  • Resource Management - Impala admission control
  • Monitoring
    • Host disks overview
    • Impala best practices
    • HBase table statistics
    • HDFS cache statistics

What's New in Cloudera Manager 5.0.0 Beta 2

  • Service and Configuration Management
    • HDFS
      • HDFS NFS Gateway role
      • Supports restoration of HDFS data from a snapshot
    • YARN
      • YARN Resource Manager High Availability
      • Resource pool scheduler
    • Support for Spark service
    • Support for Accumulo service
    • Support for service extensibility
    • Support to set up Oozie server High Availability
    • Granular configuration staleness UI
    • Support for setting maximum file descriptors
  • Monitoring
    • Support for monitoring the Cloudera Search/Solr service
    • New "failed" and "killed" badges displayed for unsuccessful YARN applications
    • More attributes available for filtering displays of YARN applications and Impala queries
    • New operational reports added for HBase tables and namespaces, Impala queries, and YARN applications
    • Support for creating user-defined triggers for metrics accessible via charts/tsquery
        Important: Because triggers are a new and evolving feature, backward compatibility between releases is not guaranteed at this time.
    • Charting improvements
      • New table chart type
      • New options for displaying data and metadata from charts
      • Support for exporting data from charts to CSV or JSON files
  • Administrative Settings
    • Added a new role type with limited administrator capabilities
    • Cloudera Manager Server and all JVMs will create a heap dump if they run out of memory
    • Configure the location of the parcel directory and specify whether and when to remove old parcels from cluster hosts

What's New in Cloudera Manager 5.0.0 Beta 1

  • CDH Version
    • Supports both CDH 4 and CDH 5
    • CDH 4 to CDH 5 upgrade wizard
    • Support for YARN as a production execution environment
      • MapReduce (MRv1) to YARN (MRv2) configuration import
      • YARN-based resource management for Impala 1.2
  • JDK Version - Cloudera Manager 5 supports and installs both JDK 6 and JDK 7.
  • Resource Management
    • Static and dynamic partitioning of resources: provides a wizard for configuring static partitioning of resources (cgroups) across core services (HBase, HDFS, MapReduce, Solr, YARN) and dynamic allocation of resources for YARN and Impala.
    • Pool, resource group, and queue administration for YARN and Impala.
    • Usage monitoring and trending
  • Monitoring
    • YARN service monitoring
    • YARN (MRv2) job monitoring
    • Configurable histograms of Impala query and YARN job attributes that can be used to quickly filter query and application lists
    • Scalable back-end database for monitoring metrics
    • Charting improvements
      • New chart types: histogram and heatmap
      • New scale types: logarithmic and power
      • Updates to tsquery language: new attribute values to support YARN and new functions to support new chart types
  • Extensibility
    • Ability to manage both ISV applications and non-CDH services (for example, Accumulo, Spark, and so on)
    • Working with select ISVs as part of Beta 1
  • Single Sign-On - Support for SAML to enable single sign-on
  • Parcels
    • Dependency enforcement to ensure incompatible parcels are not used together
    • Option to not cache downloaded parcels, to save disk space
    • Improved error reporting for management operations
  • Backup and Disaster Recovery (BDR)
    • HBase and HDFS snapshots: Supports scheduling snapshots on a recurring basis.
    • Support for YARN (MRv2): Replication jobs can now run using YARN (MRv2) instead of MRv1.
    • Global replication page: All scheduled snapshots (HDFS and HBase) and replication jobs for either HDFS or Hive are shown on a single Replications page
  • Other
    • Global Search box
    • Several usability improvements
    • Comprehensive detection of configuration changes that require service restarts, refresh and redeployment of client configurations.

Incompatible Changes in Cloudera Manager 5

The following sections describe incompatible changes in each Cloudera Manager 5 release.

Incompatible Changes Introduced in Cloudera Manager 5.0.0

  • Cloudera Manager API
    • New upgradeCdh command, which upgrades CDH cluster versions. Use this command to upgrade clusters from CDH 4 to CDH 5. The upgradeServices command previously used to upgrade CDH cluster versions is no longer supported.
    • The hostId field now contains a unique UUID and no longer matches the hostName field. When referring to a host, both hostId and hostName are accepted. However, any API clients that were previously cross-referencing host records with external information by hostName, but were using the hostId field in the API, must be updated to use the hostName field. Clients updated in this manner will function correctly with older versions of Cloudera Manager because the hostName field has always been present.
    • The clusterName field displayed when viewing service and role references is now an internal name and may not match the external displayName field of the cluster.
  • CDH 5 Hue requires Python 2.6 and above, effectively dropping support for Python 2.4 and 2.5. Hue will install without Python 2.6, but will not start.
  • Cloudera Manager 5.0 changes the value of the snmpTrapOID. Earlier releases incorrectly set the value of snmpTrapOID (OID: .1.3.6.1.6.3.1.1.4.1.0) to clouderaManagerMIBNotifications (OID .1.3.6.1.4.1.38374.1.1.1). Cloudera Manager 5.0 sets the correct value, which is clouderaManagerAlert (OID .1.3.6.1.4.1.38374.1.1.1.1). This change will break SNMP server setups that are configured to expect clouderaManagerMIBNotifications. Cloudera Manager administrators should configure their SNMP receivers to accept the corrected OID.
  • The default values for the following configurations have changed to include the JVM option -Djava.net.preferIPv4Stack=true, which sets the preferred protocol stack to IPv4 on dual-stack machines. Any values set to the old defaults will automatically be changed to the new default when upgrading to Cloudera Manager 5.
    • MapReduce client configuration:
      • hadoop-env.sh: added to HADOOP_CLIENT_OPTS
      • mapred-site.xml: added to mapred.child.java.opts
    • YARN client configuration:
      • hadoop-env.sh: added to YARN_OPTS
      • mapred-site.xml: added to yarn.app.mapreduce.am.command-opts, mapreduce.map.java.opts, and mapreduce.reduce.java.opts
    • HDFS client configuration: hadoop-env.sh: added to HADOOP_CLIENT_OPTS
    • Hive client configuration: hive-env.sh: added to HADOOP_CLIENT_OPTS
  • MapReduce health tests have been removed:
    • Job failure
    • Map backlog
    • Reduce backlog
    • Map locality
    If needed, a removed test can be replaced with a trigger. For example:
    • If more than 10% of the jobs that completed in the last hour failed, change the health of the service to concerning:
      IF (select (jobs_failed_rate * 3600) as jobs_failed, ((jobs_failed_rate + jobs_completed_rate + jobs_killed_rate) * 3600) as all_jobs where roleType=JOBTRACKER AND serviceName=$SERVICENAME and last(jobs_failed_rate / (jobs_failed_rate + jobs_completed_rate + jobs_killed_rate)) >= 10 ending at $END_TIME duration "PT3600S") DO health:concerning
    • If the number of waiting maps is more than 50% of the total map slots available, change the health of the service to concerning:
      IF (select waiting_maps / map_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME and last(waiting_maps / map_slots) > 50) DO health:concerning
    • If the number of waiting reduces is more than 50% of the total reduce slots available, change the health of the service to concerning:
      IF (select waiting_reduces / reduce_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME and last(waiting_reduces / reduce_slots) > 50) DO health:concerning
  • HDFS checkpointing metrics have been removed:
    • end_checkpoint_num_ops
    • end_checkpoint_avg_time
    • start_checkpoint_num_ops
    • start_checkpoint_avg_time
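
For the hostId change described under Cloudera Manager API above, an API client that joins host records with external information can key its lookups on hostName rather than hostId. The following is a minimal Python sketch of that idea, not part of Cloudera Manager itself. Only the hostId and hostName field names come from these notes; the sample records, UUID values, host names, and function names are hypothetical.

```python
def index_hosts_by_name(hosts):
    """Return a dict keyed by hostName so external data can be joined on it."""
    index = {}
    for host in hosts:
        index[host["hostName"]] = host
    return index


def lookup_host(index, external_name):
    """Look up a host record by host name; returns None when it is unknown."""
    return index.get(external_name)


# Hypothetical host records: hostId is now an opaque UUID and
# no longer matches hostName.
sample_hosts = [
    {"hostId": "6f1c2a9e-0000-4000-8000-000000000001", "hostName": "node1.example.com"},
    {"hostId": "6f1c2a9e-0000-4000-8000-000000000002", "hostName": "node2.example.com"},
]
host_index = index_hosts_by_name(sample_hosts)
```

Because the hostName field has always been present, a lookup written this way should behave the same against older versions of Cloudera Manager.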

Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 2

  • Impala releases earlier than 1.2.1 are no longer supported.
  • Some of the constants identifying health tests have changed. The following existed in Cloudera Manager 4:
    • FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • FAILOVERCONTROLLER_HOST_HEALTH
    • FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • FAILOVERCONTROLLER_SCM_HEALTH
    • FAILOVERCONTROLLER_UNEXPECTED_EXITS

    They are now:

    • MAPREDUCE_FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • MAPREDUCE_FAILOVERCONTROLLER_HOST_HEALTH
    • MAPREDUCE_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • MAPREDUCE_FAILOVERCONTROLLER_SCM_HEALTH
    • MAPREDUCE_FAILOVERCONTROLLER_UNEXPECTED_EXITS

    and

    • HDFS_FAILOVERCONTROLLER_FILE_DESCRIPTOR
    • HDFS_FAILOVERCONTROLLER_HOST_HEALTH
    • HDFS_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
    • HDFS_FAILOVERCONTROLLER_SCM_HEALTH
    • HDFS_FAILOVERCONTROLLER_UNEXPECTED_EXITS

    The reason for the change is to better distinguish between MapReduce and HDFS failover controller monitoring in the health system.
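
Scripts that reference the old health test constants need to decide which of the two new names applies. As a rough illustration, the Python sketch below lists both candidate replacements for each Cloudera Manager 4 constant named above. The helper name is hypothetical, and choosing between the MapReduce and HDFS variant still depends on which failover controller the script monitors.

```python
# Suffixes of the five health test constants listed in these notes.
SUFFIXES = [
    "FILE_DESCRIPTOR",
    "HOST_HEALTH",
    "LOG_DIRECTORY_FREE_SPACE",
    "SCM_HEALTH",
    "UNEXPECTED_EXITS",
]
OLD_CONSTANTS = {"FAILOVERCONTROLLER_" + suffix for suffix in SUFFIXES}


def renamed_constants(old_name):
    """Return the MapReduce and HDFS replacements for an old constant.

    Returns an empty list for names that are not among the five old constants.
    """
    if old_name not in OLD_CONSTANTS:
        return []
    return ["MAPREDUCE_" + old_name, "HDFS_" + old_name]
```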

Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 1

  • Services
    • Impala - With Cloudera Manager 4.8 (released in late November 2013), only Impala 1.2.1 is supported, due to the introduction of the Impala Catalog Server. However, CDH 5.0.0 Beta 1 was released with Impala 1.2.0 (Beta). Therefore, if you upgrade from Cloudera Manager 4.8 (with Impala 1.2.1) to Cloudera Manager 5.0.0 Beta 1, and then upgrade your CDH to CDH 5.0.0 Beta 1, your version of Impala will be downgraded to Impala 1.2.0 from 1.2.1. This will result in some loss of functionality. See New Features in Impala for a list of the new features in Impala 1.2.1 that are not in Impala 1.2.0 (Beta).
    • Hive - HiveServer2 is a mandatory role for Hive in CDH 5.
    • Hue - In CDH 5, Hue no longer has a Beeswax Server role. Hue now submits queries to HiveServer2.
    • HDFS - Cloudera Manager 5 does not support NFS-mounted shared edits directories for HDFS High Availability. It only supports the Quorum Journal method for shared edits. If you upgrade from Cloudera Manager 4 with a working CDH 4 High Availability configuration that uses NFS-mounted directories, your installation will continue to work until you disable High Availability. You will not be able to re-enable High Availability with NFS-mounted directories. Furthermore, you will not be able to upgrade to CDH 5 unless you disable High Availability, and you will need to use Quorum-based storage in order to re-enable High Availability after the upgrade.
    • YARN
      • The YARN (MRv2) configuration mapreduce.job.userlog.retain.hours has been replaced by yarn.log-aggregation.retain-seconds. Any existing value in mapreduce.job.userlog.retain.hours will be lost. However, this configuration never had any effect, so no functionality is affected.
      • The following configuration parameters were removed from YARN. These never had any effect, so no functionality is affected.
        • mapreduce.jobtracker.maxtasks.perjob
        • mapreduce.jobtracker.handler.count (non-functional duplicate of yarn.resourcemanager.resource-tracker.client.thread-count)
        • mapreduce.jobtracker.persist.jobstatus.active
        • mapreduce.jobtracker.persist.jobstatus.hours
        • mapreduce.job.jvm.numtasks
      • The following YARN configuration parameters were replaced. Only the YARN parameters were replaced. Old configurations will be lost, but they never had any effect so this does not affect functionality.
        • mapreduce.jobtracker.restart.recover replaced by yarn.resourcemanager.recovery.enabled (changed from Gateway to ResourceManager)
        • mapreduce.tasktracker.http.threads replaced by mapreduce.shuffle.max.connections
        • mapreduce.jobtracker.staging.root.dir replaced by yarn.app.mapreduce.am.staging-dir
      • Cloudera Manager 5 sets the default YARN Resource Scheduler to FairScheduler. If a cluster was previously running YARN with the FIFO scheduler, it will be changed to FairScheduler the next time YARN restarts. The FairScheduler is only supported with CDH 4.2.1 and later, so older clusters may encounter failures and require a manual change of the scheduler to FIFO or CapacityScheduler. See the Known Issues section of this Release Note for information on how to change the scheduler back to FIFO or CapacityScheduler.
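
Before upgrading, it can help to check which of the removed or replaced YARN parameters listed above appear in an existing configuration. The Python sketch below is one way to do that, assuming the property names have already been collected into a list. The function name and report layout are hypothetical. It only reports names and does not carry values over, because these notes state that old values are lost.

```python
# Old parameter -> replacement, as listed in these notes.
REPLACED = {
    "mapreduce.job.userlog.retain.hours": "yarn.log-aggregation.retain-seconds",
    "mapreduce.jobtracker.restart.recover": "yarn.resourcemanager.recovery.enabled",
    "mapreduce.tasktracker.http.threads": "mapreduce.shuffle.max.connections",
    "mapreduce.jobtracker.staging.root.dir": "yarn.app.mapreduce.am.staging-dir",
}

# Parameters removed from YARN without a replacement.
REMOVED = {
    "mapreduce.jobtracker.maxtasks.perjob",
    "mapreduce.jobtracker.handler.count",
    "mapreduce.jobtracker.persist.jobstatus.active",
    "mapreduce.jobtracker.persist.jobstatus.hours",
    "mapreduce.job.jvm.numtasks",
}


def audit_yarn_properties(names):
    """Classify property names as replaced, removed, or unaffected."""
    report = {"replaced": {}, "removed": [], "unaffected": []}
    for name in names:
        if name in REPLACED:
            report["replaced"][name] = REPLACED[name]
        elif name in REMOVED:
            report["removed"].append(name)
        else:
            report["unaffected"].append(name)
    return report
```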

Changed Features and Behaviors in Cloudera Manager 5

The following sections describe what’s changed in each Cloudera Manager 5 release.

  Note: Rolling upgrade is not supported between CDH 4 and CDH 5. Rolling upgrade will also not be supported from CDH 5.0.0 Beta 2 to any later releases, and may not be supported between any future beta versions of CDH 5 and the General Availability release of CDH 5.

What's Changed in Cloudera Manager 5.0.0

  • MapReduce now inherits topology from HDFS NameNode. Topology configuration for MapReduce JobTracker was removed. The configuration was redundant and the two parameters should always have been set to the same value.
  • UI
    • The Clusters tab no longer has Activities, Other, and Manage Resources sections.

What's Changed in Cloudera Manager 5.0.0 Beta 2

  • Product
    • Cloudera Backup and Disaster Recovery (BDR) is now included with Cloudera Enterprise.
    • Cloudera Standard has been renamed to Cloudera Express.
  • OS and packaging
    • The name of the Cloudera Manager embedded database package has changed from cloudera-manager-server-db to cloudera-manager-server-db-2. For details, read the upgrade and install topics for your OS.
    • Support for Ubuntu 10.04 and Debian 6.0 is deprecated.
  • HDFS - Enabling High Availability automatically enables auto-failover, unlike in Cloudera Manager 4, where enabling auto-failover was a separate command.
  • HBase
    • In CDH 5 there is no HBase canary because HBase is now monitored by a watchdog process. In CDH 4, the HBase canary is still used.
    • The RegionServer default heap size has been increased to 4GB.
  • Monitoring
    • Chart "Views" and actions related to views have been renamed to "Dashboard".
    • Changes to how attribute filters are displayed in the Impala queries and YARN applications screens
    • The outdated configuration indicator on the Home, service, and role pages has a new graphic and now has a tooltip that displays whether a cluster refresh or restart is required. There is a new indicator for changes that require redeploying client configurations. You can click an indicator to go to the new Stale Configurations page to view and resolve the conditions that gave rise to the indicator.
    • To match the naming convention of tsquery metrics, multiword Impala query and YARN application attribute names have changed from camel case to using an underscore separator. For example, queryType has changed to query_type. For backward compatibility, camel case names are still supported.
  • UI
    • The main navigation bar in the Cloudera Manager Admin Console has been reorganized. The Services tab has been replaced by a Clusters tab, which contains links to the individual services (previously under the Services tab), the Activities and Reports sections (which were removed from the main bar), and a new Manage Resources section with links to the new resource pools and service pools features. The All Services page has been removed.
    • The "Safety Valve" properties have been renamed "Advanced Configuration Snippet".
    • The screen for specifying assignment of roles to hosts has been redesigned for improved scalability and usability
  • Misc
    • The io.compression.codecs property has moved from MapReduce to HDFS
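
Saved filters that use the old camel case attribute names, such as queryType, can be rewritten to the underscore form mentioned under Monitoring above. The Python sketch below shows one possible conversion rule. The regular expression is an assumption for illustration, and the exact new name of each attribute should be confirmed in Cloudera Manager, since only queryType and query_type are given in these notes.

```python
import re


def to_underscore(name):
    """Convert a camel case attribute name to underscore-separated lowercase."""
    # Insert an underscore before each capital letter that follows a
    # lowercase letter or digit, and lowercase that capital letter.
    return re.sub(
        r"(?<=[a-z0-9])([A-Z])",
        lambda match: "_" + match.group(1).lower(),
        name,
    )
```

Names that are already in underscore form contain no capital letters, so this function leaves them unchanged.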

What's Changed in Cloudera Manager 5.0.0 Beta 1

  • When CDH 5 is installed, YARN is installed by default, rather than MapReduce, and is the default execution environment. MapReduce is deprecated in CDH 5 but is fully supported for backward compatibility through CDH 5. In CDH 4, MapReduce is still the default.
  • The setting for yarn.scheduler.maximum-allocation-mb has been increased to a default of 64GB.
  • The minimum heap size for the Solr service has been increased to 200MB (from 50MB previously) to enable it to better handle collection creation.