Creating a Custom Cluster Utilization Report

Cloudera Manager provides a Cluster Utilization Report that displays aggregated utilization information for YARN and Impala jobs. If you wish to export the data from this report, you can build custom reports based on the same metrics data using the Cloudera Manager Admin console or the Cloudera Manager API. This topic describes the metrics and queries you can use to build these custom reports. These reports all use the tsquery Language to chart time-series data.

Metrics and Queries

For more information about the Data Granularity described in these metrics, see Metric Aggregation.

Many of the metrics described below use a data granularity of hourly. This is not required, but is recommended because some of the YARN utilization metrics are only available hourly and using the hourly granularity allows for consistent reporting.

Cluster-Level CPU and Memory Metrics

Total cluster CPU usage

Data Granularity: hourly

Units: percentage

tsquery:
SELECT
   cpu_percent_across_hosts
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Total CPU Cores in the cluster

Data Granularity: hourly

Units: CPU cores

tsquery:
SELECT
   total_cores_across_hosts
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Total cluster memory usage

Data Granularity: hourly

Units: percentage

tsquery:
SELECT
   100 * total_physical_memory_used_across_hosts/total_physical_memory_total_across_hosts
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Total cluster memory usage

Time series of total cluster memory usage.

Data Granularity:hourly

Units: Byte seconds

tsquery:
SELECT
   total_physical_memory_total_across_hosts
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
CPU used by Impala

Time series of total Impala CPU usage in milliseconds.

Data Granularity: hourly

Units: milliseconds

tsquery:
SELECT
   counter_delta(impala_query_thread_cpu_time_rate)
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Memory used by Impala

Time series of Impala memory usage

Data Granularity: hourly

Units: byte seconds

tsquery:
SELECT
   counter_delta(impala_query_memory_accrual_rate)
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
CPU used by YARN

The yarn_reports_containers_used_cpu metric used in this tsquery is generated per hour, therefore the data granularity used for this query is the raw metric value.

Data Granularity: Raw

Units: percent seconds

tsquery:
SELECT
   yarn_reports_containers_used_cpu FROM REPORTS
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name
Memory used by YARN

Yarn memory usage. The yarn_reports_containers_used_memory metric used in this tsquery is generated per hour, therefore the data granularity used for this query is the raw metric value.

Data Granularity: raw metric value

Units: megabyte seconds

tsquery:
SELECT
   yarn_reports_containers_used_memory
FROM
   REPORTS
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name

Pool-Level CPU and Memory Metrics

CPU used by Impala pool

CPU usage for an Impala pool.

Data Granularity: hourly

Units: milliseconds

tsquery:
SELECT
   counter_delta(impala_query_thread_cpu_time_rate)
WHERE
   category=IMPALA_POOL
   AND poolName=Pool_Name
Memory used by Impala pool

Data Granularity: hourly

Units: byte seconds

tsquery:
SELECT
   counter_delta(impala_query_memory_accrual_rate)
WHERE
   category=IMPALA_POOL
   AND poolName=Pool_Name
CPU used by YARN pool

Provides CPU metrics per YARN pool and user. You can aggregate a pool-level metric from this query.

Data Granularity: Raw

Units: percent seconds

tsquery:
SELECT
   yarn_reports_containers_used_cpu FROM REPORTS
WHERE
   category=YARN_POOL_USER
Memory used by YARN pool

Provides memory metrics per YARN pool and user. You can aggregate a pool-level metric from this query.

Data Granularity: hourly

Units: megabyte seconds

tsquery:
SELECT
   yarn_reports_containers_used_memory
FROM
   REPORTS
WHERE
   category=YARN_POOL_USER

YARN Metrics

YARN VCore usage

Data Granularity: Raw

Units: VCore seconds

tsquery:
SELECT
   yarn_reports_containers_used_vcores
FROM
   REPORTS
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name
Total VCores available to YARN

Data Granularity: hourly

Units: Number of VCores (Note that this value is not multiplied by the time unit.)

tsquery:
SELECT
   total_allocated_vcores_across_yarn_pools + total_available_vcores_across_yarn_pools
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name
YARN Memory usage

Data Granularity: Raw

Units: MB seconds

tsquery:
SELECT
   yarn_reports_containers_used_memory FROM REPORTS
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name
Total memory available to YARN

Data Granularity: hourly

Units: MB (Note that this value is not multiplied by the time unit.)

tsquery:
SELECT
   total_available_memory_mb_across_yarn_pools + total_allocated_memory_mb_across_yarn_pools
WHERE
   category=SERVICE
   AND clusterName=Cluster_Name
Pool-level VCore usage

The results of this query return the usage for each user in each pool. To see the total usage for a pool, sum all users of the pool.

Data Granularity: Raw

Units: VCore seconds

tsquery:
SELECT
   yarn_reports_containers_used_vcores FROM REPORTS
WHERE
   category=YARN_POOL_USER

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level memory usage

The results of this query return the usage for each user in each pool. To see the total usage for a pool, sum all users of the pool.

Data Granularity: Raw

Units: MB seconds

tsquery:
SELECT
   yarn_reports_containers_used_memory FROM REPORTS
WHERE
   category=YARN_POOL_USER

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level allocated VCores

The results of this query return the usage for each user in each pool. To see the total usage for a pool, sum all users of the pool.

Data Granularity: raw metric value

Units: VCore seconds

tsquery:
SELECT
   yarn_reports_containers_allocated_vcores FROM REPORTS
WHERE
   category=YARN_POOL_USER

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level allocated memory

The results of this query return the usage for each user in each pool. To see the total usage for a pool, sum all users of the pool.

Data Granularity: raw metric value

Units: megabyte seconds

tsquery:
SELECT
   yarn_reports_containers_allocated_memory
FROM
   REPORTS
WHERE
   category=YARN_POOL_USER

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level steady fair share VCore

Data Granularity: hourly

Units: VCores

tsquery:
SELECT
   steady_fair_share_vcores
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level fair share VCore

Data Granularity: hourly

Units: VCores

tsquery:
SELECT
   fair_share_vcores
WHERE
   category=YARN_POOL
Pool-level steady fair share memory

Data Granularity: hourly

Units: MB

tsquery:
SELECT
   steady_fair_share_mb
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool-level fair share memory

Data Granularity: hourly

Units: MB

tsquery:
SELECT
   fair_share_mb
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Metric indicating contention

Data Granularity: hourly

Units: percentage

tsquery:
SELECT
   container_wait_ratio
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

YARN Contention-Related Metrics

Use the following metrics to monitor resource contention.

Pool-level allocated VCores when contention occurs

Data Granularity: hourly

Units: VCores

tsquery:
SELECT
   allocated_vcores_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool level steady fair share VCores when contention occurs

Data Granularity: hourly

Units: VCores

tsquery:
SELECT
   steady_fair_share_vcores_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool level fair share VCores when contention occurs

Data Granularity: hourly

Units: VCores

tsquery:
SELECT
   fair_share_vcores_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool level allocated memory when contention occurs

Data Granularity: hourly

Units: MB

tsquery:
SELECT
   allocated_memory_mb_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool level steady fair share memory when contention occurs

Data Granularity: hourly

Units: MB

tsquery:
SELECT
   steady_fair_share_mb_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Pool level fair share memory when contention occurs

Data Granularity: hourly

Units: MB

tsquery:
SELECT
   fair_share_mb_with_pending_containers
WHERE
   category=YARN_POOL

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Impala-Specific Metrics

To view metrics for a specific pool, add poolName=Pool Name to the tsquery statement.

Total reserved memory

Data Granularity: hourly

Units: MB seconds

tsquery:
SELECT
   total_impala_admission_controller_local_backend_mem_reserved_across_impala_daemon_pools
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Total used memory

Data Granularity: hourly

Units: MB seconds

tsquery:
SELECT
   total_impala_admission_controller_local_backend_mem_usage_across_impala_daemon_pools
 WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name
Total available memory

Data Granularity: hourly

Units: MB seconds

tsquery:
SELECT
   total_mem_tracker_process_limit_across_impalads
WHERE
   category=CLUSTER
   AND clusterName=Cluster_Name

Impala Query Counter Metrics

Include the following in the SELECT statement of the tsquery to get information about the rate of Impala queries:
  • counter_delta(queries_ingested_rate)
  • counter_delta(queries_successful_rate)
  • counter_delta(queries_rejected_rate)
  • counter_delta(queries_oom_rate)
  • counter_delta(queries_timed_out_rate)
  • counter_delta(impala_query_admission_wait_rate)
  • counter_delta(impala_query_memory_spilled_rate)

For example:

SELECT
   counter_delta(queries_ingested_rate)
 WHERE
   category=IMPALA_POOL
   AND clusterName=Cluster_Name
   AND serviceName=Service_Name

Calculations for reports

All the metrics listed in this topic return a time series of metric values. Depending on the collection frequency of the metric itself and the data granularity you use when issuing tsquery statements, the results return metric values in different frequencies and therefore there are different ways to handle the metric values.

Note the following about how to correctly perform calculations using metric values:
  • YARN container metrics are generated once per hour resulting in one raw metric value every hour. Therefore, the most detailed results possible for YARN CPU and memory usage are hourly reports.

  • Hourly aggregates are summarized from raw metric values. These aggregates include a set of statistics that include the sum, maximum, minimum, count and other statistics that summarize the raw metric values. When you use the hourly granularity, you lose the single values of the raw metric values. However, you can still get peak usage data for such metrics. For more information, see Metric Aggregation.

  • For some of the YARN metrics described in this topic, the tsquery statement aggregates from the pool and user level to pool level in the Cloudera Manager Cluster Utilization reports. For these queries, because the maximum and minimum for different pool and user combinations are not likely to happen at the same time, there is no way to get the peak usage across pool and user combinations, or at the pool level. The only meaningful results possible are average and sum.

  • When calculating CPU/Memory usage percentage, pay attention to the units for each metric. For example, if the cluster consistently has 8 VCores, the total VCore seconds for each hour would be 8 * 3600 VCore seconds. You can then use this adjusted number to compare with the VCore seconds used by YARN or YARN pools.

Retrieving metric data using the Cloudera Manager API

There is a Time series endpoint exposed by the Cloudera Manager REST API. See Cloudera Manager API documentation timeseries Endpoint. The API accepts tsquery statements as input for which metrics need to be retrieved during the specified time window. The API provides functionality to specify the desired data granularity (for example, raw metric values, TEN_MINUTES, HOURLY etc.). Each granularity level of data is maintained in a leveldb table (see Data Granularity and Time-Series Metric Data). This data is aggregated from raw metric values such as minimum, maximum, etc. within the corresponding data window.

For example, if you do not need the metric data at a specific timestamp but care more about the hourly usage, HOURLY data should be good enough. In general, the longer the granular window it is, the less storage it is taking, and thus the longer period of time you are able to keep that level of data without being purged when the storage hits the configured limit. In the case of Cloudera Manager Cluster Utilization Reports, Cloudera Manager generates the reports based on an hourly window.

To view the Cloudera Manager Service Monitor data storage granularities, go to Clusters > Cloudera Management Service > Service Monitor > Charts Library > Service Monitor Storage and scroll down to see the Data Duration Covered table to see the earliest available data points for each level of granularity. The value in the last(duration_covered) column indicates the age of the oldest data in the table.


To configure the Time series storage used by the Service Monitor, go to Clusters > Cloudera Management Service > Configuration > Charts Library > Service Monitor Storage and search for "Time-Series Storage".

Querying metric data using the Cloudera Manager Admin Console

To build charts that query time series data using the Cloudera Manager Admin console, go to Charts > Chart Builder. When building charts, it may be useful to choose the data granularity by clicking the Show additional options link on the chart builder page and then selecting the Data Granularity drop-down list. See Charting Time-Series Data.

Selecting data granularity in chart builder: