Kudu Replica Metrics

In addition to the base metrics listed in the table below, many aggregate metrics are available. If an entity type has parents defined, you can form every possible aggregate metric name with the pattern base_metric_across_parents, where parents is the plural form of the parent type.

In addition, metrics for aggregate totals can be formed by adding the prefix total_ to the front of the metric name.

Use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name, in case the plural form does not end in "s".

For example, the following metric names may be valid for Kudu Replica:

  • kudu_all_transactions_inflight_across_clusters
  • total_kudu_all_transactions_inflight_across_clusters
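
The naming pattern above can be sketched in a few lines of Python. This is only an illustration of the string pattern, not part of Cloudera Manager: the helper name aggregate_metric_names is hypothetical, and the plural form of the parent is passed in explicitly because it cannot always be derived by appending "s". Confirm the resulting names with the chart browser type-ahead before relying on them.

```python
from typing import Tuple


def aggregate_metric_names(base_metric: str, parent_plural: str) -> Tuple[str, str]:
    """Build the two aggregate names described above for one base metric.

    parent_plural is the plural form of the parent type (for example
    "clusters"). It is supplied by the caller, not guessed here.
    """
    # Pattern: base_metric_across_parents
    across = f"{base_metric}_across_{parent_plural}"
    # Aggregate totals add the total_ prefix to the front of the name.
    total = f"total_{across}"
    return across, total


across, total = aggregate_metric_names("kudu_all_transactions_inflight", "clusters")
print(across)
print(total)
```

With the arguments shown, the two printed names match the two examples in the list above.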

Some metrics, such as alerts_rate, apply to nearly every metric context. Others only apply to a certain service or role.

For more information about metrics, see Cloudera Manager Metrics and Metric Aggregation.

Metric Name Description Unit Parents CDH Version
kudu_all_transactions_inflight Number of transactions currently in-flight, including any type. transactions cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_alter_schema_transactions_inflight Number of alter schema transactions currently in-flight. transactions cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_bloom_lookups_per_op_rate Tracks the number of bloom filter lookups performed by each operation. A single operation may perform several bloom filter lookups if the tablet is not fully compacted. High frequency of high values may indicate that compaction is falling behind. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_bloom_lookups_per_op_sum_rate Tracks the number of bloom filter lookups performed by each operation. A single operation may perform several bloom filter lookups if the tablet is not fully compacted. High frequency of high values may indicate that compaction is falling behind. This is the total sum of recorded samples. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_bloom_lookups_rate Number of times a bloom filter was consulted. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_bytes_flushed_rate Amount of data that has been flushed to disk by this tablet. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_commit_wait_duration_rate Time spent waiting for COMMIT_WAIT external consistency writes for this tablet. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_commit_wait_duration_sum_rate Time spent waiting for COMMIT_WAIT external consistency writes for this tablet. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_compact_rs_duration_rate Time spent compacting RowSets. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_compact_rs_duration_sum_rate Time spent compacting RowSets. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_compact_rs_running Number of RowSet compactions currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_file_lookups_per_op_rate Tracks the number of delta file lookups performed by each operation. A single operation may perform several delta file lookups if the tablet is not fully compacted. High frequency of high values may indicate that compaction is falling behind. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_file_lookups_per_op_sum_rate Tracks the number of delta file lookups performed by each operation. A single operation may perform several delta file lookups if the tablet is not fully compacted. High frequency of high values may indicate that compaction is falling behind. This is the total sum of recorded samples. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_file_lookups_rate Number of times a delta file was consulted. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_major_compact_rs_duration_rate Seconds spent major delta compacting. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_major_compact_rs_duration_sum_rate Seconds spent major delta compacting. This is the total sum of recorded samples. seconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_major_compact_rs_running Number of delta major compactions currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_minor_compact_rs_duration_rate Time spent minor delta compacting. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_minor_compact_rs_duration_sum_rate Time spent minor delta compacting. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_delta_minor_compact_rs_running Number of delta minor compactions currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_dms_duration_rate Time spent flushing DeltaMemStores. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_dms_duration_sum_rate Time spent flushing DeltaMemStores. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_dms_running Number of delta memstore flushes currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_mrs_duration_rate Time spent flushing MemRowSets. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_mrs_duration_sum_rate Time spent flushing MemRowSets. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_flush_mrs_running Number of MemRowSet flushes currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_follower_memory_pressure_rejections_rate Number of RPC requests rejected due to memory pressure while FOLLOWER. requests per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_in_progress_ops Number of operations in the peer's queue ack'd by a minority of peers. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_insertions_failed_dup_key_rate Number of inserts which failed because the key already existed. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_key_file_lookups_per_op_rate Tracks the number of key file lookups performed by each operation. A single operation may perform several key file lookups if the tablet is not fully compacted and if bloom filters are not effectively culling lookups. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_key_file_lookups_per_op_sum_rate Tracks the number of key file lookups performed by each operation. A single operation may perform several key file lookups if the tablet is not fully compacted and if bloom filters are not effectively culling lookups. This is the total sum of recorded samples. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_key_file_lookups_rate Number of times a key cfile was consulted. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_leader_memory_pressure_rejections_rate Number of RPC requests rejected due to memory pressure while LEADER. requests per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_append_latency_rate Microseconds spent on appending to the log segment file. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_append_latency_sum_rate Microseconds spent on appending to the log segment file. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_bytes_logged_rate Number of bytes logged since service start. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_cache_num_ops Number of operations in the log cache. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_cache_size Amount of memory in use for caching the local log. bytes cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_entry_batches_per_group_rate Number of log entry batches in a group commit group. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_entry_batches_per_group_sum_rate Number of log entry batches in a group commit group. This is the total sum of recorded samples. requests per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_gc_duration_rate Time spent garbage collecting the logs. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_gc_duration_sum_rate Time spent garbage collecting the logs. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_gc_running Number of log GC operations currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_group_commit_latency_rate Microseconds spent on committing an entire group. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_group_commit_latency_sum_rate Microseconds spent on committing an entire group. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_reader_bytes_read_rate Data read from the WAL since tablet start. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_reader_entries_read_rate Number of entries read from the WAL since tablet start. entries per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_reader_read_batch_latency_rate Microseconds spent reading log entry batches. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_reader_read_batch_latency_sum_rate Microseconds spent reading log entry batches. This is the total sum of recorded samples. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_roll_latency_rate Microseconds spent on rolling over to a new log segment file. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_roll_latency_sum_rate Microseconds spent on rolling over to a new log segment file. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_sync_latency_rate Microseconds spent on synchronizing the log segment file. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_log_sync_latency_sum_rate Microseconds spent on synchronizing the log segment file. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_majority_done_ops Number of operations in the leader queue ack'd by a majority but not all peers. This metric is always zero for followers. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_memrowset_size Size of this tablet's memrowset. bytes cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_mrs_lookups_rate Number of times a MemRowSet was consulted. probes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_on_disk_data_size Space used by this tablet's data blocks. bytes cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_on_disk_size Size of this tablet on disk. bytes cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_queue_length_rate Number of operations waiting to be prepared within this tablet. High queue lengths indicate that the server is unable to process operations as fast as they are being written to the WAL. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_queue_length_sum_rate Number of operations waiting to be prepared within this tablet. High queue lengths indicate that the server is unable to process operations as fast as they are being written to the WAL. This is the total sum of recorded samples. tasks per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_queue_time_rate Time that operations spent waiting in the prepare queue before being processed. High queue times indicate that the server is unable to process operations as fast as they are being written to the WAL. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_queue_time_sum_rate Time that operations spent waiting in the prepare queue before being processed. High queue times indicate that the server is unable to process operations as fast as they are being written to the WAL. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_run_time_rate Time that operations spent being prepared in the tablet. High values may indicate that the server is under-provisioned or that operations are experiencing high contention with one another for locks. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_op_prepare_run_time_sum_rate Time that operations spent being prepared in the tablet. High values may indicate that the server is under-provisioned or that operations are experiencing high contention with one another for locks. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_ops_behind_leader Number of operations this server believes it is behind the leader. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_raft_term Current term of the Raft consensus algorithm. This number increments each time a leader election is started. units cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_rows_deleted_rate Number of row delete operations performed on this tablet since service start. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_rows_inserted_rate Number of rows inserted into this tablet since service start. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_rows_updated_rate Number of row update operations performed on this tablet since service start. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_rows_upserted_rate Number of rows upserted into this tablet since service start. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_bytes_returned_rate Number of bytes returned by scanners to clients. This count is measured after predicates are applied and the data is decoded for consumption by clients, and thus is not a reflection of the amount of work being done by scanners. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_bytes_scanned_from_disk_rate Number of bytes read by scan requests. This is measured as a raw count prior to application of predicates, deleted data, or MVCC-based filtering. Thus, this is a better measure of actual IO that has been caused by scan operations compared to the Scanner Bytes Returned metric. Note that this only counts data that has been flushed to disk, and does not include data read from in-memory stores. However, it includes both cache misses and cache hits. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_cells_returned_rate Number of table cells returned by scanners to clients. This count is measured after predicates are applied, and thus is not a reflection of the amount of work being done by scanners. cells per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_cells_scanned_from_disk_rate Number of table cells processed by scan requests. This is measured as a raw count prior to application of predicates, deleted data, or MVCC-based filtering. Thus, this is a better measure of actual table cells that have been processed by scan operations compared to the Scanner Cells Returned metric. Note that this only counts data that has been flushed to disk, and does not include data read from in-memory stores. However, it includes both cache misses and cache hits. cells per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_rows_returned_rate Number of rows returned by scanners to clients. This count is measured after predicates are applied, and thus is not a reflection of the amount of work being done by scanners. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scanner_rows_scanned_rate Number of rows processed by scan requests. This is measured as a raw count prior to application of predicates, deleted data, or MVCC-based filtering. Thus, this is a better measure of actual table rows that have been processed by scan operations compared to the Scanner Rows Returned metric. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_scans_started_rate Number of scanners which have been started on this tablet. scanners per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_snapshot_read_inflight_wait_duration_rate Time spent waiting for in-flight writes to complete for READ_AT_SNAPSHOT scans. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_snapshot_read_inflight_wait_duration_sum_rate Time spent waiting for in-flight writes to complete for READ_AT_SNAPSHOT scans. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_state State of this tablet. state cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_tablet_active_scanners Number of scanners that are currently active on this tablet. scanners cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_transaction_memory_pressure_rejections_rate Number of transactions rejected because the tablet's transaction memory limit was reached. transactions per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_estimated_retained_bytes Estimated bytes of deletable data in undo delta blocks for this tablet. May be an overestimate. bytes cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_bytes_deleted_rate Number of bytes deleted by garbage-collecting old UNDO delta blocks on this tablet since this server was restarted. Does not include bytes garbage collected during compactions. bytes per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_delete_duration_rate Time spent deleting ancient UNDO delta blocks. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_delete_duration_sum_rate Time spent deleting ancient UNDO delta blocks. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_init_duration_rate Time spent initializing ancient UNDO delta blocks. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_init_duration_sum_rate Time spent initializing ancient UNDO delta blocks. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_perform_duration_rate Time spent running the maintenance operation to GC ancient UNDO delta blocks. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_perform_duration_sum_rate Time spent running the maintenance operation to GC ancient UNDO delta blocks. This is the total sum of recorded samples. milliseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_undo_delta_block_gc_running Number of UNDO delta block GC operations currently running. operations cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_upserts_as_updates_rate Number of upserts which were applied as updates because the key already existed. rows per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_write_op_duration_client_propagated_consistency_rate Duration of writes to this tablet with external consistency set to CLIENT_PROPAGATED. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_write_op_duration_client_propagated_consistency_sum_rate Duration of writes to this tablet with external consistency set to CLIENT_PROPAGATED. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_write_op_duration_commit_wait_consistency_rate Duration of writes to this tablet with external consistency set to COMMIT_WAIT. This is the total number of recorded samples. samples per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_write_op_duration_commit_wait_consistency_sum_rate Duration of writes to this tablet with external consistency set to COMMIT_WAIT. This is the total sum of recorded samples. microseconds per second cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
kudu_write_transactions_inflight Number of write transactions currently in-flight. transactions cluster, kudu, kudu-kudu_tserver, kudu_table, kudu_tablet, rack CDH 5
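
Many metrics in the table come in pairs: a _rate metric described as the total number of recorded samples, and a _sum_rate metric described as the total sum of recorded samples. Assuming both are read over the same time window, dividing the sum rate by the sample rate gives an approximate mean value per sample. The following Python sketch shows that arithmetic only. The helper name mean_per_sample and the input values are hypothetical and do not come from a real cluster.

```python
from typing import Optional


def mean_per_sample(sum_rate: float, sample_rate: float) -> Optional[float]:
    """Approximate mean value per recorded sample over one time window.

    sum_rate:    value of a *_sum_rate metric (for example microseconds per second)
    sample_rate: value of the matching *_rate metric (samples per second)
    Returns None when no samples were recorded in the window.
    """
    if sample_rate <= 0:
        return None
    return sum_rate / sample_rate


# Hypothetical readings for kudu_commit_wait_duration_sum_rate and
# kudu_commit_wait_duration_rate taken over the same window.
mean_us = mean_per_sample(sum_rate=1500.0, sample_rate=3.0)
print(mean_us)  # 500.0 (microseconds per sample)
```

Returning None for an empty window avoids a division by zero and keeps "no samples" distinct from a true mean of zero.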