Overview

This 3-day course covers Apache HBase, a distributed, scalable, NoSQL database designed for real-time read/write access to large datasets. Built on top of HDFS, it brings low-latency random access to Hadoop-scale data. The course includes HBase architecture, data modeling, read/write internals, deployment, high availability, tuning, security, troubleshooting, and advanced topics like Phoenix, HBCK2, and YCSB benchmarking.

What Skills Will You Gain?

  • Understanding HBase architecture and its role in the Cloudera Operational Database
  • Deploying and configuring HBase clusters for high availability
  • Designing effective HBase schemas for scalable, real-time workloads
  • Analyzing and optimizing HBase write and read paths
  • Tuning HBase performance through memory management, caching, and compaction
  • Securing HBase with authorization policies in Ranger
  • Monitoring and troubleshooting clusters using HBCK2 and diagnostic tools
  • Performing data backup, recovery, and cluster migration
  • Querying HBase tables using Apache Phoenix and its advanced features
  • Benchmarking performance with the YCSB tool
  • Managing medium-sized objects (MOBs) efficiently

Who Should Take This Course?

This course is designed for administrators and data engineers who manage or support Apache HBase deployments in production environments. It is also valuable for DevOps professionals involved in performance tuning, monitoring, and troubleshooting databases. Prior experience with HDFS and ZooKeeper is recommended. Students must have Internet access to connect to the hands-on lab environments.  

Course Details

Configuring HBase for High Availability

HBase Schema Design

  • General Design Considerations
  • Application-Centric Design
  • Designing HBase Row Keys
  • Case: Row Key Design
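
As a rough illustration of the kind of trade-off row key design involves, the sketch below prefixes a natural key with a hash-derived salt bucket so that sequential keys (such as timestamps) spread across regions instead of hitting one. It is not taken from the course material; the bucket count, the separator, and the `salted_row_key` helper are all assumptions made for the example.

```python
import hashlib

# Hypothetical number of salt buckets; in practice this would be chosen
# to match the expected number of regions or RegionServers.
NUM_BUCKETS = 16


def salted_row_key(natural_key: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Return the natural key prefixed with a deterministic salt bucket.

    The bucket is derived from a hash of the key, so the same natural key
    always maps to the same salted key and point reads remain possible.
    """
    key_bytes = natural_key.encode("utf-8")
    digest = hashlib.md5(key_bytes).digest()
    bucket = digest[0] % num_buckets
    # Zero-padded two-digit prefix keeps all keys in a bucket contiguous.
    return f"{bucket:02d}|".encode("utf-8") + key_bytes


if __name__ == "__main__":
    for ts in ("20260101T000000", "20260101T000001", "20260101T000002"):
        print(salted_row_key(ts))
```

The cost of this approach is that a range scan over the natural key has to be issued once per bucket and the results merged, which is why salting tends to suit write-heavy workloads more than scan-heavy ones.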

HBase Performance Tuning

  • Performance Evaluation using 'pe'
  • Optimization through Parameters
  • Garbage Collection
  • Case: Parameter Optimization
  • Best Practices

HBase Migration, Backup, and Recovery

  • Full Migration
  • Incremental Migration
  • Best Practices

Resource Management

  • Managing Roles and Templates
  • Adding Workers
  • Decommissioning and Removing Workers

YARN Queues and Jobs

  • Installing and Configuring YARN Queues
  • Running and Managing Jobs

Managing Services

  • Identifying and Installing Parcels
  • Adding and Removing Cloudera Services

Configuration Management

  • Configuration Changes to Properties
  • Configuring Role Groups

HBase Monitoring & Troubleshooting

  • Monitoring HBase
  • Testing Network Bandwidth
  • Cloudera Manager Charts
  • Troubleshooting RIT Issues
  • Best Practices

Operational Database

HBase Essentials

  • Overview
  • HBase Table Fundamentals
  • HBase Shell
  • HBase Data Access
  • Column Family Design and Considerations
  • Filtering Scans
  • Best Practices

HBase Write & Read Path

  • HBase Write Path
  • HBase Read Path
  • Deploying and Accessing an HBase Cluster
  • HBase Cluster in Cloudera on Premises
  • HBase Cluster in Cloudera on Cloud

Phoenix Overview

  • Phoenix Overview
  • Command Line Client
  • Metadata SQL Commands and Line Client
  • Creating a Table
  • Table Architecture
  • Modifying and Deleting Rows
  • Reading Data
  • Transactions
  • BulkLoad
  • Views
  • Mapping a Phoenix Table to an Existing HBase Table
  • Secondary Index
  • Salted Table
  • Phoenix Optimization

Introduction to the HBase HBCK2 Repair Tool

Introduction to BucketCache

Benchmarking HBase Using YCSB Tool

Storing Medium Objects (MOBs)
