Configuring the HBase BlockCache

HBase provides both on-heap and off-heap BlockCache implementations.
  • On-heap: The default on-heap BlockCache implementation is LruBlockCache. You can optionally use BucketCache on-heap as well as off-heap.
  • Combined: If your operations use more data than will fit into the heap, you can use the BucketCache as a L2 cache for the on-heap LruBlockCache. This implementation is referred to as CombinedBlockCache.

Contents of the BlockCache

In order to configure the BlockCache, you need to understand its contents.
  • Your data: Each time a Get or Scan operation occurs, the result is added to the BlockCache if it was not already there.
  • Row keys: When a value is loaded into the cache, its row key is also cached. This is one reason to make your row keys as small as possible. A larger row key takes up more space in the cache.
  • hbase:meta: The hbase:meta catalog table keeps track of which RegionServer is serving which regions. It can consume several megabytes of cache if you have a large number of regions, and has in-memory access priority, which means HBase attempts to keep it in the cache as long as possible.
  • Indexes of HFiles: HBase stores its data in HDFS in a format called HFile. These HFiles contain indexes which allow HBase to seek for data within them without needing to open the entire HFile. The size of an index is a factor of the block size, the size of your row keys, and the amount of data you are storing. For big data sets, the size can exceed 1 GB per RegionServer, although the entire index is unlikely to be in the cache at the same time. If you use the BucketCache, indexes are always cached on-heap.
  • Bloom filters: If you use Bloom filters, they are stored in the BlockCache.

All of the sizes of these objects are highly dependent on your usage patterns and the characteristics of your data. For this reason, the HBase Web UI and Cloudera Manager each expose several metrics to help you size and tune the BlockCache.

Choosing a BlockCache Configuration

The HBase team has published the results of exhaustive BlockCache testing, which revealed the following guidelines.
  • If the data set fits completely in cache, the default configuration, which uses the on-heap LruBlockCache, is the best choice. If the eviction rate is low, garbage collection is 50% less that of the CombinedBlockCache, and throughput is at least 20% higher.
  • Otherwise, if your cache is experiencing a consistently high eviction rate, use CombinedBlockCache, which causes 30-50% of the garbage collection of LruBlockCache when the eviction rate is high.
  • CombinedBlockCache using file mode on solid-state disks has a better garbage-collection profile but lower throughput than CombinedBlockCache using off-heap mode.

Bypassing the BlockCache

If the data needed for an operation does not all fit in memory, using the BlockCache can be counter-productive, because data that you are still using may be evicted, or even if other data is evicted, excess garbage collection can adversely effect performance. For this type of operation, you may decide to bypass the BlockCache. To bypass the BlockCache for a given Scan or Get, use the setCacheBlocks(false) method.

In addition, you can prevent a specific column family's contents from being cached, by setting its BLOCKCACHE configuration to false. Use the following syntax in HBase Shell:
hbase> alter 'myTable', CONFIGURATION => {NAME => 'myCF', BLOCKCACHE => 'false'}

About the LruBlockCache

The LruBlockCache implementation resides entirely within the Java heap. Fetching from LruBluckCache will always be faster than BucketCache, because no disk seeks are required. However, LruBlockCache is more impacted by garbage-collection and performance can be less predictable over time than BucketCache.

LruBlockCache contains three levels of block priority to allow for scan-resistance and in-memory column families:

  • Single access priority: The first time a block is loaded from HDFS, that block is given single access priority, which means that it will be part of the first group to be considered during evictions. Scanned blocks are more likely to be evicted than blocks that are used more frequently.

  • Multi access priority: If a block in the single access priority group is accessed again, that block is assigned multi access priority, which moves it to the second group considered during evictions, and is therefore less likely to be evicted.

  • In-memory access priority: If the block belongs to a column family which is configured with the in-memory configuration option, its priority is changed to in memory access priority, regardless of its access pattern. This group is the last group considered during evictions, but is not guaranteed not to be evicted. Catalog tables are configured with in-memory access priority.

    To configure a column family for in-memory access, use the following syntax in HBase Shell:
    hbase> alter 'myTable', 'myCF', CONFIGURATION => {IN_MEMORY => 'true'}

    To use the Java API to configure a column family for in-memory access, use the HColumnDescriptor.setInMemory(true) method.

Configuring the LruBlockCache

When you use the LruBlockCache, the blocks needed to satisfy each read are cached, evicting older cached objects if the LruBlockCache is full. The size cached objects for a given read may be significantly larger than the actual result of the read. For instance, if HBase needs to scan through 20 HFile blocks to return a 100 byte result, and the HFile blocksize is 100 KB, the read will add 20 * 100 KB to the LruBlockCache.

Because the LruBlockCache resides entirely within the Java heap, the amount of which is available to HBase and what percentage of the heap is available to the LruBlockCache strongly impact performance. By default, the amount of HBase's heap reserved for the LruBlockCache (hfile.block.cache.size) is .25, or 25%. To determine the amount of memory available to HBase, use the following formula. The 0.99 factor allows 1% of heap to be available for evicting items from the cache.
number of RegionServers * heap size * hfile.block.cache.size * 0.99

To tune the size of the LruBlockCache, you can add RegionServers to increase it, or you can tune hfile.block.cache.size to reduce it. Reducing it will cause cache evictions to happen more often.

About the BucketCache

The BucketCache stores cached objects in different "buckets", based upon the sizes of the objects. By default, the buckets are all the same size, controlled by the configuration property hfile.bucketcache.size. However, you can configure multiple bucket sizes if your data fits into different, predictable size categories, using the hfile.bucketcache.sizes property instead, which takes a comma-separated list of sizes as its value. Evictions are managed independently per bucket.

The physical location of the BucketCache storage can be either in memory (off-heap) or in a file stored in a fast disk.
  • Off-heap: This is the default configuration, where the BucketCache is used as an L2 cache for the on-heap LruBlockCache.
  • File-based: You can use the file-based storage mode to store the BucketCache on an SSD or FusionIO device, giving the LruBlockCache a faster L2 cache to work with.

Configuring the BucketCache

This table summaries the important configuration properties for the BucketCache. To configure the BucketCache, see Configuring the BucketCache Using Cloudera Manager or Configuring the BucketCache Using the Command Line
BucketCache Configuration Properties
Property Default Description
hbase.bucketcache.combinedcache.enabled
true When BucketCache is enabled, use it as a L2 cache for LruBlockCache. If set to true, indexes and Bloom filters are kept in the LruBlockCache and the data blocks are kept in the BucketCache
hbase.bucketcache.ioengine
none (BucketCache is disabled by default) Where to store the contents of the BucketCache. One of: onheap, offheap, or file:/path/to/file.
hfile.block.cache.size
0.4 A float between 0.0 and 1.0. This factor multiplied by the Java heap size is the size of the L1 cache.
hfile.bucketcache.size
0 Either the size of the BucketCache (if expressed as an integer) or the percentage of the total heap to use for the BucketCache, if expressed as a float between 0.0 and 1.0. See hbase.bucketcache.percentage.in.combinedcache. A simplified configuration is planned for HBase 1.0.
hbase.bucketcache.percentage.in.combinedcache
none
In HBase 0.98, this property controls the percentage of the CombinedBlockCache which will be used by the BucketCache. The LruBlockCache L1 size is calculated to be
(1 -
      hbase.bucketcache.percentage.in.combinedcache) * size-of-bucketcache
and the BucketCache size is
hbase.bucketcache.percentage.in.combinedcache *
      size-of-bucket-cache
. where size-of-bucket-cache itself is either the value of the configuration hbase.bucketcache.size if it is specified in megabytes, or
hbase.bucketcache.size * -XX:MaxDirectMemorySize
if hbase.bucketcache.size is between 0 and 1.0.

A simplified configuration is planned for HBase 1.0.

hbase.bucketcache.bucket.sizes
4, 8, 16, 32, 40, 48, 56, 64, 96, 128, 192, 256, 384, 512 KB A comma-separated list of sizes for buckets for the BucketCache if you prefer to use multiple sizes. The sizes should be multiples of the default blocksize, ordered from smallest to largest. The sizes you use will depend on your data patterns. This parameter is experimental.
-XX:MaxDirectMemorySize
MaxDirectMemorySize = BucketCache + 1 A JVM option to configure the maximum amount of direct memory available to the JVM. You do not need to manually configure this parameter. It is automatically calculated and configured based on the following formula: MaxDirectMemorySize = BucketCache + 1

Configuring the BucketCache Using Cloudera Manager

  1. In the Cloudera Manager UI, go to Clusters > Services > HBase. Go to Configuration.
  2. Search for the setting Java Heap Size of HBase RegionServer in Bytes and set a value higher than the desired size of your BucketCache. For instance, if you want a BlockCache size of 4 GB, you might set the heap size to 5 GB. This accommodates a 4 GB BucketCache with 1 GB left over.
  3. Edit the parameter HBASE_OPTS in the HBase Service Advanced Configuration Snippet for hbase-env.sh and add the JVM option -XX:MaxDirectMemorySize=<size>G, replacing <size> with a value large enough to contain your heap and off-heap BucketCache, expressed as a number of gigabytes.
  4. Add the following settings to the HBase Service Advanced Configuration Snippet for hbase-site.xml, using values appropriate to your situation. See BucketCache Configuration Properties. This example configures the L1 cache to use 20% of the heap (5 GB x 20% is 1 GB), land configures the BucketCache to use 4 GB. Because hbase.bucketcache.combinedcache.enabled defaults to true, this configuration uses the CombinedBlockCache.
    <property>
      <name>hbase.bucketcache.ioengine</name>
      <value>offheap</value>
    </property>
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.2</value>
    </property>
    <property>
      <name>hbase.bucketcache.size</name>
      <value>4196</value>
    </property>
  5. Restart or rolling restart your RegionServers for the changes to take effect.

Configuring the BucketCache Using the Command Line

  1. First, configure the off-heap cache size for each RegionServer by editing its hbase-env.sh file and adding a line like the following:
    HBASE_OFFHEAPSIZE=5G

    This value sets the total heap size for a given RegionServer.

  2. Next, configure the properties in BucketCache Configuration Properties as appropriate. The following example configures the L1 cache to use 20% of the heap (5 GB x 20 % is 1 GB), and configures the BucketCache to use 4 GB. Because hbase.bucketcache.combinedcache.enabled defaults to true, this configuration uses the CombinedBlockCache.
    <property>
      <name>hbase.bucketcache.ioengine</name>
      <value>offheap</value>
    </property>
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.2</value>
    </property>
    <property>
      <name>hbase.bucketcache.size</name>
      <value>4196</value>
    </property>
  3. Restart each RegionServer for the changes to take effect.

Monitoring the BlockCache

Cloudera Manager provides metrics to monitor the performance of the BlockCache, to assist you in tuning your configuration. See HBase Metrics.

You can view further detail and graphs using the RegionServer UI. To access the RegionServer UI in Cloudera Manager, go to the Cloudera Manager page for the host, click the RegionServer process, and click HBase RegionServer Web UI.

If you do not use Cloudera Manager, access the BlockCache reports at http://regionServer_host:22102/rs-status#memoryStats, replacing regionServer_host with the hostname or IP address of your RegionServer.