Limiting the Speed of Compactions

You can limit the speed at which HBase compactions run by configuring hbase.regionserver.throughput.controller and its related settings. The default controller is org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController, which uses the following algorithm:
  1. If compaction pressure is greater than 1.0, there is no speed limitation.
  2. In off-peak hours, use a fixed throughput limitation, configured using hbase.hstore.compaction.throughput.offpeak, hbase.offpeak.start.hour, and hbase.offpeak.end.hour.
  3. In normal hours, the maximum throughput is tuned between hbase.hstore.compaction.throughput.lower.bound and hbase.hstore.compaction.throughput.higher.bound (which default to 10 MB/sec and 20 MB/sec respectively), using the following formula, where lower and higher are those two bounds and compactionPressure is between 0.0 and 1.0. The compactionPressure reflects the number of store files that require compaction.
    lower + (higher - lower) * compactionPressure
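
The three steps above can be sketched as a small Python function. This is an illustration only, not the HBase implementation: the function and parameter names are hypothetical, the throughput values are assumed to be in bytes per second, and the off-peak window is assumed to include the start hour, exclude the end hour, and wrap past midnight when the start hour is later than the end hour.

```python
import math

MB = 1024 * 1024  # assumed: 1 MB = 1024 * 1024 bytes, matching 20971520 = 20 MB


def is_off_peak(hour, start_hour, end_hour):
    """Return True if `hour` (0-23) falls inside the off-peak window.

    Assumption: the window includes start_hour and excludes end_hour,
    and it wraps past midnight when start_hour > end_hour.
    """
    if start_hour is None or end_hour is None:
        return False  # no off-peak window configured
    if start_hour <= end_hour:
        return start_hour <= hour < end_hour
    return hour >= start_hour or hour < end_hour


def max_throughput(pressure, hour, lower=10 * MB, higher=20 * MB,
                   offpeak_limit=2**63 - 1, start_hour=None, end_hour=None):
    """Illustrative throughput limit in bytes per second."""
    # Step 1: pressure above 1.0 means no speed limitation.
    if pressure > 1.0:
        return math.inf
    # Step 2: off-peak hours use the fixed off-peak limit.
    if is_off_peak(hour, start_hour, end_hour):
        return offpeak_limit
    # Step 3: normal hours interpolate between the two bounds.
    return lower + (higher - lower) * pressure


# With the default bounds, a pressure of 0.5 at noon gives 15 MB/sec.
print(max_throughput(0.5, hour=12) / MB)
```

With the default bounds, a pressure of 0.0 yields the lower bound and a pressure of 1.0 yields the higher bound, so the limit loosens as more store files wait for compaction.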

To disable compaction speed limits, set hbase.regionserver.throughput.controller to org.apache.hadoop.hbase.regionserver.compactions.NoLimitCompactionThroughputController.

Configure the Compaction Speed Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select HBase or HBase Service-Wide.
  4. Search for HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml. Paste the relevant properties into the field and modify the values as needed. See Configure the Compaction Speed Using the Command Line for an explanation of the properties.
  5. Click Save Changes to commit the changes.
  6. Restart the role.
  7. Restart the service.

Configure the Compaction Speed Using the Command Line

  1. Edit hbase-site.xml and add the relevant properties, modifying the values as needed. Default values are shown. hbase.offpeak.start.hour and hbase.offpeak.end.hour have no default values; this configuration sets the off-peak hours from 20:00 (8 PM) to 6:00 (6 AM).
    <property>
      <name>hbase.regionserver.throughput.controller</name>
      <value>org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController</value>
    </property>
    <property>
      <name>hbase.hstore.compaction.throughput.higher.bound</name>
      <value>20971520</value>
      <description>The default is 20 MB/sec</description>
    </property>
    <property>
      <name>hbase.hstore.compaction.throughput.lower.bound</name>
      <value>10485760</value>
      <description>The default is 10 MB/sec</description>
    </property>
    <property>
      <name>hbase.hstore.compaction.throughput.offpeak</name>
      <value>9223372036854775807</value>
      <description>The default is Long.MAX_VALUE, which effectively means no limitation</description>
    </property>
    <property>
      <name>hbase.offpeak.start.hour</name>
      <value>20</value>
      <description>When to begin using off-peak compaction settings, expressed as an integer between 0 and 23.</description>
    </property>
    <property>
      <name>hbase.offpeak.end.hour</name>
      <value>6</value>
      <description>When to stop using off-peak compaction settings, expressed as an integer between 0 and 23.</description>
    </property>
    <property>
      <name>hbase.hstore.compaction.throughput.tune.period</name>
      <value>60000</value>
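      <description>How often, in milliseconds, the throughput limit is re-tuned. The default is 60000 (60 seconds).</description>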
    </property>
  2. Distribute the modified hbase-site.xml to all your cluster hosts and restart the HBase master and RegionServer processes for the change to take effect.
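
The throughput values in the snippet above are given in bytes per second (20971520 is 20 MB/sec). If you want different limits, a small helper like the following can convert a value in MB/sec into the number to paste into hbase-site.xml. The helper name is hypothetical, and it assumes 1 MB = 1024 * 1024 bytes, which matches the defaults shown.

```python
def mb_per_sec_to_bytes(mb_per_sec):
    """Convert MB/sec to bytes/sec, assuming 1 MB = 1024 * 1024 bytes."""
    return int(mb_per_sec * 1024 * 1024)


# Reproduce the two default bounds, then compute a custom 50 MB/sec limit.
for mb in (10, 20, 50):
    print(mb, "MB/sec =", mb_per_sec_to_bytes(mb), "bytes/sec")
```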