Using MapReduce with HBase

To run MapReduce jobs that use HBase, you need to add the HBase and ZooKeeper JAR files to the Hadoop Java classpath. You can do this by adding the following statement to the driver code of each job:

TableMapReduceUtil.addDependencyJars(job);

This distributes the JAR files to the cluster along with your job and adds them to the job's classpath, so that you do not need to edit the MapReduce configuration.

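When obtaining a Configuration object for an HBase MapReduce job, create it with the HBaseConfiguration.create() method, which returns a Configuration with the HBase resource files (such as hbase-site.xml) from the classpath loaded.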
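
The two steps above can be combined in a job driver. The following is a minimal sketch rather than a complete job: the class name HBaseJobSetup and the method createJob are hypothetical, and the mapper, reducer, and input and output tables are deliberately left out. It assumes the HBase and Hadoop client libraries are available when the driver is compiled and launched.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical driver helper; the class and method names are illustrative only.
public class HBaseJobSetup {

    public static Job createJob(String jobName) throws IOException {
        // Create the configuration through HBaseConfiguration so that
        // HBase settings found on the classpath are included.
        Configuration conf = HBaseConfiguration.create();

        Job job = Job.getInstance(conf, jobName);
        job.setJarByClass(HBaseJobSetup.class);

        // Ship the HBase and ZooKeeper JARs with the job and add them
        // to the job's classpath.
        TableMapReduceUtil.addDependencyJars(job);

        // Mapper, reducer, and table settings would be configured here.
        return job;
    }
}
```

A driver's main method would typically call createJob, finish configuring the job, and then submit it, for example with job.waitForCompletion(true).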
