Checking and Repairing HBase Tables

HBaseFsck (hbck) is a command-line tool that checks for region consistency and table integrity problems and repairs corruption. It works in two basic modes: a read-only mode that identifies inconsistencies and a multi-phase read-write repair mode.
  • Read-only inconsistency identification: In this mode, which is the default, a report is generated but no repairs are attempted.
  • Read-write repair mode: In this mode, if errors are found, hbck attempts to repair them.
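
As a rough illustration of the two modes, the sketch below treats an hbck argument list as read-only unless it contains one of the repair arguments described later in this section. The names REPAIR_FLAGS and is_read_only are hypothetical helpers, not part of HBase, and the set covers only the arguments named in this document, so it may be incomplete for a given HBase version.

```python
# Hypothetical helper, not part of HBase: classify an hbck argument list.
# The flag names are taken from this document; other versions may accept more.
REPAIR_FLAGS = {
    "-fix", "-fixAssignments", "-fixMeta", "-repairHoles",
    "-fixHdfsOrphans", "-fixHdfsOverlaps", "-repair",
}

def is_read_only(hbck_args):
    """Return True if no repair argument is present (report-only mode)."""
    return not any(arg in REPAIR_FLAGS for arg in hbck_args)

print(is_read_only(["-details"]))             # True
print(is_read_only(["-fixMeta", "mytable"]))  # False
```

A guard like this could be used in a wrapper script to require explicit confirmation before any repair run.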

Always run HBase administrative commands such as the HBase Shell, hbck, or bulk-load commands as the HBase user (typically hbase).
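
One common way to do this on a Linux host is to prefix the command with sudo -u. The sketch below only builds the argument list and does not run anything. It assumes that sudo is available, that the HBase user is named hbase, and that the hbase launcher script is on the PATH; the helper name as_hbase_user is hypothetical.

```python
import shlex

def as_hbase_user(command, user="hbase"):
    """Prefix a command (a list of strings) so that sudo runs it as the HBase user."""
    return ["sudo", "-u", user] + list(command)

cmd = as_hbase_user(["hbase", "hbck"])
print(shlex.join(cmd))  # sudo -u hbase hbase hbck
# To actually execute it: subprocess.run(cmd, check=True)
```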

Running hbck Manually

hbck is run through the hbase script in the bin directory of the HBase installation, for example bin/hbase hbck.
  • With no arguments, hbck checks HBase for inconsistencies and prints OK if no inconsistencies are found, or the number of inconsistencies otherwise.
  • With the -details argument, hbck checks HBase for inconsistencies and prints a detailed report.
  • To limit hbck to only checking specific tables, provide them as a space-separated list: hbck <table1> <table2>
  • If region-level inconsistencies are found, use the -fix argument to direct hbck to try to fix them. hbck then performs the following steps:
    1. The standard check for inconsistencies is run.
    2. If needed, repairs are made to tables.
    3. If needed, repairs are made to regions. Regions are closed during repair.
  • You can also fix individual region-level inconsistencies separately, rather than fixing them automatically with the -fix argument.
    • -fixAssignments repairs unassigned, incorrectly assigned or multiply assigned regions.
    • -fixMeta removes rows from hbase:meta when their corresponding regions are not present in HDFS and adds new meta rows if regions are present in HDFS but not in hbase:meta.
    • -repairHoles fills holes in a table's region chain by creating new empty regions on the filesystem and ensures that the new regions are consistent.
    • -fixHdfsOrphans repairs a region directory that is missing a region metadata file (the .regioninfo file).
    • -fixHdfsOverlaps fixes overlapping regions. You can further tune this argument using the following options:
      • -maxMerge <n> controls the maximum number of regions to merge.
      • -sidelineBigOverlaps attempts to sideline the regions which overlap the largest number of other regions.
      • -maxOverlapsToSideline <n> limits the maximum number of regions to sideline.
  • To try to repair all inconsistencies and corruption at once, use the -repair option, which includes all the region and table consistency options.
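
The options above can be combined on a single command line. The following sketch assembles such command lines without executing them. The function hbck_command and its parameter names are hypothetical, the flags are the ones listed in this section, and it assumes that the tool is invoked as hbase hbck with the launcher on the PATH.

```python
import shlex

def hbck_command(*tables, details=False, fixes=(), max_merge=None,
                 sideline_big_overlaps=False, max_overlaps_to_sideline=None):
    """Build an `hbase hbck` argument list from the options in this section."""
    cmd = ["hbase", "hbck"]
    if details:
        cmd.append("-details")
    cmd.extend(fixes)  # for example "-fixAssignments", "-fixMeta", "-repair"
    if max_merge is not None:
        cmd += ["-maxMerge", str(max_merge)]
    if sideline_big_overlaps:
        cmd.append("-sidelineBigOverlaps")
    if max_overlaps_to_sideline is not None:
        cmd += ["-maxOverlapsToSideline", str(max_overlaps_to_sideline)]
    cmd.extend(tables)  # table names follow the options, space-separated
    return cmd

# Read-only detailed report limited to two tables.
print(shlex.join(hbck_command("table1", "table2", details=True)))
# hbase hbck -details table1 table2

# Targeted overlap repair on one table, merging at most 5 regions.
print(shlex.join(hbck_command("table1", fixes=["-fixHdfsOverlaps"], max_merge=5)))
# hbase hbck -fixHdfsOverlaps -maxMerge 5 table1
```

Running the read-only form first and reviewing the report before adding any repair argument keeps the repair step deliberate.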

For more details about the hbck command, see Appendix C of the HBase Reference Guide.