Apache HBase Incompatible Changes and Limitations

Compatibility Notes for CDH 5

This section contains information that is relevant for all releases within the CDH 5 family. See the sections below for information which pertains to specific releases within CDH 5. If you are upgrading through more than one version (for instance, from CDH 5.0 to CDH 5.2), read the sections for each version, as most of the information listed applies to the given version and newer releases.

General Notes

  • Rolling upgrades from CDH 4 to CDH 5 are not possible because existing CDH 4 HBase clients cannot make requests to CDH 5 servers and CDH 5 HBase clients cannot make requests to CDH 4 servers. Replication between CDH 4 and CDH 5 is not currently supported. Exposed JMX metrics in CDH 4 have been refactored and some have been removed.
  • The upgrade from CDH 4 HBase to CDH 5 HBase is irreversible and requires HBase to be shut down completely.
  • As of CDH 4.2, the default Split Policy changed from ConstantSizeRegionSplitPolicy to IncreasingToUpperBoundRegionSplitPolicy (ITUBRSP). This affects upgrades from CDH 4.1 or earlier to CDH 5.
  • FilterBase no longer implements Writable. This means that you do not need to implement the readFields() and write() methods when writing your own custom filters. Instead, put this logic into the toByteArray and parseFrom methods. See this page for an example.
  • The default number of retained cell versions is reduced from 3 to 1. To increase the number of versions, you can specify the VERSIONS option at table creation or by altering existing tables. Starting with CDH 5.2, you can specify a global default number of versions, which will be applied to all newly created tables where the number of versions is not otherwise specified, by setting hbase.column.max.version to the desired number of versions in hbase-site.xml.
  • In CDH 5 prior to 5.1.3, a Put submitted with a KeyValue, KeyValue.Type.Delete does not delete the cell. This is different from the behavior in CDH 4. In CDH 5.1.3, this behavior is changed, so that a Put submitted with a KeyValue, KeyValue.Type.Delete does delete the cell. This fix is provided in HBASE-11788.
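
    The global default for retained cell versions described above (CDH 5.2 and later) could be set with a fragment like the following inside the <configuration> element of hbase-site.xml. This is a minimal sketch; the value 3 is only an illustration that matches the earlier default, not a recommendation.

    <!-- Illustrative value: applies to newly created tables that do not set VERSIONS -->
    <property>
      <name>hbase.column.max.version</name>
      <value>3</value>
    </property>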

Developer API Changes
  • The set of exposed APIs has been solidified. If you are using APIs outside of the user API, we cannot guarantee compatibility with future minor versions.

  • CDH 5 introduces a new layout for HBase build artifacts and requires POM changes if you use Maven, or JAR changes otherwise.

    Previously, in CDH 4 you only needed to add a dependency for the HBase JAR:
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <optional>true</optional>
      <scope>provided</scope>
    </dependency>
    Now, when building against CDH 5 you will need to add a dependency for the hbase-client JAR. The hbase module continues to exist as a convenient top-level wrapper for existing clients, and it pulls in all the sub-modules automatically. But it is only a simple wrapper, so its repository directory will carry no actual jars.
    <dependency>
      <groupId>org.apache.hbase</groupId> 
      <artifactId>hbase-client</artifactId> 
      <version>${hbase.version}</version> 
      <scope>provided</scope>
    </dependency>
    If your code uses the HBase minicluster, you can pull in the hbase-testing-util dependency:
    <dependency>
      <groupId>org.apache.hbase</groupId> 
      <artifactId>hbase-testing-util</artifactId> 
      <version>${cdh.hbase.version}</version> 
      <scope>provided</scope>
    </dependency>

    If you need to obtain all HBase JARs required to build a project, copy them from the CDH installation directory (typically /usr/lib/hbase for an RPM install, or /opt/cloudera/parcels/CDH/lib/hbase if you install using Parcels), or from the CDH 5 HBase tarballs. However, for building client applications, Cloudera recommends using build tools such as Maven, rather than manually referencing JARs.

  • CDH 5 introduces support for addressing cells with an empty column qualifier (a string of 0 bytes in length), but not all edge services handle that scenario correctly. In some cases, attempting to address a cell at [ rowkey, fam ] results in interaction with the entire column family, rather than the empty column qualifier.

    Users of the HBase Shell, MapReduce, REST, and Thrift must use family instead of family: (notice the omitted ":"), to interact with an entire column family, rather than an empty column qualifier. Including the ":" will be interpreted as an interaction with the empty qualifier in the family column family.

  • API Removals

Operator API Changes
  • Many of the default configurations from CDH 4 in hbase-default.xml have been changed to new values in CDH 5. See HBASE-8450 for a complete list of changes.
  • HBASE-6553 - Removed Avro Gateway. This feature was less robust and not used as much as the Thrift gateways. It has been removed upstream.
  • HBase provides a metrics framework based on JMX beans. Between HBase 0.94 and 0.96, the metrics framework underwent many changes. Some beans were added and removed, some metrics were moved from one bean to another, and some metrics were renamed or removed. Click here to download the CSV spreadsheet which provides a mapping.

User API Changes
  • The HBase User API (Get, Put, Result, Scanner, and so on; see the Apache HBase API documentation) has evolved. Attempts have been made to keep HBase clients source-compatible, so that they should recompile without source code modifications. This cannot be guaranteed, however, because some relatively obscure APIs were removed in the conversion to Protocol Buffers. Rudimentary efforts have also been made to preserve recompile compatibility with advanced APIs such as Filters and Coprocessors. These advanced APIs are still evolving, and the guarantees for API compatibility are weaker here.
  • As of HBase 0.96, the User API has been marked, and every attempt will be made to keep it compatible in future versions. A version of the javadoc that only contains the User API can be found here.
  • Other changes to CDH 5 HBase that require the upgrade include:
    • HBASE-8015: The HBase Namespaces feature has changed HBase HDFS file layout.
    • HBASE-4451: Renamed ZooKeeper nodes.
    • HBASE-3171: The META table in CDH 4 has been renamed to be hbase:meta. Similarly the ACL table has been renamed to hbase:acl. The .ROOT table has been removed.
    • HBASE-8352: HBase snapshots are now saved to the /<hbase>/.hbase-snapshot directory instead of the /.snapshot directory. This should be handled before upgrading HDFS.
    • HBASE-7660: Removed support for HFile V1. All internal HBase files in the HFile v1 format must be converted to the HFile v2 format.
    • HBASE-6170/HBASE-8909: The hbase.regionserver.lease.period configuration parameter has been deprecated. Use hbase.client.scanner.timeout.period instead.
  • The behavior of a FilterList with the MUST_PASS_ALL operator changed between CDH 4 and CDH 5. In CDH 4, an empty FilterList with the default MUST_PASS_ALL operator returns all rows (the results are not filtered). In CDH 5, an empty FilterList with the MUST_PASS_ALL operator returns no results. To continue using the CDH 4 behavior, modify your code to call scan.setLoadColumnFamiliesOnDemand(false).
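
    For the deprecated hbase.regionserver.lease.period parameter noted above, a replacement entry in hbase-site.xml might look like the following. This is a sketch only; the value is in milliseconds, and 60000 is shown as an illustration rather than a tuned setting.

    <!-- Replaces the deprecated hbase.regionserver.lease.period; value is illustrative -->
    <property>
      <name>hbase.client.scanner.timeout.period</name>
      <value>60000</value>
    </property>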

Compatibility Notes for CDH 5.9

  • The default RPC scheduler has been changed from 'deadline' to 'fifo'. To reenable 'deadline', set hbase.ipc.server.callqueue.type to deadline in the hbase-site.xml file.
  • Due to licensing issues, Apache HBase no longer includes its previous XSS defense or an encoding for filters. Additionally, several dependencies have been removed. Downstream users who relied on transitive inclusion of the following must now declare the appropriate dependency directly: jsr305 (from the FindBugs project), Apache Commons Fileupload, nekohtml, beanshell core, Apache xml graphics, OWASP antisamy, OWASP esapi, Xalan, Apache Xerces, and Xom.
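
    To return to the 'deadline' RPC scheduler as described above, the hbase-site.xml entry would look roughly like this (a minimal sketch using only the property name and value given in this section):

    <!-- Restores the pre-CDH 5.9 RPC scheduler -->
    <property>
      <name>hbase.ipc.server.callqueue.type</name>
      <value>deadline</value>
    </property>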

Compatibility Notes for CDH 5.8

  • HBase now ensures the jsr305 implementation from the findbugs project is not included in its binary artifacts or the compile / runtime dependencies of its user facing modules. Downstream users that rely on this jar will need to update their dependencies.
  • HBase no longer includes Xerces implementation jars that were previously included via transitive dependencies. Downstream users relying on HBase for these artifacts will need to update their dependencies.
  • This change reverts fixes designed to prevent malicious content from rendering in HBase's UIs. These fixes shipped in Apache HBase 1.1.4+ and 1.2.0+, and were removed because of licensing issues discovered in the dependencies they introduced. Both the implementation and those dependencies have been removed from HBase. Removing these dependencies goes against the strict definition of the HBase version compatibility guidelines, but including code under licenses not approved by Apache cannot be tolerated. Reimplementing these fixes by Apache-appropriate means is tracked in HBASE-16328.
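
    A project that previously picked up jsr305 transitively through HBase could declare it directly in its POM, along the lines of the sketch below. The version shown is an assumption for illustration; use whichever release your code was built against.

    <!-- Direct dependency replacing the former transitive one; version is illustrative -->
    <dependency>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
      <version>3.0.0</version>
    </dependency>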

Compatibility Notes for CDH 5.7

  • Cloudera recommends not using the new advanced configuration option hbase.regionserver.hostname, added in HBase 1.2 (CDH 5.7.0), which allows you to specify a separate external-facing hostname for a RegionServer.

Compatibility Notes for CDH 5.4

  • The ports used by Apache HBase 1.0 changed from the 600XX range to the 160XX range. HBase in CDH reverted the change, and continues to use the 600XX port range, to maintain compatibility.
  • If you used visibility labels prior to CDH 5.4 and assigned superuser privileges to HBase users by adding the system label to their set of labels, these users will no longer be superusers in CDH 5.4. To be sure that cached credentials are cleared, use the HBase Shell command clear_auths <username>, for each affected user. To grant users superuser privileges, add them to the HBase Superusers group in Cloudera Manager, or add them to the hbase.superuser property in hbase-site.xml, and restart the HMaster.
  • HTrace is experimental in CDH 5.4.0. Artifacts and package names cannot be relied upon.
  • Jersey was updated from 1.8 to 1.9. This has the following implications.
    • The Jersey version is now consistent with Apache HBase and other CDH components.
    • If your project relies upon jersey-server, you may need to make modifications.
  • Curator in Hadoop was updated from 2.6.0 to 2.7.1. This has the following implications for HBase.
    • PathUtils.validatePath(String) changed return types, which will cause runtime errors for code compiled against the older version.
    • The SharedCountReader and SharedValueReader interfaces each added a method, which will cause compilation errors for code made to use the old version.
  • commons-codec was upgraded from 1.7 to 1.9. This has the following implications for HBase.
    • The class org.apache.commons.codec.net.QuotedPrintableCodec has a constructor that throws additional exceptions. See the API reference for details.
  • commons-logging was updated from version 1.1.1 to 1.2. This has the following implications for HBase.
    • org.apache.commons.logging.LogSource.setLogImplementation(String) no longer throws ExceptionInInitializerError, which may change behavior of code that expects it.
  • API changes: see New Features and Changes for HBase in CDH 5. CDH reverted API changes in HBase 1.0 which broke compatibility with HBase in CDH 5.0, 5.1, 5.2, and 5.3. If you have written applications using Apache HBase 1.0 APIs, you may need to modify these applications to run in CDH 5.4.
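
    For the superuser change described above, an hbase.superuser entry in hbase-site.xml might look like the following sketch. The user name admin_user and the group name hbase_admins are hypothetical placeholders; the @ prefix marks an entry as a group. Restart the HMaster after changing it.

    <!-- admin_user and @hbase_admins are hypothetical names used for illustration -->
    <property>
      <name>hbase.superuser</name>
      <value>admin_user,@hbase_admins</value>
    </property>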

Differences between CDH 5.4 HBase 1.0 and Apache HBase 1.0:
  • CDH 5.4.0 keeps commons-math at version 2.1 to maintain compatibility with earlier CDH releases, whereas Apache HBase 1.0 uses commons-math 2.2.
  • CDH 5.4.0 keeps Netty at version 3 to maintain compatibility with earlier CDH releases, whereas Apache HBase 1.0 uses Netty 4.

Compatibility Notes for CDH 5.3

  • The Put class no longer implements Writable. If your MapReduce code relied on this, change the context definition to org.apache.hadoop.mapreduce.TaskInputOutputContext<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result,org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Put> if you emit only Puts, or org.apache.hadoop.mapreduce.TaskInputOutputContext<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result,org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Mutation> if you emit a mix of Puts and Deletes.

Compatibility Notes for CDH 5.2

  • In HBase in CDH 5.1, the default value for hbase.security.access.early_out was set to false. In CDH 5.2, the default value has been changed to true, to maintain consistency with the behavior in CDH 4. When set to true, if a user is not granted access to a column family qualifier, the AccessController immediately throws an AccessDeniedException. This change to the default behavior will affect users who enabled HFile version 3 and the AccessController coprocessor in CDH 5.1, and then upgrade to CDH 5.2. In this case, if you prefer hbase.security.access.early_out to be disabled, explicitly set it to false in hbase-site.xml.
  • Starting with CDH 5.2, you can specify a global default number of versions, which will be applied to all newly created tables where the number of versions is not otherwise specified, by setting hbase.column.max.version to the desired number of versions in hbase-site.xml.
  • HBase in CDH 5.2 differs from Apache HBase 0.98.6 in that CDH does not include HBASE-11546, which provides ZooKeeper-less region assignment. CDH omits this feature because it is an incompatible change that prevents an upgraded cluster from being rolled back to a previous version.
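
    If you prefer to keep the CDH 5.1 behavior for hbase.security.access.early_out after upgrading, the explicit override described above would look roughly like this in hbase-site.xml:

    <!-- Keeps the CDH 5.1 default instead of the CDH 5.2 default of true -->
    <property>
      <name>hbase.security.access.early_out</name>
      <value>false</value>
    </property>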

Developer Interface Changes
  • HBase 0.98.5 removed ClientSmallScanner from the public API. HBase in CDH 5.2 restores the constructor to maintain backward compatibility, but in future releases of HBase, this class will no longer be public. You should change your code to use the Scan.setSmall(true) method instead.

Compatibility Notes for CDH 5.1

General Notes

  • HBASE-8218 changes AggregationClient by replacing the byte[] tablename parameters with HTable table. This means that coprocessors compiled against CDH 5.0.x won't run or compile in CDH 5.1 and later.
  • In CDH 5.1 and later, delete* methods of the Delete class of the HBase Client API use the timestamp from the constructor, the same behavior as the Put class. (In previous versions, the delete* methods ignored the constructor's timestamp, and used the value of HConstants.LATEST_TIMESTAMP. This behavior was different from the behavior of the add() methods of the Put class.) See HBASE-10964.
  • In CDH 5 prior to 5.1.3, a Put submitted with a KeyValue, KeyValue.Type.Delete does not delete the cell. This is different from the behavior in CDH 4. In CDH 5.1.3, this behavior is changed, so that a Put submitted with a KeyValue, KeyValue.Type.Delete does delete the cell. This fix is provided in HBASE-11788.
  • In CDH 5.1 and newer, HBase introduces a new snapshot format (HBASE-7987). A snapshot created in HBase 0.98 cannot be read by HBase 0.96. HBase 0.98 can read snapshots produced in previous versions of HBase, and no conversion is necessary.
  • In CDH 5.1, the default value for hbase.security.access.early_out was changed from true to false. A setting of true means that if a user is not granted access to a column family qualifier, the AccessController immediately throws an AccessDeniedException. This behavior change was reverted for CDH 5.2.

Developer Interface Changes
  • HTablePool is no longer supported in CDH 5.1 and later. The HConnection object is the replacement. You create the connection once and pass it around, as with the old table pool.
    HConnection connection = HConnectionManager.createConnection(config);
    HTableInterface table = connection.getTable(tableName);
    table.put(put);
    table.close();
    connection.close();
    You can set the hbase.hconnection.threads.max property in hbase-site.xml to control the pool size or you can pass an ExecutorService to HConnectionManager.createConnection().
    ExecutorService pool = ...;
    HConnection connection = HConnectionManager.createConnection(conf, pool);
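
    As an alternative to passing an ExecutorService, the pool size mentioned above can be capped in hbase-site.xml. The fragment below is a sketch; the value 256 is only an illustration and should be sized for your workload.

    <!-- Upper bound on threads in the connection's pool; value is illustrative -->
    <property>
      <name>hbase.hconnection.threads.max</name>
      <value>256</value>
    </property>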

Compatibility Notes for CDH 5 Beta Releases

The HBase client from CDH 5 Beta 1 is not wire compatible with CDH 5 Beta 2 because of changes introduced in HBASE-9612. As a consequence, CDH 5 Beta 1 users will not be able to execute a rolling upgrade to CDH 5 Beta 2 (or later). This patch unifies the way the HBase clients make requests and simplifies the internals, but breaks wire compatibility. Developers may need to recompile applications built upon the CDH 5 Beta 1 API.

As of CDH 5 Beta 1 (HBase 0.95), the value of hbase.regionserver.checksum.verify defaults to true; in earlier releases the default is false. For more information, see Checksums in the HBase section of the CDH 5 Installation Guide.
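
If you need to set the checksum behavior explicitly rather than rely on the default, the hbase-site.xml entry would look roughly like the sketch below. The value false is shown only to illustrate reverting to the earlier default; it is not a recommendation.

    <!-- Illustrative override: false matches the default of earlier releases -->
    <property>
      <name>hbase.regionserver.checksum.verify</name>
      <value>false</value>
    </property>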

Compatibility between CDH Beta and Apache HBase Releases

  • Apache HBase 0.95.2 is not wire compatible with CDH 5 Beta 1 HBase 0.95.2.
  • Apache HBase 0.96.x should be wire compatible with CDH 5 Beta 2 HBase 0.96.1.1.