Configuring the Hive Metastore to Use HDFS High Availability in CDH

To configure other CDH components to use HDFS high availability, see Configuring Other CDH Components to Use HDFS HA.

Configuring the Hive Metastore to Use HDFS HA

You can configure the Hive metastore to use HDFS high availability with Cloudera Manager or, for unmanaged clusters, from the command line.

Configuring the Hive Metastore to Use HDFS HA Using Cloudera Manager

  1. In the Cloudera Manager Admin Console, go to the Hive service.
  2. Select Actions > Stop. Click Stop again to confirm the command.
  3. Back up the Hive metastore database.
  4. Select Actions > Update Hive Metastore NameNodes and confirm the command.
  5. Select Actions > Start and click Start to confirm the command.
  6. Restart the Hue and Impala services if you stopped them prior to updating the metastore.

Upgrading the Hive Metastore to Use HDFS HA Using the Command Line

To configure the Hive metastore to use HDFS HA, update the metastore records so that they reference the nameservice specified in the dfs.nameservices property. Use the Hive metatool to list the current locations and to change them.
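Before running the metatool, it can help to confirm the nameservice ID that the new location must match. The following is a minimal sketch, not part of the documented procedure: the helper name nameservice_uri is hypothetical, and it assumes a single nameservice (dfs.nameservices can hold a comma-separated list on federated clusters, which this helper rejects).

```shell
# Hypothetical helper: turn a nameservice ID into the hdfs:// URI
# used as the new location. Rejects an empty value or a
# comma-separated list, since exactly one nameservice is assumed.
nameservice_uri() {
  ns="$1"
  case "$ns" in
    ""|*,*)
      echo "expected exactly one nameservice, got: '$ns'" >&2
      return 1
      ;;
  esac
  printf 'hdfs://%s\n' "$ns"
}

# On a host with the HDFS client configuration deployed, the ID can be
# read from the live configuration:
#   nameservice_uri "$(hdfs getconf -confKey dfs.nameservices)"
```

With a nameservice ID of nameservice1, the helper prints hdfs://nameservice1, which is the form of the new location used in the example that follows.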

If you are unsure which version of Avro SerDe is used, use both the serdePropKey and tablePropKey arguments. For example:

$ hive --service metatool -listFSRoot
...
hdfs://<oldnamenode>.com/user/hive/warehouse

$ hive --service metatool -updateLocation hdfs://<new_nameservice1> \
    hdfs://<oldnamenode>.com -tablePropKey <avro.schema.url> \
    -serdePropKey <schema.url>
...

$ hive --service metatool -listFSRoot
...
hdfs://nameservice1/user/hive/warehouse

where:

  • hdfs://oldnamenode.com/user/hive/warehouse identifies the old NameNode location, as currently recorded in the metastore.
  • hdfs://nameservice1 specifies the new location and should match the value of the dfs.nameservices property.
  • tablePropKey is a table property key whose value field may reference the HDFS NameNode location and hence may require an update. To update the Avro SerDe schema URL, specify avro.schema.url for this argument.
  • serdePropKey is a SerDe property key whose value field may reference the HDFS NameNode location and hence may require an update. To update the Haivvreo schema URL, specify schema.url for this argument.
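
The arguments described above can be assembled and reviewed before anything is changed. This is a minimal sketch under stated assumptions: the function name metatool_update_cmd and its --dry-run switch are hypothetical, and it only prints the command rather than running it. The -dryRun flag it appends is a metatool option for -updateLocation in Apache Hive; check that your Hive version supports it before relying on it.

```shell
# Hypothetical helper: print (do not run) the metatool command that
# updates both the Avro table property and the SerDe property.
# Usage: metatool_update_cmd NEW_URI OLD_URI [--dry-run]
metatool_update_cmd() {
  new_uri="$1"
  old_uri="$2"
  extra=""
  # Append metatool's -dryRun so changes are displayed but not persisted.
  if [ "${3:-}" = "--dry-run" ]; then
    extra=" -dryRun"
  fi
  printf 'hive --service metatool -updateLocation %s %s -tablePropKey avro.schema.url -serdePropKey schema.url%s\n' \
    "$new_uri" "$old_uri" "$extra"
}

# Example: review the command for a dry run first.
metatool_update_cmd hdfs://nameservice1 hdfs://oldnamenode.com --dry-run
```

Printing the command first makes it easy to check the argument order (new location, then old location) before running it against a backed-up metastore.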