HCatalog Installation

As of CDH 5, HCatalog is part of Apache Hive.

HCatalog is a table and storage management layer for Hadoop that makes the same table information available to Hive, Pig, MapReduce, and Sqoop. Table definitions are maintained in the Hive metastore, which HCatalog requires. WebHCat provides a REST-style HTTP interface to HCatalog.

This page explains how to install and configure HCatalog and WebHCat. For Sqoop, see Sqoop-HCatalog Integration in the Sqoop User Guide.
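
To make the REST-style access concrete, the following minimal sketch builds the URL for a WebHCat request that lists the tables in a database. It assumes WebHCat's default port (50111) and an unsecured cluster where the caller is identified by the user.name query parameter; the host name and user passed in are placeholders, and list_tables_url is a hypothetical helper, not part of any Cloudera or Hive library.

```python
from urllib.parse import quote, urlencode


def list_tables_url(host, user, database="default", port=50111):
    """Build the WebHCat URL that lists tables in `database`.

    Assumptions: 50111 is the default WebHCat port, and the cluster is
    unsecured, so the caller is identified by the user.name parameter.
    """
    # Percent-encode the database name so it is safe as a path segment.
    path = f"/templeton/v1/ddl/database/{quote(database, safe='')}/table"
    query = urlencode({"user.name": user})
    return f"http://{host}:{port}{path}?{query}"
```

Sending a GET request to the resulting URL should return a JSON document describing the tables, provided a WebHCat server is running on that host.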

Configuring HCatalog Using Cloudera Manager

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

  1. Go to the Hive service by clicking Clusters > Hive.
  2. Select the Hive Instances tab.
  3. Add a WebHCat server role:
    1. Click Add Role Instances.
    2. Click Select hosts under WebHCat Server.
    3. Select the host on which you want the WebHCat server; this adds a WHCS icon.
    4. Click OK.
  4. Click Continue.
  5. Start the new role type:
    1. Select the new role type, WebHCat Server.
    2. Select Actions for Selected > Start.
    3. Click Start and Close.
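
After the role starts, one way to confirm that the WebHCat server is responding is to query its status endpoint. The sketch below is a hedged example, not part of the Cloudera Manager procedure: it assumes the default WebHCat port (50111), and the function names are hypothetical. A healthy server is expected to answer with a JSON body whose status field is "ok".

```python
import json
import urllib.request


def webhcat_status_url(host, port=50111):
    """Build the status URL (assumes the default WebHCat port, 50111)."""
    return f"http://{host}:{port}/templeton/v1/status"


def is_webhcat_ok(body):
    """Return True if a status response body reports status 'ok'."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def check_webhcat(host, port=50111, timeout=10):
    """Fetch the status endpoint and report whether the server is healthy.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    url = webhcat_status_url(host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return is_webhcat_ok(resp.read().decode("utf-8"))
```

For example, calling check_webhcat with the name of the host you selected for the WebHCat Server role should return True once the role is running.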

Configuring HCatalog Using the Command Line