HiveServer2 High Availability

To enable high availability for multiple HiveServer2 hosts, configure a load balancer to manage them. To increase stability and security, configure the load balancer on a proxy server.

Enabling HiveServer2 High Availability Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the Hive service.
  2. Click the Configuration tab.
  3. Select Scope > HiveServer2.
  4. Select Category > Main.
  5. Locate the HiveServer2 Load Balancer property or search for it by typing its name in the Search box.
  6. Enter the address of the load balancer in the format hostname:port.
  7. Click Save Changes to commit the changes.
  8. Restart the Hive service.
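
Before saving the property, it can help to check that the value is a well-formed hostname:port pair. The following Python sketch is one way to do that. The function name parse_host_port and the hostname hs2-lb.example.com are hypothetical; port 10000 is the default HiveServer2 binary port, but your load balancer may listen on a different one. Cloudera Manager handles the property itself, so this is only an optional pre-check.

```python
from typing import Tuple


def parse_host_port(value: str) -> Tuple[str, int]:
    """Split a 'hostname:port' string and validate the port range.

    Hypothetical helper for illustration; not part of Hive or Cloudera Manager.
    """
    # Split on the last colon so the hostname part is kept intact.
    host, sep, port_text = value.strip().rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected hostname:port, got {value!r}")
    if not port_text.isdigit():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


# Example value (assumed hostname; 10000 is the default HiveServer2 port).
host, port = parse_host_port("hs2-lb.example.com:10000")
```

A value with a missing or non-numeric port raises ValueError, so a typo is caught before the Hive service is restarted.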

Configuring HiveServer2 to Load Balance Behind a Proxy

For clusters with multiple users and availability requirements, you can configure a proxy server to relay requests to and from each HiveServer2 host. Applications connect to a single well-known host and port, and connection requests to the proxy succeed even when individual hosts running HiveServer2 become unavailable, as long as at least one HiveServer2 host remains reachable.

  1. Download and install the load-balancing proxy software of your choice on a single host.
  2. Configure the software, typically by editing a configuration file:
    1. Set the port on which the load balancer listens for client connections and relays HiveServer2 requests.
    2. Set the hostname and port for each HiveServer2 host. These are the hosts from which the load balancer chooses when relaying each query.
  3. Run the load-balancing proxy server and point it at the configuration file.
  4. In Cloudera Manager, set the HiveServer2 Load Balancer property to the address of the proxy server, as described in Enabling HiveServer2 High Availability Using Cloudera Manager:
    1. Enter the proxy server's address in the format hostname:port.
    2. Click Save Changes to commit the changes.
    3. Restart the Hive service.
  5. Point all scripts, jobs, or application configurations to the new load balancer instead of any specific HiveServer2 host.
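
After the proxy is running, a quick check can confirm that the load balancer and each HiveServer2 host accept TCP connections, and can show the connection string that clients should use. The following Python sketch is a minimal example under stated assumptions: the helper names (is_reachable, check_targets, jdbc_url) and all hostnames are hypothetical, and port 10000 is assumed because it is the default HiveServer2 port. A successful TCP connection only shows that the port is open; it does not prove that HiveServer2 can authenticate users or run queries.

```python
import socket
from typing import Dict, Tuple


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_targets(targets: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Check every named (host, port) pair and report reachability."""
    return {name: is_reachable(host, port) for name, (host, port) in targets.items()}


def jdbc_url(host: str, port: int, database: str = "default") -> str:
    """Build a HiveServer2 JDBC URL that points at the load balancer."""
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical addresses: replace with your proxy and HiveServer2 hosts.
TARGETS = {
    "load balancer": ("hs2-lb.example.com", 10000),
    "hiveserver2 #1": ("hs2-host1.example.com", 10000),
    "hiveserver2 #2": ("hs2-host2.example.com", 10000),
}
```

Calling check_targets(TARGETS) returns a dictionary that maps each name to True or False. Clients would then use the value of jdbc_url("hs2-lb.example.com", 10000), that is, the load balancer address rather than the address of an individual HiveServer2 host. Secured clusters typically need additional URL parameters that are not shown here.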