Oozie High Availability

In CDH 5, you can configure multiple active Oozie servers against the same database. Oozie high availability is "active-active" (also called "hot-hot"): all configured Oozie servers are active at the same time, so no failover step is required. High availability for Oozie is supported in both MRv1 and MRv2 (YARN).

Requirements for Oozie High Availability

  • Multiple active Oozie servers, preferably identically configured.
  • JDBC JAR in the same location across all Oozie hosts (for example, /var/lib/oozie/).
  • External database that supports multiple concurrent connections, preferably with HA support. The default Derby database does not support multiple concurrent connections.
  • ZooKeeper ensemble, which provides distributed locks to control database access and service discovery for log aggregation.
  • Load balancer (preferably with HA support, for example HAProxy), virtual IP, or round-robin DNS to provide a single entry point to the multiple active servers and to receive callbacks from the Application Master or JobTracker.
To enable Kerberos authentication, see Enabling Kerberos Authentication Using the Wizard.
For information on setting up TLS/SSL communication with Oozie HA enabled, see Additional Considerations when Configuring TLS/SSL for Oozie HA.
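
Before putting the servers behind a single entry point, it can help to confirm that each Oozie server answers on its own. The following is a minimal sketch, not part of the Cloudera procedure. It assumes the Oozie REST endpoint /v1/admin/status, which reports a systemMode value, and uses hypothetical host names and the default Oozie port 11000. On a Kerberos-enabled cluster the request would also need SPNEGO authentication, which this sketch does not handle.

```python
import json
import urllib.request


def status_url(base_url: str) -> str:
    """Build the admin status URL for one Oozie base URL."""
    return base_url.rstrip("/") + "/v1/admin/status"


def is_normal(status_body: str) -> bool:
    """Return True if the status response reports systemMode NORMAL."""
    try:
        payload = json.loads(status_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("systemMode") == "NORMAL"


def check_servers(base_urls, timeout=5.0):
    """Query every server directly and return {base_url: True/False}."""
    results = {}
    for base in base_urls:
        try:
            with urllib.request.urlopen(status_url(base), timeout=timeout) as resp:
                results[base] = is_normal(resp.read().decode("utf-8"))
        except (OSError, ValueError):
            # Unreachable host, HTTP error, timeout, or malformed URL.
            results[base] = False
    return results


# Hypothetical server URLs; replace with the hosts in your deployment.
OOZIE_SERVERS = [
    "http://oozie-1.example.com:11000/oozie",
    "http://oozie-2.example.com:11000/oozie",
]
```

Calling check_servers(OOZIE_SERVERS) returns a dictionary that maps each base URL to True or False. In an active-active deployment, every server is expected to report NORMAL at the same time.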

Configuring Oozie High Availability Using Cloudera Manager

Minimum Required Role: Full Administrator

Enabling Oozie High Availability

  1. Ensure that the requirements are satisfied.
  2. In the Cloudera Manager Admin Console, go to the Oozie service.
  3. Select Actions > Enable High Availability to see eligible Oozie server hosts. The host running the current Oozie server is not eligible.
  4. Select the host on which to install an additional Oozie server and click Continue.
  5. Enter the FQDN and port number of the Oozie load balancer. For example:
    load-bal.example.com:12345
  6. Click Continue.

Cloudera Manager stops the Oozie servers, adds another Oozie server, initializes the Oozie server High Availability state in ZooKeeper, configures Hue to reference the Oozie load balancer, and restarts the Oozie servers and dependent services. In addition, Cloudera Manager generates Kerberos credentials for the new Oozie server and regenerates credentials for existing servers.
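
The load balancer address entered in step 5 combines an FQDN and a port number. As an illustration only, the sketch below checks that a string has that shape before you enter it in the wizard. The function name parse_load_balancer and its rules (at least two DNS labels, port between 1 and 65535) are assumptions made for this example and do not describe the validation that Cloudera Manager performs.

```python
import re

# One DNS label: letters, digits, and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def parse_load_balancer(address: str):
    """Split 'fqdn:port' into (host, port), raising ValueError if malformed."""
    host, sep, port_text = address.strip().rpartition(":")
    if not sep or not host:
        raise ValueError("expected the form host.domain:port")
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port must be a decimal number")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    labels = host.split(".")
    # Require at least two labels so a bare host name is rejected.
    if len(labels) < 2 or not all(_LABEL.fullmatch(label) for label in labels):
        raise ValueError("host must be a fully qualified domain name")
    return host, port
```

For the example value shown in step 5, parse_load_balancer("load-bal.example.com:12345") returns the tuple ("load-bal.example.com", 12345). A value without a port or without a domain part raises ValueError.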

Disabling Oozie High Availability

  1. In the Cloudera Manager Admin Console, go to the Oozie service.
  2. Select Actions > Disable High Availability to see all hosts currently running Oozie servers.
  3. Select the single host that will continue to run the Oozie server and click Continue.

Cloudera Manager stops the Oozie service, removes the additional Oozie servers, configures Hue to reference the Oozie service, and restarts the Oozie service and dependent services.

Configuring Oozie High Availability Using the Command Line

For instructions on installing and configuring Oozie HA using the command line, see https://archive.cloudera.com/cdh5/cdh/5/oozie.

To enable Kerberos authentication for an Oozie HA-enabled deployment, see Configuring Oozie HA with Kerberos.