This guide is for Apache Hadoop system administrators who want to enable continuous availability by configuring clusters without single points of failure.
Not all Hadoop components currently support high-availability configurations. However, some components that are currently single points of failure (SPOF) can be configured to restart automatically after a failure (listed under Auto-Restart Configurable in the table below). Some components support high availability implicitly because they comprise distributed processes (marked with an asterisk (*) in the table). In addition, some components depend on external databases, which must also be configured for high availability.
| High Availability | Auto-Restart Configurable | Components with External Databases |
|---|---|---|
| Alert Publisher | Hive Metastore (not possible with Sentry enabled) | Activity Monitor |
| Cloudera Manager Agent* | Impala catalog service | Cloudera Navigator Audit Server |
| Cloudera Manager Server | Impala statestore | Cloudera Navigator Metadata Server |
| Data Node* | Sentry Service | Hive Metastore Server |
| Event Server | Spark Job History Server | Oozie Server |
| Flume* | YARN Job History Server | Reports Manager |
| HBase Master | | Sentry Server |
| Host Monitor | | Sqoop Server |
| Hue (add multiple services, use load balancer) | | |
| Impalad* (add multiple services, use load balancer) | | |
| Navigator Key Trustee | | |
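
The auto-restart behavior described for the second column can be illustrated independently of any particular management tool. The following is a minimal conceptual sketch in Python, not a description of how Cloudera Manager implements auto-restart. The function name `supervise` and the parameters `max_restarts` and `backoff_seconds` are hypothetical and chosen only for illustration.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, backoff_seconds=1.0, sleep=time.sleep):
    """Run `command`, restarting it after each abnormal exit.

    Returns the number of restarts performed. Stops when the process
    exits with status 0 or when `max_restarts` restarts have been used.
    All names here are illustrative assumptions, not a real product API.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            return restarts      # clean shutdown: no restart needed
        if restarts >= max_restarts:
            return restarts      # restart budget exhausted: stay down
        restarts += 1
        sleep(backoff_seconds)   # brief pause before the next attempt


if __name__ == "__main__":
    # Hypothetical stand-in for a role process that always crashes.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(failing, max_restarts=2, backoff_seconds=0.1))
```

Note that a component supervised this way is still a single point of failure: it is unavailable between the crash and the completed restart, which is why auto-restart is listed separately from high availability.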