About JobTracker High Availability (HA)
If you are running MRv1, you can configure the JobTracker to be highly available. You can configure either manual or automatic failover to a warm-standby JobTracker.
- No equivalent for JobTracker HA is available for YARN at present.
- As with HDFS HA, the JobTracker high availability feature is backward compatible: if you do not want to enable JobTracker high availability, keep your existing configuration after updating your hadoop-0.20-mapreduce, hadoop-0.20-mapreduce-jobtracker, and hadoop-0.20-mapreduce-tasktracker packages, and start your services as before. In that case, you do not need to perform any of the actions described on this page.
To use the high availability feature, you must create a new configuration. This configuration is designed so that all nodes in the cluster can share the same configuration files; you do not need to deploy different files to different nodes depending on each node's role in the cluster.
In an HA setup, the mapred.job.tracker property is no longer a host:port string, but instead specifies a logical name to identify JobTracker instances in the cluster (active and standby). Each distinct JobTracker in the cluster has a different JobTracker ID. To support a single configuration file for all of the JobTrackers, the relevant configuration parameters are suffixed with the JobTracker logical name as well as the JobTracker ID.
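The suffixing scheme described above can be sketched in a few lines of Python. This is only an illustration of how one shared set of properties can describe every JobTracker; it is not a tool shipped with Hadoop. The logical name logicaljt, the JobTracker IDs jt1 and jt2, the host names, and the base key mapred.jobtracker.rpc-address are assumptions chosen for the example; check the configuration section for the parameter names your release actually uses.

```python
# Sketch of the "<base>.<logical name>.<JobTracker ID>" naming scheme.
# All names below (logicaljt, jt1, jt2, hosts, base key) are illustrative
# assumptions, not values taken from a real cluster.


def suffixed_key(base, logical_name, jt_id):
    """Build a per-JobTracker key: <base>.<logical name>.<JobTracker ID>."""
    return f"{base}.{logical_name}.{jt_id}"


def build_shared_config(logical_name, rpc_addresses,
                        base="mapred.jobtracker.rpc-address"):
    """Return one property map that every node in the cluster can share."""
    # mapred.job.tracker holds the logical name, not a host:port string.
    config = {"mapred.job.tracker": logical_name}
    # Each JobTracker gets its own entry, distinguished only by the suffix.
    for jt_id, address in rpc_addresses.items():
        config[suffixed_key(base, logical_name, jt_id)] = address
    return config


if __name__ == "__main__":
    shared = build_shared_config(
        "logicaljt",
        {"jt1": "jt1.example.com:8021", "jt2": "jt2.example.com:8021"},
    )
    for key, value in shared.items():
        print(f"{key} = {value}")
```

Because the JobTracker ID is part of each key, the entries for the active and standby JobTrackers never collide, which is what allows the same file to be deployed to every node.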
The HA JobTracker is packaged separately from the original (non-HA) JobTracker.
You cannot run both HA and non-HA JobTrackers in the same cluster. Do not install the HA JobTracker unless you need a highly available JobTracker. If you install the HA JobTracker and later decide to revert to the non-HA JobTracker, you must uninstall the HA JobTracker and reinstall the non-HA JobTracker.
Use the sections that follow to install, configure, and test JobTracker HA.