Before You Install CDH 5 on a Cluster

Before you install CDH 5 on a cluster, complete the following steps to prepare your system:

  1. Verify you are using a supported operating system for CDH 5. See CDH 5 Requirements and Supported Versions.
  2. If you haven't already done so, install the Oracle Java Development Kit. For instructions and recommendations, see Java Development Kit Installation.

Scheduler Defaults

The default scheduler differs between MRv1 (MapReduce) and MRv2 (YARN):

  • MRv1 (MapReduce v1):
    • Cloudera Manager and CDH 5 set the default scheduler to FIFO.
    FIFO is set as the default for backward-compatibility purposes, but Cloudera recommends Fair Scheduler. Capacity Scheduler is also available.
  • MRv2 (YARN):
    • Cloudera Manager and CDH 5 set the default scheduler to Fair Scheduler.
    Cloudera recommends Fair Scheduler because Impala and Llama are optimized for it. FIFO and Capacity Scheduler are also available.
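
As an illustration of the scheduler choices above, the following fragments sketch how Fair Scheduler could be selected explicitly on a cluster that is configured by hand. They assume the standard Hadoop configuration files (yarn-site.xml for MRv2, mapred-site.xml for MRv1). On a cluster managed by Cloudera Manager, the scheduler is normally chosen through the service configuration rather than by editing these files, so treat this as a sketch and not as the required procedure.

```xml
<!-- yarn-site.xml (MRv2/YARN): select Fair Scheduler for the ResourceManager.
     Fair Scheduler is already the CDH 5 default, so this only makes the choice explicit. -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

<!-- mapred-site.xml (MRv1): replace the FIFO default with Fair Scheduler on the JobTracker. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
```

Each property belongs inside the <configuration> element of its own file, and the ResourceManager or JobTracker typically needs a restart before a scheduler change takes effect.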

High Availability

In CDH 5, you can configure high availability for the NameNode and for either the JobTracker (MRv1) or the ResourceManager (MRv2/YARN).