YARN (MRv2) and MapReduce (MRv1) Schedulers

A scheduler determines which jobs run, where and when they run, and the resources allocated to the jobs. The YARN (MRv2) and MapReduce (MRv1) computation frameworks support the following schedulers:

  • FIFO - Allocates resources based on arrival time.
  • Fair - Allocates resources to weighted pools, with fair sharing within each pool. Dominant Resource Fairness (DRF) is one of the scheduling policies you can select when configuring a pool under the Fair Scheduler.
  • Capacity - Allocates resources to pools, with FIFO scheduling within each pool.
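The difference between arrival-order and weighted fair allocation can be sketched in a few lines of Python. This is an illustration only, not the Hadoop scheduler implementation: the function names, the single abstract resource, and the sample numbers are all assumptions made for the example, and it models sharing across pools only, not the sharing within each pool.

```python
def fifo_allocate(capacity, requests):
    """Grant requests strictly in arrival order until capacity runs out.

    requests is a list of (name, demand) pairs in arrival order.
    """
    remaining = capacity
    grants = {}
    for name, demand in requests:
        granted = min(demand, remaining)
        grants[name] = granted
        remaining -= granted
    return grants


def fair_allocate(capacity, pools):
    """Weighted max-min fair share across pools (hypothetical helper).

    pools maps pool name -> (weight, demand). Capacity that a pool
    cannot use is redistributed among the pools that still want more.
    """
    grants = {name: 0.0 for name in pools}
    active = {name for name, (_, demand) in pools.items() if demand > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_weight = sum(pools[name][0] for name in active)
        satisfied = set()
        spent = 0.0
        for name in active:
            weight, demand = pools[name]
            share = remaining * weight / total_weight
            give = min(share, demand - grants[name])
            grants[name] += give
            spent += give
            if grants[name] >= demand - 1e-9:
                satisfied.add(name)
        remaining -= spent
        if not satisfied:
            break  # every active pool took its full share; nothing left over
        active -= satisfied
    return grants


# Sample workload (made-up numbers): 100 units of one abstract resource.
fifo_result = fifo_allocate(100, [("a", 80), ("b", 50), ("c", 30)])
fair_result = fair_allocate(100, {"a": (1, 80), "b": (1, 50), "c": (2, 30)})
print(fifo_result)
print(fair_result)
```

With these sample inputs, FIFO gives the first job its full 80 units, leaving 20 for the second and none for the third. The weighted fair sketch satisfies pool c (30 units) and splits the remaining 70 units evenly between pools a and b.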

The scheduler defaults for YARN and MapReduce are:

  • YARN - Cloudera Manager and CDH 5 set the default to Fair Scheduler. Cloudera recommends Fair Scheduler. FIFO and Capacity Scheduler are also available.

    In YARN, the scheduler is responsible for allocating resources to the various running applications, subject to familiar constraints such as capacities and queues. The scheduler makes its decisions from the resource requirements of the applications, expressed through the abstract notion of a resource container that incorporates elements such as memory, CPU, disk, and network.

    The YARN scheduler has a pluggable policy, which is responsible for partitioning cluster resources among the various queues, applications, and so on.

    If you are running CDH 5, you can manually configure the scheduler type. If you choose the Fair Scheduler, see Configuring the Fair Scheduler for information on how to configure it manually. Alternatively, you can use Cloudera Manager dynamic allocation to manage scheduler configuration.

  • MapReduce - Cloudera Manager and CDH 5 set the default scheduler to FIFO for backward compatibility; however, Cloudera recommends Fair Scheduler. Capacity Scheduler is also available.
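For the manual YARN configuration mentioned above, the scheduler type is selected through the yarn.resourcemanager.scheduler.class property in yarn-site.xml. The following Python sketch renders that property for each scheduler. It assumes a cluster whose yarn-site.xml is edited by hand rather than managed through Cloudera Manager; the helper name scheduler_property and the short keys "fair", "capacity", and "fifo" are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Scheduler implementation classes that ship with Apache Hadoop YARN.
SCHEDULER_CLASSES = {
    "fair": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler",
    "capacity": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler",
    "fifo": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler",
}


def scheduler_property(scheduler):
    """Return the yarn-site.xml <property> element for the chosen scheduler."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "yarn.resourcemanager.scheduler.class"
    ET.SubElement(prop, "value").text = SCHEDULER_CLASSES[scheduler]
    return ET.tostring(prop, encoding="unicode")


# Print the fragment that would go inside <configuration> in yarn-site.xml.
print(scheduler_property("fair"))
```

The printed element belongs inside the configuration element of yarn-site.xml on the ResourceManager host. A scheduler change of this kind generally takes effect only after the ResourceManager is restarted.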