In Parts 1 and 2, we covered the basics of YARN resource allocation. In this installment, we’ll provide an overview of cluster scheduling and introduce the Fair Scheduler, one of the scheduler choices available in YARN.
A standalone computer can have several CPU cores, each running a single process, but there can be as many as a few hundred processes running simultaneously. The scheduler is a part of the desktop’s operating system that assigns a process to a CPU core to run for a short period of time.
As described previously in “Part 1: YARN and Cluster Basics”, an application consists of multiple tasks (usually on different hosts) on the cluster. A cluster scheduler essentially has to address:
The ResourceManager (RM) tracks resources on a cluster, and assigns them to applications that need them. The scheduler is that part of the RM that does this matching honoring organizational policies on sharing resources. Please note that:
The Fair Scheduler is a popular choice (recommended by Cloudera) among the schedulers YARN supports. In its simplest form, it shares resources fairly among all jobs running on the cluster. The next few sections explain scheduler internals in the context of Fair Scheduler and elaborate on the commonly used controls the scheduler offers.
Queues are the organizing structure for YARN schedulers, allowing multiple tenants to share the cluster. As applications are submitted to YARN, they are assigned to a queue by the scheduler. The root queue is the parent of all queues. All other queues are each a child of the root queue or another queue (also called hierarchical queues). It is common for queues to correspond to users, departments, or priorities.
Figure 1 presents a basic Fair Scheduler example of fair-scheduler.xml and a graphical representation of each queue’s share of the cluster.
Figure 1: Example part of fair-scheduler.xml and corresponding fair shares of each queue
One level of queues allows sharing the cluster along one dimension (e.g. one queue per team), but it is common to share clusters among multiple dimensions (e.g. per-team and priority). Fair Scheduler allows nesting queues to form a hierarchical queue structure, where each level could correspond to a dimension under its parent queue.
For example, Figure 2 below shows resources shared along team and priority dimensions. The first level corresponds to the team (e.g. root.datascience), and each first-level queue could have a high- and low-priority child queues for jobs from that specific team (e.g. root.datascience.short_jobs and root.datascience.best_effort_jobs).
In the rest of this post and subsequent parts, we refer to queue names in a different font like root or root.sales.northamerica. Where unambiguous, we use the shortest version possible, i.e. northamerica instead of root.sales.northamerica.
In the sales queue, there are two child queues: northamerica and europe. Each has a weight of 30.0, so the fair share for each of the child queues within sales is effectively 50%.
In the marketing queue, there are two child queues of nonequal weight: reports and website. This means that jobs in the reports queue are allocated twice as many resources as jobs in the website queue. However, together, their weight is still governed by the marketing queue’s weight.
Queue weights are used to determine the fair share for a queue. The Fair Scheduler starts from the root queue and looks at the weights of all immediate child queues to determine their fair share. Each child queue’s fair share is further evaluated for its set of child queues.
The marketing queue has a weight of 3.0, the sales queue has a weight of 4.0, and the datascience queue has a weight of 13.0. So, the allocation from the root will be 15% to marketing, 20% to sales, and 65% to datascience.
Of the share that goes to datascience, all of the queue’s allocation goes to the short_jobs queue. If there are no jobs assigned to the short_jobs queue, then the jobs in the best_effort_jobs queue are allocated resources.
Figure 2: Example part of fair-scheduler.xml and corresponding fair shares of hierarchical queues
To get more information about Fair Scheduler, take a look at the online documentation.
Part 4 will cover Fair Scheduler queue properties in greater detail. You can configure these properties to fully customize Fair Scheduler for your needs.
Ray Chiang is a Software Engineer at Cloudera.
Dennis Dawson is a Senior Technical Writer at Cloudera.
This may have been caused by one of the following: