
Setting up a Multi-tenant Cluster for Impala and MapReduce

MapReduce and Impala often work with the same data set and run side-by-side on the same physical hardware. We call clusters with multiple active compute frameworks "multi-tenant" clusters. It is important to control the amount of resources assigned to each compute framework; otherwise, Impala queries and MapReduce jobs may suffer from conflicting resource demands, leading to poor performance for both frameworks. There is no standard resource allocation that fits all requirements. This document illustrates the mechanisms and considerations behind resource management. You should determine what resource allocation your workloads need to meet your production SLAs.

Our resource management controls cover CPU, disk IO, and memory. For simplicity, we assign fraction x of all resources to Impala, and the rest (fraction 1-x) to MapReduce. With this assignment, we roughly expect Impala multi-tenant performance to be fraction x of Impala stand-alone performance, i.e., Impala running by itself on the cluster. Likewise, we roughly expect MapReduce multi-tenant performance to be fraction 1-x of MapReduce stand-alone performance.

We make the following assumptions for the examples in the rest of the document:

  • MapReduce and Impala are the only two active compute frameworks.

  • Each slave node runs a DataNode, a TaskTracker, and an Impala daemon.

  • The cluster uses homogeneous hardware.

  • Each slave node has 60 GB of RAM.

  • Each slave node has 16 logical processors.

The resource management mechanisms are:

  • Memory: Impala Daemon Memory Limit, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

  • CPU: Cgroup CPU Shares, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

  • Disk: Cgroup I/O Weight, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

For instructions on configuring and enabling Cgroups, see Resource Management.

Example: Assign 50% of resources to Impala

We need to set numerical values for the following parameters to assign 50% of the resources to Impala and 50% to MapReduce:

  • Memory: Impala Daemon Memory Limit, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

  • CPU: Cgroup CPU Shares, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

  • Disk: Cgroup I/O Weight, Maximum Number of Simultaneous Map Tasks, Maximum Number of Simultaneous Reduce Tasks.

These parameters are associated with the configurations of the Impala Daemon, DataNode, and TaskTracker roles in Cloudera Manager.

We compute the numerical values using the rationale below:

Memory

We set Impala Daemon Memory Limit to 50% of the effective memory size, which we assume to be (RAM size) / 1.5. The divisor 1.5 is a conservative estimate to adjust for OS accounting overhead. For our numerical example, this works out to 60GB / 1.5 * 50% = 20GB.

We indirectly control MapReduce memory consumption by cutting the maximum number of simultaneous map and reduce tasks to 50%. In Cloudera Manager, the default maximum number of simultaneous map tasks is the number of logical processors (16 in our example), and the default maximum number of simultaneous reduce tasks is half that number. This works out to 16 * 50% = 8 max map tasks and 16 / 2 * 50% = 4 max reduce tasks.

Note that with this change, any per-job tuning related to slot capacity should also be adjusted.
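
The memory arithmetic above can be captured in a short helper. The following is a minimal sketch (Python, for illustration only; not part of Cloudera Manager) that assumes the 1.5 overhead divisor and the Cloudera Manager default slot counts described above:

  def memory_settings(ram_gb, logical_processors, impala_fraction):
      """Memory-related settings for one slave node.

      Assumes the conservative 1.5 divisor for OS overhead and the
      Cloudera Manager default slot counts described above.
      """
      effective_ram_gb = ram_gb / 1.5
      impala_mem_limit_gb = effective_ram_gb * impala_fraction
      # MapReduce keeps the remaining fraction of the default slot counts:
      # map slots default to the number of logical processors, reduce slots to half that.
      mapreduce_fraction = 1.0 - impala_fraction
      max_map_tasks = int(logical_processors * mapreduce_fraction)
      max_reduce_tasks = int(logical_processors / 2 * mapreduce_fraction)
      return impala_mem_limit_gb, max_map_tasks, max_reduce_tasks

  # 60 GB of RAM, 16 logical processors, 50% to Impala -> (20.0, 8, 4)
  print(memory_settings(60, 16, 0.5))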

CPU

The more Cgroup CPU Shares given to a role, the larger its share of the CPU when under contention. Until processes on the host (including both roles managed by Cloudera Manager and other system processes) are contending for all of the CPUs, this will have no effect. When there is contention, those processes with higher CPU shares will be given more CPU time. The effect is linear: a process with 4 CPU shares will be given roughly twice as much CPU time as a process with 2 CPU shares.

In a multi-tenant workload, all three daemons (the Impala daemon, the DataNode, and the TaskTracker) may be consuming CPU at the same time. We assign 50% of the total CPU shares to the Impala daemon, and 25% each to the DataNode and the TaskTracker. This is a very conservative setting: it assumes that DataNode and task computational activity can simultaneously saturate the CPUs, which is unlikely to be true if your MapReduce workload is far below cluster capacity.
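
As a worked illustration of this split, the sketch below (a simple helper, not part of Cloudera Manager) derives the Impala daemon's CPU shares for a given fraction, keeping the DataNode and TaskTracker at 1024 shares each as in the tables under Further Examples:

  def impala_cpu_shares(impala_fraction, datanode_shares=1024, tasktracker_shares=1024):
      """CPU shares for the Impala daemon so that it receives impala_fraction
      of the total shares across the three daemons on a slave node."""
      others = datanode_shares + tasktracker_shares
      return int(round(others * impala_fraction / (1.0 - impala_fraction)))

  for fraction in (0.25, 0.50, 0.75):
      print(fraction, impala_cpu_shares(fraction))   # 683, 2048, 6144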

Disk

The higher the Cgroup I/O Weight, the higher the priority given to I/O requests made by the role when I/O is under contention (whether from roles managed by Cloudera Manager or from other system processes). The effect is much like that of Cgroup CPU Shares. Note that Cgroup I/O Weight prioritizes only read requests; write requests remain unprioritized.

The rationale for the numerical values for Cgroup I/O Weight is identical to that for Cgroup CPU Shares: we ensure the Impala daemon has 50% of the total Cgroup I/O Weight across the Impala daemon, DataNode, and TaskTracker, and the DataNode and TaskTracker have equal weights that sum to the remaining 50%.
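
The same arithmetic applies to the I/O weights, except that the tables under Further Examples never set a weight above 1000; for Impala fractions above 50%, the DataNode and TaskTracker weights are therefore scaled down rather than scaling the Impala weight up. A minimal sketch of that calculation (a helper for illustration, not part of Cloudera Manager):

  def io_weights(impala_fraction, max_weight=1000, default_other=500):
      """(Impala daemon, DataNode, TaskTracker) I/O weights giving the Impala
      daemon impala_fraction of the total weight, never exceeding max_weight."""
      if impala_fraction <= 0.5:
          other = default_other
          impala = int(round(2 * other * impala_fraction / (1.0 - impala_fraction)))
      else:
          impala = max_weight
          other = int(round(impala * (1.0 - impala_fraction) / (2 * impala_fraction)))
      return impala, other, other

  for fraction in (0.25, 0.50, 0.75):
      print(fraction, io_weights(fraction))   # (333, 500, 500), (1000, 500, 500), (1000, 167, 167)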

Follow the steps below to set the numerical values:

Impala daemon

  1. Go to the Impala service configuration edit screen.
  2. Select Impala Daemon (base), Resource Management tab.
  3. Set Impala Daemon Memory Limit to 20GB.
  4. Set Cgroup CPU Shares to 2048.
  5. Set Cgroup I/O Weight to 1000.
  6. Click the Save Changes button.
  7. Repeat Steps 2-6 for all Impala Daemon role configuration groups, e.g., Impala Daemon (1).

DataNode

  1. Go to the HDFS service configuration edit screen.
  2. Select DataNode (base), Resource Management tab.
  3. Set Cgroup CPU Shares to 1024.
  4. Set Cgroup I/O Weight to 500.
  5. Click the Save Changes button.
  6. Repeat Steps 2-5 for all DataNode role configuration groups, e.g., DataNode (1).

TaskTracker

  1. Go to the MapReduce service configuration edit screen.
  2. Select TaskTracker (base), Resource Management tab.
  3. Set Cgroup CPU Shares to 1024.
  4. Set Cgroup I/O Weight to 500.
  5. Select TaskTracker (base), Performance tab.
  6. Set Maximum Number of Simultaneous Map Tasks to 8.
  7. Set Maximum Number of Simultaneous Reduce Tasks to 4.
  8. Click the Save Changes button.
  9. Repeat Steps 2-8 for all TaskTracker role configuration groups, e.g., TaskTracker (1).

Restart all services for these changes to take effect.
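
After the restart, you can spot-check on a slave node that the Cgroup settings took effect by reading the control files for a role's process. The sketch below is illustrative only: it assumes cgroups v1 mounted under /sys/fs/cgroup (the mount point varies by distribution) and takes a daemon's PID as its argument.

  import sys

  def cgroup_settings(pid):
      """Resolve a process's cgroup paths from /proc/<pid>/cgroup (cgroups v1)
      and read its effective cpu.shares and blkio.weight."""
      paths = {}
      with open("/proc/%s/cgroup" % pid) as f:
          for line in f:
              _, subsystems, path = line.strip().split(":", 2)
              for subsystem in subsystems.split(","):
                  paths[subsystem] = path
      settings = {}
      for subsystem, control_file in (("cpu", "cpu.shares"), ("blkio", "blkio.weight")):
          if subsystem in paths:
              full_path = "/sys/fs/cgroup/%s%s/%s" % (subsystem, paths[subsystem], control_file)
              with open(full_path) as f:
                  settings[control_file] = f.read().strip()
      return settings

  if __name__ == "__main__":
      # Example (hypothetical file name): python check_cgroups.py $(pgrep -f impalad | head -1)
      print(cgroup_settings(sys.argv[1]))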

Further Examples

For convenience, we provide here numerical tables for configurations that assign 25%, 50%, and 75% of all resources to Impala, while MapReduce receives the rest.

Again, there is no standard resource allocation that fits all requirements. You should determine what resource allocation your workloads need to meet your production SLAs. For example, since Impala is often memory-bound and MapReduce often I/O-bound, you might consider increasing Impala's share of memory and MapReduce's share of I/O from the starting points below.

Impala gets 25%, MapReduce gets 75%

                              Impala daemon             DataNode  TaskTracker
  Cgroup CPU Shares           683                       1024      1024
  Cgroup I/O Weight           333                       500       500
  Impala Daemon Memory Limit  (RAM size) / 1.5 * 0.25   n/a       n/a
  Maximum Map Tasks           n/a                       n/a       (# logical processors) * 0.75
  Maximum Reduce Tasks        n/a                       n/a       (# logical processors) / 2 * 0.75

Impala gets 50%, MapReduce gets 50%

                              Impala daemon             DataNode  TaskTracker
  Cgroup CPU Shares           2048                      1024      1024
  Cgroup I/O Weight           1000                      500       500
  Impala Daemon Memory Limit  (RAM size) / 1.5 * 0.5    n/a       n/a
  Maximum Map Tasks           n/a                       n/a       (# logical processors) * 0.5
  Maximum Reduce Tasks        n/a                       n/a       (# logical processors) / 2 * 0.5

Impala gets 75%, MapReduce gets 25%

                              Impala daemon             DataNode  TaskTracker
  Cgroup CPU Shares           6144                      1024      1024
  Cgroup I/O Weight           1000                      167       167
  Impala Daemon Memory Limit  (RAM size) / 1.5 * 0.75   n/a       n/a
  Maximum Map Tasks           n/a                       n/a       (# logical processors) * 0.25
  Maximum Reduce Tasks        n/a                       n/a       (# logical processors) / 2 * 0.25

More About Cgroups

See Resource Management for more information about Cgroups, including Linux distribution support and known issues.