When Cloudera Manager configures a service, it assigns one or more functions (called roles in Cloudera Manager) that the service requires to the host machines in your cluster. The role determines which Hadoop daemons run on a given host. For example, when Cloudera Manager configures an HDFS service instance, it configures one host to run the NameNode role, another host to run the Secondary NameNode role, another host to run the Balancer role, and some or all of the remaining hosts to run DataNode roles.
The configuration settings for a particular role type are organized in role groups. A role group includes a set of configuration properties for a specific role type, as well as a list of role instances associated with that role group. Cloudera Manager automatically creates a default role group for each role type.
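As a rough illustration of that structure, the sketch below models a role group as a small Python data class. It is not Cloudera Manager's API: the `RoleGroup` class, the `default_role_group` helper, the `java_heap_bytes` property, and the host names are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RoleGroup:
    """Illustrative model of a role group (hypothetical, not the Cloudera Manager API)."""
    name: str
    role_type: str
    # Configuration properties shared by every instance in the group.
    config: dict = field(default_factory=dict)
    # Hosts whose role instances belong to this group.
    instances: list = field(default_factory=list)


def default_role_group(role_type: str) -> RoleGroup:
    """Build the default group for a role type, mirroring the automatic default."""
    return RoleGroup(name=f"{role_type} (Default)", role_type=role_type)


datanodes = default_role_group("DataNode")
datanodes.config["java_heap_bytes"] = 1024 ** 3  # hypothetical property name
datanodes.instances += ["host2.example.com", "host3.example.com"]
```

Every instance listed in `datanodes.instances` would pick up the same `config` values, which is the point of grouping instances by role type.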
For role types that allow multiple instances on multiple hosts, such as DataNodes, TaskTrackers, RegionServers, and many others, you can create multiple role groups so that one set of role instances uses different configuration settings than another set of instances of the same role type. In fact, upon initial cluster setup, if you are installing on identical hosts with limited memory, Cloudera Manager typically creates two role groups automatically for each slave role: one group for the role instances on hosts with only other slave roles, and a separate group for the instance running on the host that is also hosting master roles.
The HDFS service is an example of this: Cloudera Manager typically creates one role group (DataNode (Default)) for the DataNode role instances running on the slave hosts, and another group (HDFS-1-DATANODE-1) for the DataNode instance running on the host that is also running master roles such as the NameNode, JobTracker, HBase Master, and so on. Typically, the configurations for those two classes of hosts differ in settings such as JVM memory.
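The resulting split can be sketched in a few lines of Python. This only illustrates the outcome described above and does not reproduce the logic Cloudera Manager actually uses: the `split_datanode_groups` function, the `MASTER_ROLES` set, and the host names are assumptions made for the example. The two group names are the ones mentioned in the text.

```python
# Roles treated as "master" roles in this sketch (assumed set, taken from the text).
MASTER_ROLES = {"NameNode", "JobTracker", "HBase Master"}


def split_datanode_groups(roles_by_host):
    """Assign each DataNode instance to one of two role groups by host type."""
    groups = {"DataNode (Default)": [], "HDFS-1-DATANODE-1": []}
    for host, roles in sorted(roles_by_host.items()):
        if "DataNode" not in roles:
            continue  # this host runs no DataNode instance
        # A DataNode sharing its host with a master role goes in the separate group.
        shares_with_master = bool(MASTER_ROLES & set(roles))
        key = "HDFS-1-DATANODE-1" if shares_with_master else "DataNode (Default)"
        groups[key].append(host)
    return groups


# Hypothetical three-host cluster.
cluster = {
    "host1": ["NameNode", "JobTracker", "DataNode"],
    "host2": ["DataNode", "TaskTracker"],
    "host3": ["DataNode", "TaskTracker"],
}
groups = split_datanode_groups(cluster)
```

Here `host1` lands in HDFS-1-DATANODE-1 because it also runs master roles, so its DataNode can be given a smaller JVM heap than the DataNodes on `host2` and `host3`.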