Adding Role Instances

After creating a service using one of the wizards, you can add a role instance to that service. For example, after initial installation in which you added the HDFS service, you can add a DataNode to a host where one was not previously running.

In a CDH4 cluster, some services provide roles that are not available with CDH3. For example, HDFS in CDH4 supports HttpFS, so that role is available as part of the HDFS service.

CDH4 HDFS also now provides a Failover Controller role, which is added to the HDFS service as a companion to each NameNode when you enable Automatic Failover after enabling High Availability. It is recommended that you let Cloudera Manager add this role as appropriate, rather than adding it manually.

There is also a new role called Gateway available in both CDH3 and CDH4 clusters for the HDFS, MapReduce, and HBase services (and for YARN in CDH4). You can add a Gateway role to a host that does not otherwise have a CDH service installed; this enables Cloudera Manager to install and manage client configurations on that host. This is a convenient way to manage configurations on your CDH clients. There is no process associated with a Gateway role, and its status is always Stopped.

To add a role instance:

  1. Click the Services tab.
  2. Click the link for the service for which you want to add a role instance. For example, click the HDFS service link if you want to add a DataNode role instance for that HDFS service.
  3. Click the Instances tab.
  4. Click the Add button.
  5. Follow the instructions in the wizard to add the role instance. The wizard lists the existing roles on each host and recommends configuration settings, such as data directory paths and heap sizes, depending on the roles. If the new roles are assigned to the same host as roles of another service, Cloudera Manager recommends configuration changes so that the heap allocations of all the roles on the host can be accommodated. You can change some settings, such as data directory paths, before continuing. If you click Continue after making changes, the new roles are created with your configuration changes applied. If you click Skip, the new roles are created with the recommended changes. If necessary, you can reconfigure the new roles later by navigating to the Configurations page of each role or of the service that these roles belong to.
  6. The wizard finishes by performing any actions necessary to prepare the cluster for the new role instances. For example, new DataNodes are added to the NameNode's dfs_hosts_allow.txt file. The new role instance is configured with the Base role group for its role type, even if there are multiple role groups for the role type. If you want to use a different role group, you can go to the Role Groups page under the Configuration tab for the service, and move the role instance to a different role group. See Managing Role Groups for more information.
  7. The new role instances are not started automatically. You can start them on the service's Instances page.

Adding ZooKeeper Roles

If you add ZooKeeper nodes to an existing ZooKeeper service, you must initialize the data directories for the new nodes (role instances) before you restart the ZooKeeper service.

  1. Add new ZooKeeper role instances as described in the steps above.
  2. Go to the Instances tab for the ZooKeeper service. Your newly added roles should show their status as Stopped.
  3. From the Actions menu at the top of the page, select Initialize....
  4. Confirm that you want to perform this action. The dialog informs you that the action cannot be performed on your previously existing ZooKeeper nodes.
  5. When this action has completed, you can then restart the ZooKeeper service. This will start the new nodes as well as update the configuration for the existing nodes.

When you start the ZooKeeper service after you have added new nodes, the original node will have the datastore, but the datastores of the new nodes will be empty. Therefore, you must ensure that the original node is included when the new quorum is started up. If the new nodes are able to form a quorum without the original node, the ensemble will have an empty datastore. You can avoid this by starting up just the original node plus one of the new nodes and allowing those two to form a quorum, which results in a quorum with the datastore from the original node. You can then add the other new nodes.
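
The risk described above can be checked with simple arithmetic. This sketch assumes ZooKeeper's default majority quorum, in which more than half of the configured servers must agree. The function names are hypothetical and are not part of ZooKeeper or Cloudera Manager.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    return ensemble_size // 2 + 1


def new_nodes_can_exclude_original(original: int, new: int) -> bool:
    """True if the new (empty) nodes alone reach a majority of the grown ensemble."""
    return new >= quorum_size(original + new)


# (original nodes, newly added nodes)
for original, new in [(1, 2), (1, 4), (3, 2)]:
    total = original + new
    risky = new_nodes_can_exclude_original(original, new)
    print(f"{original} original + {new} new: quorum={quorum_size(total)}, "
          f"new nodes alone can form a quorum: {risky}")
```

Growing from one node to three is a risky case: the two new nodes are already a majority of three, so they could form a quorum with an empty datastore if the original node is slow to join.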