Configuring init to Start Hadoop System Services

init(8) starts some daemons when the system is booted. Depending on the distribution, init executes scripts from either the /etc/init.d directory or the /etc/rc2.d directory. The CDH packages link the files in init.d and rc2.d so that modifying one set of files automatically updates the other.

To start system services at boot time and on restarts, enable their init scripts on the systems on which the services will run, using the appropriate tool:

  • chkconfig is included in the RHEL and CentOS distributions. Debian and Ubuntu users can install the chkconfig package.
  • update-rc.d is included in the Debian and Ubuntu distributions.
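
As a rough sketch of choosing between the two tools, the shell functions below check which one is available and print the matching enable command without running it. The function names detect_init_tool and enable_cmd are made up for this illustration and are not part of CDH; the sketch also assumes the tool is on the PATH of the invoking user, which may not hold for non-root users on every distribution.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only (not part of CDH).

# Print the name of the init-script tool found on this host.
# Assumes the tool is on the PATH of the invoking user.
detect_init_tool() {
    if command -v chkconfig >/dev/null 2>&1; then
        echo chkconfig
    elif command -v update-rc.d >/dev/null 2>&1; then
        echo update-rc.d
    else
        echo none
    fi
}

# Print, but do not run, the command that enables a service at boot.
#   $1 = tool name, $2 = init script (service) name
enable_cmd() {
    case "$1" in
        chkconfig)   echo "sudo chkconfig $2 on" ;;
        update-rc.d) echo "sudo update-rc.d $2 defaults" ;;
        *)           echo "no supported init tool found" >&2; return 1 ;;
    esac
}

# Show the command this host would use for the DataNode service.
enable_cmd "$(detect_init_tool)" hadoop-hdfs-datanode || true
```

Printing the command rather than executing it lets you review the result before making any change to the boot configuration.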

Configuring init to Start Core Hadoop System Services in an MRv1 Cluster

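The chkconfig commands to use are:

Where                                 Command
On the NameNode                       $ sudo chkconfig hadoop-hdfs-namenode on
On the JobTracker                     $ sudo chkconfig hadoop-0.20-mapreduce-jobtracker on
On the Secondary NameNode (if used)   $ sudo chkconfig hadoop-hdfs-secondarynamenode on
On each TaskTracker                   $ sudo chkconfig hadoop-0.20-mapreduce-tasktracker on
On each DataNode                      $ sudo chkconfig hadoop-hdfs-datanode on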

The update-rc.d commands to use on Ubuntu and Debian systems are:

Where                                 Command
On the NameNode                       $ sudo update-rc.d hadoop-hdfs-namenode defaults
On the JobTracker                     $ sudo update-rc.d hadoop-0.20-mapreduce-jobtracker defaults
On the Secondary NameNode (if used)   $ sudo update-rc.d hadoop-hdfs-secondarynamenode defaults
On each TaskTracker                   $ sudo update-rc.d hadoop-0.20-mapreduce-tasktracker defaults
On each DataNode                      $ sudo update-rc.d hadoop-hdfs-datanode defaults
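
Because one host often carries more than one role, it can help to derive the commands from a list of roles. The following is a minimal sketch: the role keywords (namenode, jobtracker, and so on) and the function names are invented for this example, while the service names are the ones listed above. It only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: map an MRv1 role keyword to its CDH init script name.
# The role keywords are made up for this sketch; the service names come
# from the commands listed above.
mrv1_service() {
    case "$1" in
        namenode)          echo hadoop-hdfs-namenode ;;
        secondarynamenode) echo hadoop-hdfs-secondarynamenode ;;
        datanode)          echo hadoop-hdfs-datanode ;;
        jobtracker)        echo hadoop-0.20-mapreduce-jobtracker ;;
        tasktracker)       echo hadoop-0.20-mapreduce-tasktracker ;;
        *)                 echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print the update-rc.d command for every role passed as an argument.
# Stops with a non-zero status at the first unknown role.
print_mrv1_commands() {
    for role in "$@"; do
        svc=$(mrv1_service "$role") || return 1
        echo "sudo update-rc.d $svc defaults"
    done
}

# Example: a worker host that runs both a DataNode and a TaskTracker.
print_mrv1_commands datanode tasktracker
```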

Configuring init to Start Core Hadoop System Services in a YARN Cluster

The chkconfig commands to use are:

Where                                 Command
On the NameNode                       $ sudo chkconfig hadoop-hdfs-namenode on
On the ResourceManager                $ sudo chkconfig hadoop-yarn-resourcemanager on
On the Secondary NameNode (if used)   $ sudo chkconfig hadoop-hdfs-secondarynamenode on
On each NodeManager                   $ sudo chkconfig hadoop-yarn-nodemanager on
On each DataNode                      $ sudo chkconfig hadoop-hdfs-datanode on
On the MapReduce JobHistory node      $ sudo chkconfig hadoop-mapreduce-historyserver on
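
To confirm that a service was enabled, chkconfig --list followed by the service name reports the on/off state for each runlevel. The sketch below checks one runlevel in such a line. The helper name runlevel_on is hypothetical, and the sample line is illustrative rather than captured from a real host; the exact spacing of real output may differ.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a `chkconfig --list <service>` line
# shows the given runlevel as "on".
# Assumed line format: "<service>  0:off 1:off 2:on 3:on 4:on 5:on 6:off"
#   $1 = output line, $2 = runlevel number
runlevel_on() {
    line=$1
    level=$2
    case "$line" in
        *"${level}:on"*) return 0 ;;
        *)               return 1 ;;
    esac
}

# Illustrative sample line, not captured from a real host.
sample="hadoop-yarn-nodemanager 0:off 1:off 2:on 3:on 4:on 5:on 6:off"

if runlevel_on "$sample" 3; then
    echo "enabled at runlevel 3"
else
    echo "not enabled at runlevel 3"
fi
```

On a real host, the first argument would be the output of chkconfig --list for the service in question instead of the sample string.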

The update-rc.d commands to use on Ubuntu and Debian systems are:

Where                                 Command
On the NameNode                       $ sudo update-rc.d hadoop-hdfs-namenode defaults
On the ResourceManager                $ sudo update-rc.d hadoop-yarn-resourcemanager defaults
On the Secondary NameNode (if used)   $ sudo update-rc.d hadoop-hdfs-secondarynamenode defaults
On each NodeManager                   $ sudo update-rc.d hadoop-yarn-nodemanager defaults
On each DataNode                      $ sudo update-rc.d hadoop-hdfs-datanode defaults
On the MapReduce JobHistory node      $ sudo update-rc.d hadoop-mapreduce-historyserver defaults
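
update-rc.d works by creating start and kill links in the runlevel directories, so one way to check the result is to look for a start link. This is a minimal sketch, assuming the conventional S<two digits><name> link naming; the helper name has_start_link is hypothetical, and the demonstration uses a scratch directory rather than the real /etc/rc2.d.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a start link (S<NN><service>) for the
# service exists in the given runlevel directory, for example /etc/rc2.d.
#   $1 = runlevel directory, $2 = service name
has_start_link() {
    rc_dir=$1
    service=$2
    for link in "$rc_dir"/S[0-9][0-9]"$service"; do
        # An unmatched glob stays literal, so confirm the path really exists.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

# Demonstration in a scratch directory that stands in for /etc/rc2.d.
demo_dir=$(mktemp -d)
touch "$demo_dir/S20hadoop-yarn-nodemanager"
if has_start_link "$demo_dir" hadoop-yarn-nodemanager; then
    echo "start link found"
fi
rm -rf "$demo_dir"
```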

Configuring init to Start Non-core Hadoop System Services

Non-core Hadoop daemons can also be configured to start at init time using the chkconfig or update-rc.d command.

The chkconfig commands are:

Component        Server                  Command
Hue              Hue server              $ sudo chkconfig hue on
Oozie            Oozie server            $ sudo chkconfig oozie on
HBase            HBase master            $ sudo chkconfig hbase-master on
                 Each HBase slave        $ sudo chkconfig hbase-regionserver on
Hive Metastore   Hive Metastore server   $ sudo chkconfig hive-metastore on
HiveServer2      HiveServer2             $ sudo chkconfig hive-server2 on
ZooKeeper        ZooKeeper server        $ sudo chkconfig zookeeper-server on
HttpFS           HttpFS server           $ sudo chkconfig hadoop-httpfs on

The update-rc.d commands to use on Ubuntu and Debian systems are:

Component        Server                  Command
Hue              Hue server              $ sudo update-rc.d hue defaults
Oozie            Oozie server            $ sudo update-rc.d oozie defaults
HBase            HBase master            $ sudo update-rc.d hbase-master defaults
                 Each HBase slave        $ sudo update-rc.d hbase-regionserver defaults
Hive Metastore   Hive Metastore server   $ sudo update-rc.d hive-metastore defaults
HiveServer2      HiveServer2             $ sudo update-rc.d hive-server2 defaults
ZooKeeper        ZooKeeper server        $ sudo update-rc.d zookeeper-server defaults
HttpFS           HttpFS server           $ sudo update-rc.d hadoop-httpfs defaults
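
A given host usually has only some of these components installed. The sketch below prints an update-rc.d command only for services whose init script is present, so the same list can be reused across hosts. The function name print_installed is hypothetical, and the check assumes that an installed component places an executable script with the service name in /etc/init.d.

```shell
#!/bin/sh
# Hypothetical helper: print an update-rc.d command for each listed service
# whose init script is present and executable in the given directory
# (normally /etc/init.d). Nothing is executed.
#   $1 = init script directory, remaining arguments = service names
print_installed() {
    init_dir=$1
    shift
    for svc in "$@"; do
        if [ -x "$init_dir/$svc" ]; then
            echo "sudo update-rc.d $svc defaults"
        fi
    done
}

# Service names taken from the commands listed above.
print_installed /etc/init.d hue oozie hbase-master hbase-regionserver \
    hive-metastore hive-server2 zookeeper-server hadoop-httpfs
```

On a host with none of these components installed, the sketch prints nothing.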