Step 4: Automating Failover with Corosync and Pacemaker

Corosync and Pacemaker are popular high-availability utilities that allow you to configure Cloudera Manager to fail over automatically.

This document describes one way to set up clustering using these tools. Actual setup can be done in several ways, depending on the network configuration of your environment.

About Corosync and Pacemaker
Setting up Cloudera Manager Server
Setting up the Cloudera Manager Service

Prerequisites:

Install Pacemaker and Corosync on CMS1, MGMT1, CMS2, and MGMT2, using the correct versions for your Linux distribution:
Note: The versions referred to for setting up automatic failover in this document are Pacemaker 1.1.11 and Corosync 1.4.7. See http://clusterlabs.org/wiki/Install to determine what works best for your Linux distribution.
RHEL/CentOS:
```
$ yum install pacemaker corosync
```
Ubuntu:
```
$ apt-get install pacemaker corosync
```
SUSE:
```
$ zypper install pacemaker corosync
```
Make sure that the crm tool exists on all of the hosts. This procedure uses the crm tool, which works with the Pacemaker configuration. If the tool was not installed along with Pacemaker (verify this by running which crm), download and install it for your distribution using the instructions at http://crmsh.github.io/installation.
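
The check above can be repeated on every host with a small helper. The following is a minimal sketch; the function name missing_tools is hypothetical, and it relies only on the POSIX command -v builtin:

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is an illustrative helper, not part of Pacemaker or crmsh.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, running missing_tools crm corosync on a host prints nothing when both commands are available, and prints the name of each command that is missing otherwise.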

About Corosync and Pacemaker

By default, Corosync and Pacemaker are not autostarted as part of the boot sequence. Cloudera recommends leaving this as is. If the machine crashes and restarts, first confirm that failover was successful and determine the cause of the restart, and only then start these processes manually to restore high availability.
- If the /etc/default/corosync file exists, make sure that START is set to yes in that file:
```
START=yes
```
- Make sure that Corosync is not set to start automatically, by running the following command:
  RHEL/CentOS/SUSE:
```
$ chkconfig corosync off
```
  Ubuntu:
```
$ update-rc.d -f corosync remove
```
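
The START=yes condition from the first item can be checked with a short script. This is only a sketch: the function name corosync_start_ok is hypothetical, and it treats a missing file as acceptable because the text applies the requirement only when the file exists:

```shell
#!/bin/sh
# Succeed if the defaults file does not exist (nothing to change) or if it
# contains an uncommented START=yes line.
# corosync_start_ok is an illustrative name; the default path is the one
# mentioned in the text.
corosync_start_ok() {
    file="${1:-/etc/default/corosync}"
    [ -e "$file" ] || return 0
    grep -Eq '^[[:space:]]*START=yes[[:space:]]*$' "$file"
}
```

Calling corosync_start_ok with no argument checks /etc/default/corosync; a nonzero exit status means the file exists but START is not set to yes.
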
Note which version of Corosync is installed. The contents of the Corosync configuration file (corosync.conf) that you edit vary based on the version suitable for your distribution. Sample configurations are supplied in this document and are labeled with the Corosync version.
This document does not demonstrate configuring Corosync with authentication (with secauth set to on). The Corosync website demonstrates a mechanism to encrypt traffic using symmetric keys.
Firewall configuration:
Corosync uses UDP transport on ports 5404 and 5405, and these ports must be open for both inbound and outbound traffic on all hosts. If you are using iptables, run commands similar to the following:
```
$ sudo iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
$ sudo iptables -I OUTPUT -m state --state NEW -p udp -m multiport --sports 5404,5405 -j ACCEPT
```

Setting up Cloudera Manager Server

Set up a Corosync cluster over unicast, between CMS1 and CMS2, and make sure that the hosts can “cluster” together. Then, set up Pacemaker to register Cloudera Manager Server as a resource that it monitors and to fail over to the secondary when needed.

Setting up Corosync

Edit the /etc/corosync/corosync.conf file on CMS1 and replace the entire contents with the following text (use the correct version for your environment):

Corosync version 1.x:

compatibility: whitetank
totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: CMS1
                }
                member {
                        memberaddr: CMS2
                }
                ringnumber: 0
                bindnetaddr: CMS1
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  1
        #
}

Corosync version 2.x:

totem {
version: 2
secauth: off
cluster_name: cmf
transport: udpu
}

nodelist {
  node {
        ring0_addr: CMS1
        nodeid: 1
       }
  node {
        ring0_addr: CMS2
        nodeid: 2
       }
}

quorum {
provider: corosync_votequorum
two_node: 1
}

Edit the /etc/corosync/corosync.conf file on CMS2, and replace the entire contents with the following text (use the correct version for your environment):

Corosync version 1.x:

compatibility: whitetank
totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: CMS1
                }
                member {
                        memberaddr: CMS2
                }
                ringnumber: 0
                bindnetaddr: CMS2
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  1
        #
}

Corosync version 2.x:

totem {
version: 2
secauth: off
cluster_name: cmf
transport: udpu
}

nodelist {
  node {
        ring0_addr: CMS1
        nodeid: 1
       }
  node {
        ring0_addr: CMS2
        nodeid: 2
       }
}

quorum {
provider: corosync_votequorum
two_node: 1
}
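
In the Corosync 2.x samples above, the file is identical on CMS1 and CMS2, so it can be generated rather than typed twice. The following is a minimal sketch under that assumption; the function name corosync2_conf is hypothetical, and the settings it emits are the ones shown in the samples:

```shell
#!/bin/sh
# Emit a Corosync 2.x configuration for a two-node unicast cluster.
# Usage: corosync2_conf CLUSTER_NAME NODE1 NODE2
# corosync2_conf is an illustrative helper; the settings mirror the
# Corosync 2.x sample in this document.
corosync2_conf() {
    cat <<EOF
totem {
        version: 2
        secauth: off
        cluster_name: $1
        transport: udpu
}

nodelist {
        node {
                ring0_addr: $2
                nodeid: 1
        }
        node {
                ring0_addr: $3
                nodeid: 2
        }
}

quorum {
        provider: corosync_votequorum
        two_node: 1
}
EOF
}
```

For example, corosync2_conf cmf CMS1 CMS2 prints a configuration equivalent to the sample, which you could review and then write to /etc/corosync/corosync.conf on both hosts.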

Restart Corosync on CMS1 and CMS2 so that the new configuration takes effect:
```
$ service corosync restart
```

Setting up Pacemaker

You use Pacemaker to set up Cloudera Manager Server as a cluster resource.

See the Pacemaker configuration reference at http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ for more details about Pacemaker options.

The following steps demonstrate one way, recommended by Cloudera, to configure Pacemaker for simple use:

Disable autostart for Cloudera Manager Server (because you manage its lifecycle through Pacemaker) on both CMS1 and CMS2:
RHEL/CentOS/SUSE:
```
$ chkconfig cloudera-scm-server off
```
Ubuntu:
```
$ update-rc.d -f cloudera-scm-server remove
```
Make sure that Pacemaker has been started on both CMS1 and CMS2:
```
$ /etc/init.d/pacemaker start
```

Make sure that crm reports two nodes in the cluster:

# crm status
Last updated: Wed Mar  4 18:55:27 2015
Last change: Wed Mar  4 18:38:40 2015 via crmd on CMS1
Stack: corosync
Current DC: CMS1 (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
0 Resources configured
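
If you want to script this check, the node count can be extracted from the status text. This is a sketch only: the function name configured_nodes is hypothetical, and it assumes the "N Nodes configured" line format shown in the sample output, which may differ between Pacemaker versions:

```shell
#!/bin/sh
# Read `crm status` output on stdin and print the configured node count.
# configured_nodes is an illustrative name. The match is case-insensitive
# on the word "Nodes" and tolerates trailing text after "configured".
configured_nodes() {
    awk 'tolower($2) == "nodes" && $3 ~ /^configured/ { print $1; exit }'
}
```

A typical use would be crm status | configured_nodes, which should print 2 for the cluster described here.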

Change the Pacemaker cluster configuration (on either CMS1 or CMS2):
```
$ crm configure property no-quorum-policy=ignore
$ crm configure property stonith-enabled=false
$ crm configure rsc_defaults resource-stickiness=100
```
These commands do the following:
- Disable quorum checks. (Because there are only two nodes in this cluster, quorum cannot be established.)
- Disable STONITH explicitly (see Enabling STONITH (Shoot the other node in the head)).
- Reduce the likelihood of the resource being moved among hosts on restarts.
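
To confirm that the three settings took effect, you can inspect the output of crm configure show. The following sketch checks that text for each setting; the function name simple_ha_settings_present is hypothetical, and because the exact layout and quoting of the output vary between crmsh versions, it strips double quotes before matching:

```shell
#!/bin/sh
# Read `crm configure show` output on stdin and succeed only if all three
# settings from this step are present.
# simple_ha_settings_present is an illustrative name.
simple_ha_settings_present() {
    # Some crmsh versions quote values, so drop double quotes first.
    cfg=$(tr -d '"')
    for setting in no-quorum-policy=ignore stonith-enabled=false resource-stickiness=100; do
        printf '%s\n' "$cfg" | grep -Fq -- "$setting" || return 1
    done
}
```

For example, crm configure show | simple_ha_settings_present returns a zero exit status when all three settings are found.
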
Add Cloudera Manager Server as an LSB-managed resource (either on CMS1 or CMS2):
```
$ crm configure primitive cloudera-scm-server lsb:cloudera-scm-server
```

Verify that the primitive has been picked up by Pacemaker:

$ crm_mon

For example:

$ crm_mon
Last updated: Tue Jan 27 15:01:35 2015
Last change: Mon Jan 27 14:10:11 2015
Stack: classic openais (with plugin)
Current DC: CMS1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ CMS1 CMS2 ]
cloudera-scm-server (lsb:cloudera-scm-server): Started CMS1
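
The host that currently runs a resource can also be read from this output in a script. The sketch below is an illustration, not a Pacemaker feature: the function name resource_node is hypothetical, and it assumes the resource name is followed by the word Started and then the host name, as in the example output. It splits the input on whitespace, so it does not depend on how the lines are wrapped:

```shell
#!/bin/sh
# Read crm_mon output on stdin and print the host on which the named
# resource is reported as Started. Prints nothing if it is not found.
# Usage: resource_node RESOURCE_NAME
# resource_node is an illustrative name.
resource_node() {
    # Put every whitespace-separated token on its own line, then look for
    # the resource name followed later by "Started" and the host name.
    tr -s ' \t\n' '\n' | awk -v res="$1" '
        $0 == res { seen = 1; next }
        seen && $0 == "Started" { if ((getline) > 0) print; exit }
    '
}
```

Because crm_mon without options keeps running and refreshing, a script would use the one-shot mode, for example crm_mon -1 | resource_node cloudera-scm-server.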

At this point, Pacemaker manages the status of the cloudera-scm-server service on hosts CMS1 and CMS2, ensuring that only one instance is running at a time.

Testing Failover with Pacemaker

Test Pacemaker failover by running the following command to move the cloudera-scm-server resource to CMS2:

$ crm resource move cloudera-scm-server CMS2

Test the resource move by connecting to a shell on CMS2 and verifying that the cloudera-scm-server process is now active on that host. It usually takes a few minutes for the new services to come up on the new host.
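
Because the service can take a few minutes to come up, a verification script needs to poll rather than check once. The following is a minimal sketch of such a retry loop; the function name wait_until and its arguments are hypothetical:

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
# wait_until is an illustrative helper, not a Pacemaker command.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$((attempts - 1))
        # Sleep only if another attempt follows.
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example, on CMS2, wait_until 30 10 service cloudera-scm-server status would poll the service status every 10 seconds for up to roughly five minutes.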

Enabling STONITH (Shoot the other node in the head)

The following link provides an explanation of the problem of fencing and ensuring (within reasonable limits) that only one host is running a shared resource at a time: http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Clusters_from_Scratch/index.html#idm140603947390416

As noted in that link, you can use several methods (such as IPMI) to achieve reasonable guarantees on remote host shutdown. Cloudera recommends enabling STONITH, based on the hardware configuration in your environment.

Setting up the Cloudera Manager Service

Setting up Corosync

Edit the /etc/corosync/corosync.conf file on MGMT1 and replace the entire contents with the contents below; make sure to use the correct section for your version of Corosync:

Corosync version 1.x:

compatibility: whitetank
totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: MGMT1
                }
                member {
                        memberaddr: MGMT2
                }
                ringnumber: 0
                bindnetaddr: MGMT1
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  1
        #
}

Corosync version 2.x:

totem {
version: 2
secauth: off
cluster_name: mgmt
transport: udpu
}

nodelist {
  node {
        ring0_addr: MGMT1
        nodeid: 1
       }
  node {
        ring0_addr: MGMT2
        nodeid: 2
       }
}

quorum {
provider: corosync_votequorum
two_node: 1
}

Edit the /etc/corosync/corosync.conf file on MGMT2 and replace the entire contents with the contents below; make sure to use the correct section for your version of Corosync:

Corosync version 1.x:

compatibility: whitetank
totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: MGMT1
                }
                member {
                        memberaddr: MGMT2
                }
                ringnumber: 0
                bindnetaddr: MGMT2
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  1
        #
}

Corosync version 2.x:

totem {
version: 2
secauth: off
cluster_name: mgmt
transport: udpu
}

nodelist {
  node {
        ring0_addr: MGMT1
        nodeid: 1
       }
  node {
        ring0_addr: MGMT2
        nodeid: 2
       }
}

quorum {
provider: corosync_votequorum
two_node: 1
}

Restart Corosync on MGMT1 and MGMT2 for the new configuration to take effect:
```
$ service corosync restart
```

Test whether Corosync has set up a cluster by using the corosync-cmapctl command (Corosync 2.x) or the corosync-objctl command (Corosync 1.x). You should see two members with status joined:

$ corosync-objctl | grep "member"
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(MGMT1)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(MGMT2)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
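
The same check can be scripted by counting the members that report status joined. This is a sketch under the assumption that the output lines have the format shown above; the function name joined_members is hypothetical:

```shell
#!/bin/sh
# Read corosync-objctl or corosync-cmapctl output on stdin and print how
# many members report status "joined".
# joined_members is an illustrative name.
joined_members() {
    grep -c 'members\.[0-9]*\.status (str) = joined'
}
```

For example, corosync-objctl | joined_members should print 2 once both MGMT1 and MGMT2 have joined the cluster.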

Setting up Pacemaker

Use Pacemaker to set up Cloudera Management Service as a cluster resource.

See the Pacemaker configuration reference at http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ for more information about Pacemaker options.

Because the lifecycle of Cloudera Management Service is managed through the Cloudera Manager Agent, you configure the Cloudera Manager Agent to be highly available.

Follow these steps to configure Pacemaker, recommended by Cloudera for simple use:

Disable autostart for the Cloudera Manager Agent (because Pacemaker manages its lifecycle) on both MGMT1 and MGMT2:
RHEL/CentOS/SUSE:
```
$ chkconfig cloudera-scm-agent off
```
Ubuntu:
```
$ update-rc.d -f cloudera-scm-agent remove
```
Make sure that Pacemaker is started on both MGMT1 and MGMT2:
```
$ /etc/init.d/pacemaker start
```

Make sure that the crm command reports two nodes in the cluster; you can run this command on either host:

# crm status
Last updated: Wed Mar  4 18:55:27 2015
Last change: Wed Mar  4 18:38:40 2015 via crmd on MGMT1
Stack: corosync
Current DC: MGMT1 (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
0 Resources configured

Change the Pacemaker cluster configuration on either MGMT1 or MGMT2:
```
$ crm configure property no-quorum-policy=ignore
$ crm configure property stonith-enabled=false
$ crm configure rsc_defaults resource-stickiness=100
```
As with Cloudera Manager Server Pacemaker configuration, this step disables quorum checks, disables STONITH explicitly, and reduces the likelihood of resources being moved between hosts.

Create an Open Cluster Framework (OCF) provider on both MGMT1 and MGMT2 for Cloudera Manager Agent for use with Pacemaker:

Create an OCF directory for creating OCF resources for Cloudera Manager:
```
$ mkdir -p /usr/lib/ocf/resource.d/cm
```

Create a Cloudera Manager Agent OCF wrapper as a file at /usr/lib/ocf/resource.d/cm/agent, with the following content, on both MGMT1 and MGMT2:

RHEL-compatible 7 and higher:

#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
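OCF_SUCCESS=0
OCF_ERROR=1
# Standard OCF code for an unsupported action; used by the case statement below.
OCF_ERR_UNIMPLEMENTED=3
OCF_STOPPED=7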
#######################################################################

meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>

<longdesc lang="en">
 This OCF agent handles simple monitoring, start, stop of the Cloudera
 Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>

<parameters />

<actions>
<action name="start"        timeout="20" />
<action name="stop"         timeout="20" />
<action name="monitor"      timeout="20" interval="10" depth="0"/>
<action name="meta-data"    timeout="5" />
</actions>
</resource-agent>
END
}

#######################################################################

agent_usage() {
cat <<END
 usage: $0 {start|stop|monitor|meta-data}
 Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager Agent and managed processes lifecycle for use with Pacemaker.
END
}

agent_start() {
    service cloudera-scm-agent start
    if [ $? =  0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_stop() {
    service cloudera-scm-agent next_stop_hard
    service cloudera-scm-agent stop
    if [ $? =  0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_monitor() {
        # Monitor _MUST!_ differentiate correctly between running
        # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
        # That is THREE states, not just yes/no.
        service cloudera-scm-agent status
        if [ $? = 0 ]; then
            return $OCF_SUCCESS
        fi
        return $OCF_STOPPED
}


case $__OCF_ACTION in
meta-data)      meta_data
                exit $OCF_SUCCESS
                ;;
start)          agent_start;;
stop)           agent_stop;;
monitor)        agent_monitor;;
usage|help)     agent_usage
                exit $OCF_SUCCESS
                ;;
*)              agent_usage
                exit $OCF_ERR_UNIMPLEMENTED
                ;;
esac
rc=$?
exit $rc

All other Linux distributions:

#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
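OCF_SUCCESS=0
OCF_ERROR=1
# Standard OCF code for an unsupported action; used by the case statement below.
OCF_ERR_UNIMPLEMENTED=3
OCF_STOPPED=7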
#######################################################################

meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>

<longdesc lang="en">
 This OCF agent handles simple monitoring, start, stop of the Cloudera
 Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>

<parameters />

<actions>
<action name="start"        timeout="20" />
<action name="stop"         timeout="20" />
<action name="monitor"      timeout="20" interval="10" depth="0"/>
<action name="meta-data"    timeout="5" />
</actions>
</resource-agent>
END
}

#######################################################################

agent_usage() {
cat <<END
 usage: $0 {start|stop|monitor|meta-data}
 Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager Agent and managed processes lifecycle for use with Pacemaker.
END
}

agent_start() {
    service cloudera-scm-agent start
    if [ $? =  0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_stop() {
    service cloudera-scm-agent hard_stop_confirmed
    if [ $? =  0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_monitor() {
        # Monitor _MUST!_ differentiate correctly between running
        # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
        # That is THREE states, not just yes/no.
        service cloudera-scm-agent status
        if [ $? = 0 ]; then
            return $OCF_SUCCESS
        fi
        return $OCF_STOPPED
}


case $__OCF_ACTION in
meta-data)      meta_data
                exit $OCF_SUCCESS
                ;;
start)          agent_start;;
stop)           agent_stop;;
monitor)        agent_monitor;;
usage|help)     agent_usage
                exit $OCF_SUCCESS
                ;;
*)              agent_usage
                exit $OCF_ERR_UNIMPLEMENTED
                ;;
esac
rc=$?
exit $rc

Run chmod on that file to make it executable:

$ chmod 770 /usr/lib/ocf/resource.d/cm/agent

Test the OCF resource script:
```
$ /usr/lib/ocf/resource.d/cm/agent monitor
```
This script should return the current running status of the SCM agent.
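
The wrapper reports its result through the exit status, not through text of its own. The sketch below maps the status of a monitor call to a readable label, using the values defined at the top of the wrapper script (OCF_SUCCESS=0, OCF_ERROR=1, OCF_STOPPED=7); the function name ocf_label is hypothetical:

```shell
#!/bin/sh
# Translate the exit status of the wrapper's monitor action into a label.
# The numeric values are the ones defined at the top of the wrapper script.
# ocf_label is an illustrative name.
ocf_label() {
    case "$1" in
        0) echo "running" ;;
        1) echo "error" ;;
        7) echo "stopped" ;;
        *) echo "unexpected status $1" ;;
    esac
}
```

For example, after running the monitor command above, ocf_label $? prints running or stopped, depending on the state of the agent.
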
Add Cloudera Manager Agent as an OCF-managed resource (either on MGMT1 or MGMT2):
```
$ crm configure primitive cloudera-scm-agent ocf:cm:agent
```

Verify that the primitive has been picked up by Pacemaker by running the following command:

$ crm_mon

For example:

$ crm_mon
Last updated: Tue Jan 27 15:01:35 2015
Last change: Mon Jan 27 14:10:11 2015
Stack: classic openais (with plugin)
Current DC: MGMT1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ MGMT1 MGMT2 ]
cloudera-scm-agent (ocf:cm:agent): Started MGMT2

Pacemaker starts managing the status of the cloudera-scm-agent service on hosts MGMT1 and MGMT2, ensuring that only one instance is running at a time.

Testing Failover with Pacemaker

Test that Pacemaker can move resources by running the following command, which moves the cloudera-scm-agent resource to MGMT2:

$ crm resource move cloudera-scm-agent MGMT2

Test the resource move by connecting to a shell on MGMT2 and verifying that the cloudera-scm-agent and the associated Cloudera Management Services processes are now active on that host. It usually takes a few minutes for the new services to come up on the new host.
