Upgrading to CDH 5.1 Using Packages

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

If you installed or upgraded to CDH 5 using packages, you can upgrade to CDH 5.1 using either packages or parcels. Using parcels is recommended, because the upgrade wizard for parcels handles the upgrade almost completely automatically.

Before You Begin

  • Read the CDH 5 Release Notes.
  • Read the Cloudera Manager 5 Release Notes.
  • Ensure Java 1.7 is installed across the cluster. For installation instructions and recommendations, see Java Development Kit Installation, and make sure you have read Known Issues and Workarounds in Cloudera Manager 5 before you proceed with the upgrade.
  • Ensure that the Cloudera Manager minor version is equal to or greater than the CDH minor version. For example:
    Target CDH Version    Minimum Cloudera Manager Version
    5.0.5                 5.0.x
    5.1.4                 5.1.x
    5.4.1                 5.4.x
  • Make sure there are no Oozie workflows in RUNNING or SUSPENDED status; otherwise the Oozie database upgrade will fail and you will have to reinstall the earlier CDH release to complete or kill those running workflows.
  • Whenever upgrading Impala, whether in CDH or a standalone parcel or package, check your SQL against the newest reserved words listed in incompatible changes. If upgrading across multiple versions or in case of any problems, check against the full list of Impala keywords.
  • Run the Host Inspector and fix every issue.
  • If using security, run the Security Inspector.
  • Run hdfs fsck / and hdfs dfsadmin -report and fix every issue.
  • Run hbase hbck.
  • Review the upgrade procedure and reserve a maintenance window with enough time to perform all steps. For production clusters, Cloudera recommends a maintenance window of up to a full day, depending on the number of hosts, your experience with Hadoop and Linux, and the hardware you are using.
  • To avoid unnecessary alerts during the upgrade, you can enable maintenance mode on your cluster before you start. Maintenance mode stops email alerts and SNMP traps from being sent, but does not stop checks and configuration validations from being made. Be sure to exit maintenance mode when you have finished the upgrade to re-enable Cloudera Manager alerts.
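
The version rule in the list above can be checked with a few lines of shell. This is a minimal sketch, not a Cloudera tool: the function name cm_supports_cdh is hypothetical, version strings are assumed to have the form major.minor.patch, and a higher major version is assumed to satisfy the rule as well.

```shell
#!/bin/sh
# Succeed when the Cloudera Manager major.minor version is equal to or
# greater than the target CDH major.minor version. Patch level is ignored.
cm_supports_cdh() {
    cm_version=$1     # for example 5.1.0
    cdh_version=$2    # for example 5.1.4

    cm_major=${cm_version%%.*}
    cm_rest=${cm_version#*.}
    cm_minor=${cm_rest%%.*}

    cdh_major=${cdh_version%%.*}
    cdh_rest=${cdh_version#*.}
    cdh_minor=${cdh_rest%%.*}

    # Assumption: a higher major version also satisfies the rule.
    if [ "$cm_major" -gt "$cdh_major" ]; then
        return 0
    fi
    [ "$cm_major" -eq "$cdh_major" ] && [ "$cm_minor" -ge "$cdh_minor" ]
}

# Example values consistent with the rule above.
if cm_supports_cdh 5.1.0 5.1.4; then
    echo "Cloudera Manager 5.1.0 can manage CDH 5.1.4"
fi
if ! cm_supports_cdh 5.0.5 5.1.4; then
    echo "Cloudera Manager 5.0.5 is too old for CDH 5.1.4"
fi
```

Run the check with the Cloudera Manager version you have installed and the CDH version you plan to install before you download any packages.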

Upgrade Unmanaged Components

Upgrade unmanaged components before proceeding to upgrade managed components. Components that you might have installed that are not managed by Cloudera Manager include:
  • Mahout
  • Pig
  • Whirr

For information on upgrading these unmanaged components, see Upgrading Mahout, Upgrading Pig, and Upgrading Whirr.

Stop Cluster Services

  1. On the Home > Status tab, open the menu to the right of the cluster name and select Stop.
  2. Click Stop in the confirmation screen. The Command Details window shows the progress of stopping services.

    When All services successfully stopped appears, the task is complete and you can close the Command Details window.
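
The same stop operation can also be scripted against the Cloudera Manager REST API. The sketch below is an illustration rather than part of the documented procedure. It assumes API version v7, a hypothetical Cloudera Manager host cm.example.com on port 7180, and a hypothetical cluster name Cluster1; the function names are also hypothetical. By default it only prints the request it would send.

```shell
#!/bin/sh
# Hypothetical defaults; replace with the values for your deployment.
CM_HOST=${CM_HOST:-cm.example.com}
CM_PORT=${CM_PORT:-7180}
API_VERSION=${API_VERSION:-v7}   # assumed; GET /api/version reports the highest supported version
CLUSTER=${CLUSTER:-Cluster1}

# Build the URL of a cluster-level command such as "stop".
cm_command_url() {
    printf 'http://%s:%s/api/%s/clusters/%s/commands/%s\n' \
        "$CM_HOST" "$CM_PORT" "$API_VERSION" "$CLUSTER" "$1"
}

# Print the request by default; set DRY_RUN=0 to send it.
# CM_USER and CM_PASSWORD must be set before sending.
cm_cluster_command() {
    url=$(cm_command_url "$1")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "curl -s -X POST -u USER:PASSWORD $url"
    else
        curl -s -X POST -u "$CM_USER:$CM_PASSWORD" "$url"
    fi
}

# Example (dry run):
#   cm_cluster_command stop
```

The Admin Console remains the documented path. A script like this is mainly useful when the same upgrade is rehearsed on several clusters.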

Back up Metastore Databases

Back up the Sqoop metastore database.
  1. For each affected service:
    1. If not already stopped, stop the service.
    2. Back up the database. See Backing Up Databases.
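
If the metastore is kept in an embedded, file-based store rather than an external database server, a backup can be as simple as archiving its directory while the service is stopped. The sketch below assumes that layout. The function name backup_dir and both example paths are hypothetical. For MySQL, PostgreSQL, or Oracle, follow Backing Up Databases instead.

```shell
#!/bin/sh
# Archive a directory into a timestamped tarball and print the archive path.
backup_dir() {
    src=$1      # directory to back up
    dest=$2     # directory that receives the archive
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/$(basename "$src")-$stamp.tar.gz"

    mkdir -p "$dest" || return 1
    # -C keeps only the last path component inside the archive.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (hypothetical paths; run while the service is stopped):
#   backup_dir /var/lib/sqoop2 /var/backups/cdh-upgrade
```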

Upgrade Managed Components

  1. Download and save the repo file.
    • On Red Hat-compatible systems:

      Click the entry in the table below that matches your Red Hat or CentOS system, go to the repo file for your system and save it in the /etc/yum.repos.d/ directory.

      For OS Version               Click this Link
      Red Hat/CentOS/Oracle 5      Red Hat/CentOS/Oracle 5 link
      Red Hat/CentOS 6 (64-bit)    Red Hat/CentOS 6 link

    • On SLES systems:
      1. Run the following command:
        $ sudo zypper addrepo -f http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/cloudera-cdh5.repo
      2. Update your system package index by running:
        $ sudo zypper refresh 
    • On Ubuntu and Debian systems:

      Create a new file /etc/apt/sources.list.d/cloudera.list with the following contents:

      • For Ubuntu systems:
        deb [arch=amd64] http://archive.cloudera.com/cdh5/ <OS-release-arch> <RELEASE>-cdh5 contrib
        deb-src http://archive.cloudera.com/cdh5/ <OS-release-arch> <RELEASE>-cdh5 contrib
      • For Debian systems:
        deb http://archive.cloudera.com/cdh5/ <OS-release-arch> <RELEASE>-cdh5 contrib
        deb-src http://archive.cloudera.com/cdh5/ <OS-release-arch> <RELEASE>-cdh5 contrib

        where: <OS-release-arch> is debian/wheezy/amd64/cdh or ubuntu/precise/amd64/cdh, and <RELEASE> is the name of your distribution, which you can find by running lsb_release -c.

  2. Edit the repo file to point to the release you want to install or upgrade to.
    • On Red Hat-compatible systems:

      Open the repo file you have just saved and change the 5 at the end of the line that begins baseurl= to the version number you want.

      For example, if you have saved the file for Red Hat 6, it will look like this when you open it for editing:

      [cloudera-cdh5]
      name=Cloudera's Distribution for Hadoop, Version 5
      baseurl=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
      gpgkey = http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera    
      gpgcheck = 1

      For example, if you want to install CDH 5.1.0, change baseurl=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/ to

      baseurl=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.1.0/

      In this example, the resulting file should look like this:

      [cloudera-cdh5]
      name=Cloudera's Distribution for Hadoop, Version 5
      baseurl=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.1.0/
      gpgkey = http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera    
      gpgcheck = 1
    • On SLES systems:

      Open the repo file that you have just added to your system and change the 5 at the end of the line that begins baseurl= to the version number you want.

      The file should look like this when you open it for editing:

      [cloudera-cdh5]
      name=Cloudera's Distribution for Hadoop, Version 5
      baseurl=http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/5/
      gpgkey = http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera    
      gpgcheck = 1

      For example, if you want to install CDH 5.1.0, change baseurl=http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/5/ to

      baseurl=http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/5.1.0/

      In this example, the resulting file should look like this:

      [cloudera-cdh5]
      name=Cloudera's Distribution for Hadoop, Version 5
      baseurl=http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/5.1.0/
      gpgkey = http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera    
      gpgcheck = 1
    • On Ubuntu and Debian systems:

      Replace -cdh5 near the end of each line (before contrib) with the CDH release you need to install. Here are examples using CDH 5.1.0:

      For 64-bit Ubuntu Precise:

      deb [arch=amd64] http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5.1.0 contrib
      deb-src http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5.1.0 contrib

      For Debian Wheezy:

      deb http://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh wheezy-cdh5.1.0 contrib
      deb-src http://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh wheezy-cdh5.1.0 contrib
  3. (Optional) Add a repository key:
    • Red Hat compatible
      • Red Hat/CentOS/Oracle 5
        $ sudo rpm --import http://archive.cloudera.com/cdh5/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera
      • Red Hat/CentOS/Oracle 6
        $ sudo rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
    • SLES
      $ sudo rpm --import http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera  
    • Ubuntu and Debian
      • Ubuntu Precise
        $ curl -s http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
      • Debian Wheezy
        $ curl -s http://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh/archive.key | sudo apt-key add -
  4. Install the CDH packages:
    • Red Hat compatible
      $ sudo yum clean all
      $ sudo yum install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
    • SLES
      $ sudo zypper clean --all
      $ sudo zypper install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
    • Ubuntu and Debian
      $ sudo apt-get update
      $ sudo apt-get install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
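
The baseurl edit described in step 2 for Red Hat-compatible and SLES systems can be applied with sed instead of a text editor, which helps when many hosts need the same change. This is a minimal sketch: the function name pin_cdh_release is hypothetical, the repo file is assumed to be named cloudera-cdh5.repo, and the baseurl line is assumed to end in /cdh/5/ as in the examples above. The original file is kept with a .bak suffix.

```shell
#!/bin/sh
# Point the baseurl line of a CDH 5 repo file at a specific release by
# rewriting the trailing "/cdh/5/" to "/cdh/<release>/".
pin_cdh_release() {
    repo_file=$1
    release=$2     # for example 5.1.0
    # Only the line that begins with "baseurl" is changed; the gpgkey
    # line also contains "/cdh/" and must stay as it is.
    sed -i.bak "/^baseurl *=/ s|/cdh/5/\{0,1\}[[:space:]]*\$|/cdh/$release/|" "$repo_file"
}

# Example (run as root; the file name is an assumption):
#   pin_cdh_release /etc/yum.repos.d/cloudera-cdh5.repo 5.1.0
```

After the edit, confirm that the baseurl line shows the release you intend to install before you run the package manager.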

Update Symlinks for the Newly Installed Components

Restart all the Cloudera Manager Agents to force an update of the symlinks to point to the newly installed components on each host:
$ sudo service cloudera-scm-agent restart
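
Because the restart has to happen on every host, it is convenient to loop over a host list. The sketch below assumes SSH access with passwordless sudo on each host and a hypothetical file with one hostname per line. The function name restart_agents is also hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Restart the Cloudera Manager Agent on every host listed in a file.
# Blank lines and lines starting with "#" are skipped.
restart_agents() {
    hosts_file=$1
    while IFS= read -r host; do
        case $host in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $host sudo service cloudera-scm-agent restart"
        else
            # -n keeps ssh from consuming the rest of the hosts file on stdin.
            ssh -n "$host" sudo service cloudera-scm-agent restart
        fi
    done < "$hosts_file"
}

# Example (hypothetical file name):
#   restart_agents cluster-hosts.txt             # dry run
#   DRY_RUN=0 restart_agents cluster-hosts.txt   # restart the agents
```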

Run the Upgrade Wizard

  1. Log in to the Cloudera Manager Admin Console.
  2. From the Home > Status tab, open the menu next to the cluster name and select Upgrade Cluster. The Upgrade Wizard starts.
  3. In the Choose Method field, click the Use Packages radio button.
  4. In the Choose CDH Version (Packages) field, specify the CDH version of the packages you have installed on your cluster. Click Continue.
  5. Read the notices for steps you must complete before upgrading, click the Yes, I ... checkboxes after completing the steps, and click Continue.
  6. Cloudera Manager checks that hosts have the correct software installed. If the packages have not been installed, a warning displays to that effect. Install the packages and click Check Again. When there are no errors, click Continue.
  7. The Host Inspector runs and displays the CDH version on the hosts. Click Continue.
  8. Choose the type of upgrade and restart:
    • Cloudera Manager upgrade - Cloudera Manager performs all service upgrades and restarts the cluster.
      1. Click Continue. The Command Progress screen displays the result of the commands run by the wizard as it shuts down all services, activates the new parcel, upgrades services as necessary, deploys client configuration files, and restarts services. If any of the steps fails or you click the Abort button, the Retry button at the top right is enabled.

        You can click Retry to retry the step and continue the wizard, or click the Cloudera Manager logo to return to the Home > Status tab and manually perform the failed step and all following steps.
      2. Click Continue. The wizard reports the result of the upgrade.
    • Manual upgrade - Select the Let me upgrade the cluster checkbox. Cloudera Manager configures the cluster to the specified CDH version but performs no upgrades or service restarts. Manually doing the upgrade is difficult and is for advanced users only.
      1. Click Continue. Cloudera Manager displays links to documentation describing the required upgrade steps.
  9. Click Finish to return to the Home page.

Perform Manual Upgrade or Recover from Failed Steps

The actions performed by the upgrade wizard are listed in Upgrade Wizard Actions. If you chose manual upgrade or any of the steps in the Command Progress screen fails, complete the steps as described in that section before proceeding.

Upgrade Wizard Actions

Perform the steps in this section only if you chose a manual upgrade, or if the upgrade wizard reports a failure and you choose not to retry.

Upgrade the Oozie ShareLib

  1. Go to the Oozie service.
  2. Select Actions > Start and click Start to confirm.
  3. Select Actions > Install Oozie ShareLib and click Install Oozie ShareLib to confirm.

Upgrade Sqoop

  1. Go to the Sqoop service.
  2. Select Actions > Stop and click Stop to confirm.
  3. Select Actions > Upgrade Sqoop and click Upgrade Sqoop to confirm.

Upgrade Spark

  1. Go to the Spark service.
  2. Select Actions > Stop and click Stop to confirm.
  3. Select Actions > Install Spark JAR and click Install Spark JAR to confirm.
  4. Select Actions > Create Spark History Log Dir and click Create Spark History Log Dir to confirm.

Restart All Services

  1. Restart the cluster.

Deploy Client Configuration Files

  1. On the Home page, open the menu to the right of the cluster name and select Deploy Client Configuration.
  2. Click the Deploy Client Configuration button in the confirmation pop-up that appears.