Upgrading CDH 4 Using Packages
If you originally used Cloudera Manager to install your CDH service using packages, you can upgrade to a higher minor version of CDH 4 using either packages or parcels. Parcels are the recommended way to upgrade, because the upgrade wizard provided for parcels handles the upgrade process almost completely automatically.
- Impala - If you have CDH 4.1.x with Cloudera Impala installed, and you plan to upgrade to CDH 4.2 or later, you must also upgrade Impala to version 1.2.1 or later. With a parcel installation you can download and activate both parcels before you proceed to restart the cluster. You will need to change the remote parcel repo URL to point to the location of the released product as described in the upgrade procedures referenced below.
- HBase - In CDH 4.1.x, an HBase table could have an owner that had full administrative permissions on the table. The owner construct was removed as of CDH 4.2.0, and the code now relies exclusively on entries in the ACL table. Since table owners do not have an entry in this table, their permissions are removed on upgrade from CDH 4.1.x to CDH 4.2.0 or later. If you are upgrading from CDH 4.1.x to CDH 4.2 or later, and using HBase, you must add permissions for HBase owner users to the HBase ACL table before you perform the upgrade. See the Known Issues in the CDH 4 Release Notes, specifically the item "Must explicitly add permissions for owner users before upgrading from 4.1.x" in the Known Issues in Apache HBase section.
- Hive - Hive has undergone major version changes from CDH 4.0 to 4.1 and between CDH 4.1 and 4.2. (CDH 4.0 had Hive 0.8.0, CDH 4.1 used Hive 0.9.0, and 4.2 or later has 0.10.0). This requires you to manually back up and upgrade the Hive metastore database when upgrading between major Hive versions. If you are upgrading from a version of CDH 4 prior to CDH 4.2 to a newer CDH 4 version, you must follow the steps for upgrading the metastore included in the upgrade procedures referenced below.
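For the HBase item above, owner permissions can be added with the HBase shell's grant command before the upgrade. The following is a minimal sketch rather than a Cloudera-supplied procedure: the helper name owner_grants, its "owner table" input format, and the example user and table names are hypothetical, and 'RWXCA' is assumed to be the permission set that matches the former owner's full administrative rights.

```shell
#!/bin/sh
# Sketch: print one HBase shell "grant" statement per "owner table" pair
# read from standard input. Nothing is sent to HBase by this function.

owner_grants() {
    while read -r owner table; do
        # Skip blank lines and lines starting with "#"
        case "$owner" in ''|'#'*) continue ;; esac
        # RWXCA = read, write, execute, create, admin (assumed owner rights)
        printf "grant '%s', 'RWXCA', '%s'\n" "$owner" "$table"
    done
}

# Example (hypothetical names); on a real cluster, pipe into "hbase shell":
#   printf 'alice orders\nbob events\n' | owner_grants | hbase shell
```

Printing the statements first, instead of running them directly, leaves room to review the list of owners and tables before anything is written to the ACL table.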
To upgrade your version of CDH using packages, the steps are as follows.
- Before You Begin
- Upgrade Unmanaged Components
- Upgrade Managed Components
- Upgrade the Hive Metastore Database
- Upgrade the Oozie ShareLib
- Upgrade Sqoop
- Restart the Services
- Configure Cluster CDH Version for Package Installs
- Deploy Client Configuration Files
Before You Begin
- Before upgrading, be sure to read about the latest Incompatible Changes and Known Issues and Workarounds in the CDH 4 Release Notes.
- Read the Cloudera Manager 5 Release Notes.
- Ensure that the Cloudera Manager minor version is equal to or greater than the CDH minor version.
  Target CDH Version    Minimum Cloudera Manager Version
  5.0.5                 5.0.x
  5.1.4                 5.1.x
  5.2.2                 5.2.x
- Run the Host Inspector and fix every issue.
- If using security, run the Security Inspector.
- Check your SQL against new Impala keywords whenever upgrading Impala, whether Impala is in CDH or a standalone parcel or package.
- Run hdfs fsck / and hdfs dfsadmin -report and fix every issue.
- Run hbase hbck.
- Review the upgrade procedure and reserve a maintenance window with enough time allotted to perform all steps. For production clusters, Cloudera recommends allocating up to a full day maintenance window to perform the upgrade, depending on the number of hosts, the amount of experience you have with Hadoop and Linux, and the particular hardware you are using.
- To avoid lots of alerts during the upgrade process, you can enable maintenance mode on your cluster before you start the upgrade. This will stop email alerts and SNMP traps from being sent, but will not stop checks and configuration validations from being made. Be sure to exit maintenance mode when you have finished the upgrade in order to re-enable Cloudera Manager alerts.
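Two of the checks in this list lend themselves to scripting. The sketch below is illustrative only: the function names cm_covers_cdh and fsck_healthy are hypothetical, and the second function assumes that hdfs fsck reports a healthy filesystem with a line containing "is HEALTHY".

```shell
#!/bin/sh
# Sketch of two pre-upgrade checks. Neither function contacts the cluster;
# both work on values or text that you pass in.

# cm_covers_cdh CM_VERSION CDH_VERSION
# Succeeds when the Cloudera Manager major.minor version is equal to or
# greater than the CDH major.minor version (for example 5.1.x covers 5.1.4).
cm_covers_cdh() {
    cm_major=${1%%.*}
    cdh_major=${2%%.*}
    cm_minor=$(echo "$1" | cut -d. -f2)
    cdh_minor=$(echo "$2" | cut -d. -f2)
    [ "$cm_major" -gt "$cdh_major" ] ||
        { [ "$cm_major" -eq "$cdh_major" ] && [ "$cm_minor" -ge "$cdh_minor" ]; }
}

# fsck_healthy
# Reads "hdfs fsck /" output on standard input and succeeds only if the
# report contains "is HEALTHY" (assumed wording of the summary line).
fsck_healthy() {
    grep -q "is HEALTHY"
}

# Example usage on a cluster host:
#   cm_covers_cdh 5.1.2 5.1.4 && echo "Cloudera Manager version is sufficient"
#   hdfs fsck / | fsck_healthy || echo "Fix HDFS issues before upgrading"
```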
Upgrade Unmanaged Components
Upgrading unmanaged components is a process that is separate from upgrading managed components. Upgrade the unmanaged components before proceeding to upgrade managed components. Components that you might have installed that are not managed by Cloudera Manager include:
For information on upgrading these unmanaged components, see CDH 4 Installation Guide.
Upgrade Managed Components
Use one of the following strategies to upgrade CDH 4:
- Use your operating system's package management tools to update all packages to the latest version using standard repositories. This approach works well because it minimizes the amount of configuration required and uses the simplest commands. Be aware that this can take a considerable amount of time if you have not upgraded the system recently. To update all packages on your system, use the following command:
Operating System      Command
Red Hat               $ sudo yum update
SLES                  $ sudo zypper up
Ubuntu or Debian      $ sudo apt-get upgrade
- Use the cloudera.com repository that is added during a typical installation, updating only Cloudera components. This limits the scope of the update, so the process takes less time. However, this process will not work if you created and used a custom repository. To install the new version, upgrade from Cloudera's repository by adding an entry to your operating system's package management configuration file. The repository location varies by operating system:
Operating System    Repository Entry
Red Hat             http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/
SLES                http://archive.cloudera.com/cdh4/sles/11/x86_64/cdh/4/
Debian Squeeze      [arch=amd64] http://archive.cloudera.com/cdh4/debian/squeeze squeeze-cdh4 contrib
Ubuntu Lucid        [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/lucid/amd64/cdh lucid-cdh4 contrib
Ubuntu Precise      [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh precise-cdh4 contrib
For example, under Red Hat, to upgrade from Cloudera's repository you can run commands such as the following on the CDH host to update only CDH:
$ sudo yum clean all
$ sudo yum update 'cloudera-*'
Note
- cloudera-cdh4 is the name of the repository on your system; the name is usually in square brackets on the first line of the repo file, in this example /etc/yum.repos.d/cloudera-cdh4.repo:
[chris@ca727 yum.repos.d]$ more cloudera-cdh4.repo
[cloudera-cdh4]
...
- yum clean all cleans up yum's cache directories, ensuring that you download and install the latest versions of the packages.
- If your system is not up to date, and any underlying system components need to be upgraded before this yum update can succeed, yum will tell you what those are.
On a SLES system, use commands like this to clean cached repository information and then update only the CDH components. For example:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cdh4/sles/11/x86_64/cdh/4
To verify the URL, open the Cloudera repo file in /etc/zypp/repos.d on your system (for example /etc/zypp/repos.d/cloudera-cdh4.repo) and look at the line beginning baseurl=. Use that URL in your sudo zypper up -r command.
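The URL lookup can also be scripted. This is a minimal sketch that assumes the repository URL sits on a line of the form baseurl=URL, the usual layout of a zypper .repo file; the function name repo_baseurl is hypothetical.

```shell
#!/bin/sh
# repo_baseurl FILE
# Print the value of the first "baseurl=" line found in FILE.
repo_baseurl() {
    sed -n 's/^baseurl=//p' "$1" | head -n 1
}

# Example usage on a SLES host:
#   url=$(repo_baseurl /etc/zypp/repos.d/cloudera-cdh4.repo)
#   sudo zypper clean --all
#   sudo zypper up -r "$url"
```

Reading the URL from the repo file avoids typing it by hand, so the update targets the same repository that the host is already configured to use.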
On a Debian/Ubuntu system, use commands like this to clean cached repository information and then update only the CDH components. First:
$ sudo apt-get clean
After cleaning the cache, use the upgrade command that matches your release:
Ubuntu Precise
$ sudo apt-get upgrade -t precise-cdh4
Ubuntu Lucid
$ sudo apt-get upgrade -t lucid-cdh4
Debian Squeeze
$ sudo apt-get upgrade -t squeeze-cdh4
- Use a custom repository. This process can be more complicated, but enables
updating CDH components for hosts that are not connected to the
Internet. You can create your own repository, as described in
Understanding Custom Installation Solutions. Creating
your own repository is necessary if you are upgrading a cluster that
does not have access to the Internet.
If you used a custom repository for your current installation and now want to upgrade using a custom repository, the exact steps vary. In general, begin by updating the existing custom repository with the installation files for the version you want to install. You can do this in a variety of ways; for example, you might use wget to copy the necessary installation files. Once the installation files have been updated, use the custom repository you established for the initial installation to update CDH.
RHEL
Ensure you have a custom repo file that is configured to use your internal repository. For example, you might have a custom repo file in /etc/yum.repos.d/ called cdh_custom.repo in which you specified a local repository. In such a case, you might use the following commands:
$ sudo yum clean all
$ sudo yum update 'cloudera-*'
SLES
Use commands such as the following to clean cached repository information and then update only the CDH components:
$ sudo zypper clean --all
$ sudo zypper up -r http://internalserver.example.com/path_to_cdh_repo
Ubuntu or Debian
Use a command that targets the upgrade of your CDH distribution using the custom repository specified in your apt configuration files. These files are typically either the /etc/apt/sources.list file or files in the /etc/apt/sources.list.d/ directory. Information about your custom repository must be included in the repo files. The general form of entries in Debian/Ubuntu is:
deb http://server.example.com/directory/ dist-name pool
For example, the entry for the default repo is:
deb http://us.archive.ubuntu.com/ubuntu/ precise universe
On a Debian/Ubuntu system, use commands such as the following to clean cached repository information and then update only the CDH components:
$ sudo apt-get clean
$ sudo apt-get upgrade -t your_cdh_repo
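The per-platform commands in this section can be collected into one helper for use in a runbook. The sketch below only prints the commands shown above and does not run them; the function name cdh_update_cmds and the family labels redhat, sles, and debian are hypothetical.

```shell
#!/bin/sh
# cdh_update_cmds FAMILY [TARGET]
# Print (do not run) the cache-clean and update commands from this section.
#   FAMILY: redhat | sles | debian   (labels chosen for this sketch)
#   TARGET: repository URL for zypper, or target release for apt-get;
#           ignored for redhat
cdh_update_cmds() {
    case "$1" in
        redhat)
            echo "sudo yum clean all"
            echo "sudo yum update 'cloudera-*'"
            ;;
        sles)
            echo "sudo zypper clean --all"
            echo "sudo zypper up -r $2"
            ;;
        debian)
            echo "sudo apt-get clean"
            echo "sudo apt-get upgrade -t $2"
            ;;
        *)
            echo "unsupported OS family: $1" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   cdh_update_cmds debian precise-cdh4
#   cdh_update_cmds sles http://internalserver.example.com/path_to_cdh_repo
```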
Upgrade the Hive Metastore Database
Required if you are upgrading from an earlier version of CDH 4 to CDH 4.2 or later.
- Go to the Hive service.
- Select Stop and confirm when prompted.
- Select Upgrade Hive Metastore Database Schema and confirm when prompted.
- If you have multiple instances of Hive, perform the upgrade on each metastore database.
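The introduction to this topic calls for backing up the metastore database before its schema is upgraded. The following is a minimal sketch of such a backup, assuming the metastore is stored in MySQL; the function names and the database and user names in the usage comment are hypothetical, and other databases need their own dump tools.

```shell
#!/bin/sh
# backup_file DB STAMP
# Print the name of the dump file for database DB and date stamp STAMP.
backup_file() {
    echo "$1-backup-$2.sql"
}

# backup_metastore DB USER
# Dump DB to a date-stamped file with mysqldump and print the file name.
backup_metastore() {
    out=$(backup_file "$1" "$(date +%Y%m%d)")
    # -p prompts for the password; --single-transaction gives a consistent
    # snapshot for InnoDB tables without locking them
    mysqldump -u "$2" -p --single-transaction "$1" > "$out" && echo "$out"
}

# Example usage (hypothetical database and user names), run with the
# Hive service stopped:
#   backup_metastore metastore hiveuser
```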
Upgrade the Oozie ShareLib
- Go to the Oozie service.
- Select Start and confirm when prompted.
- Select Install Oozie ShareLib and confirm when prompted.
Upgrade Sqoop
- Go to the Sqoop service.
- Select Stop and confirm when prompted.
- Select Upgrade Sqoop and confirm when prompted.
Restart the Services
- On the Home page, open the menu to the right of the cluster name and select Restart.
- Click the Restart button in the confirmation pop-up that appears. The Command Details window shows the progress of starting services.
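If you script the upgrade, the restart can also be requested through the Cloudera Manager REST API instead of the Home page. The sketch below only builds the request URL. It assumes the API is reachable over plain HTTP on port 7180 and defaults to API version v1; the function name, host, cluster name, and credentials are hypothetical, so check the API documentation for your Cloudera Manager release before relying on it.

```shell
#!/bin/sh
# cm_restart_url HOST CLUSTER [API_VERSION]
# Print the REST endpoint that restarts CLUSTER. Spaces in the cluster
# name are percent-encoded; other special characters are not handled.
cm_restart_url() {
    cluster=$(printf '%s' "$2" | sed 's/ /%20/g')
    echo "http://$1:7180/api/${3:-v1}/clusters/$cluster/commands/restart"
}

# Example usage (hypothetical host, cluster name, and credentials):
#   curl -X POST -u admin:admin "$(cm_restart_url cm.example.com 'Cluster 1')"
```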
Configure Cluster CDH Version for Package Installs
If you have installed CDH as a package, after an installation or upgrade, make sure that the cluster CDH version matches the package CDH version, using the procedure in Configuring the CDH Version of a Cluster. If the cluster CDH version does not match the package CDH version, Cloudera Manager incorrectly enables and disables service features based on the cluster's configured CDH version.
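To see which CDH version the installed packages report, you can read it from the hadoop version command and compare it with the version configured for the cluster. This is a sketch that assumes the first line of the output has the form "Hadoop 2.0.0-cdh4.2.0"; the function name is hypothetical.

```shell
#!/bin/sh
# cdh_version_from_hadoop
# Read "hadoop version" output on standard input and print the CDH version
# embedded in the first line, for example 4.2.0 from "Hadoop 2.0.0-cdh4.2.0".
cdh_version_from_hadoop() {
    sed -n '1s/.*cdh\([0-9][0-9.]*\).*/\1/p'
}

# Example usage on a cluster host:
#   hadoop version | cdh_version_from_hadoop
```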
Deploy Client Configuration Files
- On the Home page, open the menu to the right of the cluster name and select Deploy Client Configuration.
- Click the Deploy Client Configuration button in the confirmation pop-up that appears.