This is the documentation for Cloudera Manager 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Upgrading CDH 4

You can upgrade to CDH 4.1.3 (or later) from within the Cloudera Manager Admin Console, using parcels and an upgrade wizard, which greatly simplifies the upgrade process. In addition, using parcels enables Cloudera Manager to automate the deployment of CDH versions; if you elect to upgrade using packages, future upgrades must still be performed manually. Upgrading to a CDH 4 release prior to CDH 4.1.3 is possible using packages, though upgrading to a more current release is strongly recommended.

If you have a Cloudera Enterprise license and are running HDFS high availability, you can perform a rolling upgrade that lets you avoid cluster downtime.

  Important:

The following instructions describe how to upgrade from a CDH 4 release to a newer CDH 4 release in a Cloudera Managed Deployment. If you are running CDH 3, you must upgrade to CDH 4 using the instructions at Upgrading CDH 3 to CDH 4 in a Cloudera Managed Deployment.

To upgrade from CDH 4 to CDH 5, see Upgrading from CDH 4 to CDH 5 Parcels.

Before You Begin

  • Before upgrading, be sure to read about the latest Incompatible Changes and Known Issues and Workarounds in the CDH 4 Release Notes.
  • Read the Cloudera Manager 5 Release Notes.
  • Make sure there are no Oozie workflows in RUNNING or SUSPENDED status; otherwise the Oozie database upgrade will fail and you will have to reinstall CDH 4 to complete or kill those running workflows.
  • Run the Host Inspector and fix every issue.
  • If using security, run the Security Inspector.
  • Run hdfs fsck / and hdfs dfsadmin -report and fix any issues.
  • If using HBase:
    • Run hbase hbck to make sure there are no inconsistencies.
    • Before you can upgrade HBase from CDH 4 to CDH 5, your HFiles must be upgraded from the HFile v1 format to HFile v2, because CDH 5 no longer supports HFile v1. The procedure differs depending on whether you use Cloudera Manager or the command line, but the results are the same. The first step is to check the HFiles for instances of HFile v1 and mark them to be upgraded to HFile v2, and to check for and report corrupted files or files with unknown versions, which must be removed manually. The next step is to rewrite the marked HFiles during the next major compaction. After the HFiles are upgraded, you can continue the upgrade. To check and upgrade the files:
      1. In the Cloudera Manager Admin Console, go to the HBase service and select Actions > Check HFile Version.
      2. Check the output of the command in the stderr log.
        Your output should be similar to the following:
        Tables Processed:
        hdfs://localhost:41020/myHBase/.META.
        hdfs://localhost:41020/myHBase/usertable
        hdfs://localhost:41020/myHBase/TestTable
        hdfs://localhost:41020/myHBase/t
        
        Count of HFileV1: 2
        HFileV1:
        hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
        hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512
        
        Count of corrupted files: 1
        Corrupted Files:
        hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
        Count of Regions with HFileV1: 2
        Regions to Major Compact:
        hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
        hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af
        In the example above, the command has detected two HFile v1 files, one corrupted file, and two regions that require a major compaction.
      3. Trigger a major compaction on each of the reported regions. The major compaction rewrites the files from the HFile v1 format to HFile v2. To run the major compaction, start HBase Shell and issue the major_compact command:
        $ bin/hbase shell
        hbase> major_compact 'usertable'
        You can also do this in a single step by piping the command to the shell with echo:
        $ echo "major_compact 'usertable'" | bin/hbase shell
  • Review the upgrade procedure and reserve a maintenance window with enough time allotted to perform all steps. For production clusters, Cloudera recommends reserving up to a full-day maintenance window, depending on the number of hosts, your experience with Hadoop and Linux, and your particular hardware.
  • To avoid generating many alerts during the upgrade process, you can enable maintenance mode on your cluster before you start the upgrade. Be sure to exit maintenance mode when you have finished the upgrade, in order to re-enable Cloudera Manager alerts.
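
The pre-upgrade health checks above can be sketched as a single script. This is a non-authoritative sketch: it assumes the hdfs and hbase client binaries are on the PATH of a suitably privileged user, and the preupgrade_checks function name is our own.

```shell
# Hypothetical pre-upgrade health-check helper; assumes the hdfs and
# hbase client binaries are on the PATH. Wrapped in a function so the
# file can be sourced and reviewed without touching a cluster.
preupgrade_checks() {
    # Verify HDFS health; fsck exits non-zero on corrupt or missing blocks.
    hdfs fsck / || return 1
    # Summarize DataNode capacity and under-replicated block counts.
    hdfs dfsadmin -report || return 1
    # Check HBase for table and region inconsistencies before upgrading.
    hbase hbck || return 1
}
```

Run the function during the maintenance window; a non-zero exit indicates a check whose output must be investigated and resolved before continuing with the upgrade.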

Upgrade Procedures

  Important:
  • Impala - If you have CDH 4.1.x with Cloudera Impala installed, and you plan to upgrade to CDH 4.2 or later, you must also upgrade Impala to version 1.2.1 or later. With a parcel installation you can download and activate both parcels before you restart the cluster. You must change the remote parcel repository URL to point to the location of the released product, as described in the upgrade procedures referenced below.
  • HBase - In CDH 4.1.x, an HBase table could have an owner that had full administrative permissions on the table. The owner construct was removed as of CDH 4.2.0, and the code now relies exclusively on entries in the ACL table. Since table owners do not have an entry in this table, their permissions are removed on upgrade from CDH 4.1.x to CDH 4.2.0 or later. If you are upgrading from CDH 4.1.x to CDH 4.2 or later, and using HBase, you must add permissions for HBase owner users to the HBase ACL table before you perform the upgrade. See the Known Issues in the CDH 4 Release Notes, specifically the item "Must explicitly add permissions for owner users before upgrading from 4.1.x" in the Known Issues in Apache HBase section.
  • Hive - Hive underwent major version changes from CDH 4.0 to CDH 4.1 and from CDH 4.1 to CDH 4.2 (CDH 4.0 included Hive 0.8.0, CDH 4.1 included Hive 0.9.0, and CDH 4.2 and later include Hive 0.10.0). You must therefore manually back up and upgrade the Hive metastore database when upgrading across these Hive versions. If you are upgrading from a CDH 4 release prior to CDH 4.2 to a newer CDH 4 release, follow the steps for upgrading the metastore included in the upgrade procedures referenced below.
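
For the HBase point above, owner permissions can be re-granted explicitly through the HBase shell grant command before the upgrade. The sketch below is illustrative only: "alice" and "usertable" are placeholder names, grant_owner_perms is our own helper, and RWXCA grants the full read, write, execute, create, and admin set.

```shell
# Sketch: explicitly add an owner's permissions to the HBase ACL table
# before upgrading from CDH 4.1.x to CDH 4.2 or later.
# "alice" and "usertable" below are placeholders.
grant_owner_perms() {
    user="$1"
    table="$2"
    # RWXCA = read, write, execute, create, admin.
    echo "grant '${user}', 'RWXCA', '${table}'" | hbase shell
}
# Example (requires a running HBase cluster):
# grant_owner_perms alice usertable
```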
Page generated September 3, 2015.