Upgrading Impala

Upgrading Impala involves stopping Impala services, using your operating system's package management tool to upgrade Impala to the latest version, and then restarting Impala services.

Upgrading Impala through Cloudera Manager - Parcels

Parcels are an alternative binary distribution format available in Cloudera Manager 4.5 and higher.

To upgrade Impala for CDH 4 in a Cloudera Managed environment, using parcels:

  1. If you originally installed using packages and now are switching to parcels, remove all the Impala-related packages first. You can check which packages are installed using one of the following commands, depending on your operating system:

    rpm -qa               # RHEL, Oracle Linux, CentOS, Debian
    dpkg --get-selections # Debian
    and then remove the packages using one of the following commands:
    sudo yum remove pkg_names    # RHEL, Oracle Linux, CentOS
    sudo zypper remove pkg_names # SLES
    sudo apt-get purge pkg_names # Ubuntu, Debian
  2. Connect to the Cloudera Manager Admin Console.

  3. Go to the Hosts > Parcels tab. You should see a parcel with a newer version of Impala that you can upgrade to.

  4. Click Download, then Distribute. (The button changes as each step completes.)

  5. Click Activate.

  6. When prompted, click Restart to restart the Impala service.

Upgrading Impala through Cloudera Manager - Packages

To upgrade Impala in a Cloudera Managed environment, using packages:

  1. Connect to the Cloudera Manager Admin Console.
  2. In the Services tab, click the Impala service.
  3. Click Actions and click Stop.
  4. Use one of the following sets of commands to update Impala on each Impala node in your cluster:

    For RHEL, Oracle Linux, or CentOS systems:

    $ sudo yum update impala
    $ sudo yum update hadoop-lzo-cdh4 # Optional; if this package is already installed
    

    For SUSE systems:

    $ sudo zypper update impala
    $ sudo zypper update hadoop-lzo-cdh4 # Optional; if this package is already installed
    

    For Debian or Ubuntu systems:

    $ sudo apt-get install impala
    $ sudo apt-get install hadoop-lzo-cdh4 # Optional; if this package is already installed
    
  5. Use one of the following sets of commands to update Impala shell on each node on which it is installed:

    For RHEL, Oracle Linux, or CentOS systems:

    $ sudo yum update impala-shell

    For SUSE systems:

    $ sudo zypper update impala-shell

    For Debian or Ubuntu systems:

    $ sudo apt-get install impala-shell
  6. Connect to the Cloudera Manager Admin Console.
  7. In the Services tab, click the Impala service.
  8. Click Actions and click Start.

Upgrading Impala without Cloudera Manager

To upgrade Impala on a cluster not managed by Cloudera Manager, run these Linux commands on the appropriate hosts in your cluster:

  1. Stop Impala services.
    1. Stop impalad on each Impala node in your cluster:
      $ sudo service impala-server stop
    2. Stop any instances of the state store in your cluster:
      $ sudo service impala-state-store stop
    3. Stop any instances of the catalog service in your cluster:
      $ sudo service impala-catalog stop
  2. Check if there are new recommended or required configuration settings to put into place in the configuration files, typically under /etc/impala/conf. See Post-Installation Configuration for Impala for settings related to performance and scalability.
  3. Use one of the following sets of commands to update Impala on each Impala node in your cluster:

    For RHEL, Oracle Linux, or CentOS systems:

    $ sudo yum update impala-server
    $ sudo yum update hadoop-lzo-cdh4 # Optional; if this package is already installed
    $ sudo yum update impala-catalog # New in Impala 1.2; do yum install when upgrading from 1.1.
    

    For SUSE systems:

    $ sudo zypper update impala-server
    $ sudo zypper update hadoop-lzo-cdh4 # Optional; if this package is already installed
    $ sudo zypper update impala-catalog # New in Impala 1.2; do zypper install when upgrading from 1.1.
    

    For Debian or Ubuntu systems:

    $ sudo apt-get install impala-server
    $ sudo apt-get install hadoop-lzo-cdh4 # Optional; if this package is already installed
    $ sudo apt-get install impala-catalog # New in Impala 1.2.
    
  4. Use one of the following sets of commands to update Impala shell on each node on which it is installed:

    For RHEL, Oracle Linux, or CentOS systems:

    $ sudo yum update impala-shell

    For SUSE systems:

    $ sudo zypper update impala-shell

    For Debian or Ubuntu systems:

    $ sudo apt-get install impala-shell
  5. Depending on which release of Impala you are upgrading from, you might find that the symbolic links /etc/impala/conf and /usr/lib/impala/sbin are missing. If so, see Impala Known Issues for the procedure to work around this problem.
  6. Restart Impala services:
    1. Restart the Impala state store service on the desired nodes in your cluster. Expect to see a process named statestored if the service started successfully.
      $ sudo service impala-state-store start
      $ ps ax | grep [s]tatestored
       6819 ?        Sl     0:07 /usr/lib/impala/sbin/statestored -log_dir=/var/log/impala -state_store_port=24000
      

      Restart the state store service before the Impala server service to avoid "Not connected" errors when you run impala-shell.

    2. Restart the Impala catalog service on whichever host it runs on in your cluster. Expect to see a process named catalogd if the service started successfully.
      $ sudo service impala-catalog restart
      $ ps ax | grep [c]atalogd
       6068 ?        Sl     4:06 /usr/lib/impala/sbin/catalogd
      
    3. Restart the Impala daemon service on each node in your cluster. Expect to see a process named impalad if the service started successfully.
      $ sudo service impala-server start
      $ ps ax | grep [i]mpalad
       7936 ?        Sl     0:12 /usr/lib/impala/sbin/impalad -log_dir=/var/log/impala -state_store_port=24000 -use_statestore
      -state_store_host=127.0.0.1 -be_port=22000