Installing and Upgrading Apache Kudu

You can install Apache Kudu in a cluster managed by Cloudera Manager, using either parcels or packages. If you do not use Cloudera Manager, you can install Kudu using packages.

Kudu Installation Requirements

  • Hardware
    • One or more hosts to run Kudu masters. You should have either one master (provides no fault tolerance), three masters (can tolerate one failure), or five masters (can tolerate two failures).
    • One or more hosts to run Kudu tablet servers. With replication, a minimum of three tablet servers is necessary.
  • Operating systems
    • Linux
      • RHEL/CentOS 6.4, 6.5, 6.6, 6.7, 6.8, 7.1, 7.2, 7.3
      • Oracle Linux (OL) 6.4, 6.5, 6.6, 6.7, 6.8, 7.1, 7.2, 7.3
      • Ubuntu 14.04 (Trusty), 16.04 (Xenial)
      • Debian 8.2, 8.4 (Jessie)
      • SLES 12 Service Pack 1
      • A kernel version and filesystem that support hole punching. Hole punching is the use of the fallocate(2) system call with the FALLOC_FL_PUNCH_HOLE option set. See Error during hole punch test. If you cannot meet this requirement, see Enabling the File Block Manager for a workaround.
      • NTP
    • macOS
      • OS X 10.10 Yosemite, OS X 10.11 El Capitan, and macOS Sierra.
      • Pre-built macOS packages are not provided.
    • Windows
      • Microsoft Windows is not supported.
  • Management - To manage Kudu with Cloudera Manager, Cloudera Manager 5.10.0 or later and CDH 5.10.0 or later are required.
  • Storage - If solid state storage is available, storing Kudu WALs on such high-performance media may significantly improve latency when Kudu is configured for its highest durability levels.
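
The master counts in the hardware list (one, three, or five masters tolerating zero, one, or two failures) are consistent with simple majority arithmetic: a group of n masters stays available while more than half of them are up. The following is a minimal sketch of that calculation. The function name is hypothetical and is not part of Kudu.

```shell
# Hypothetical helper: how many master failures a deployment of N masters
# can tolerate, assuming a majority of masters must remain available.
# $1: number of masters (positive integer)
tolerated_master_failures() {
  case "$1" in
    ''|*[!0-9]*|0) echo "expected a positive integer, got: $1" >&2; return 1 ;;
  esac
  # Integer division: 1 -> 0, 3 -> 1, 5 -> 2
  echo $(( ($1 - 1) / 2 ))
}
```

For example, `tolerated_master_failures 3` prints 1. An even master count adds no extra tolerance over the next lower odd count, which may be why only odd counts are listed.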

Install Kudu Using Cloudera Manager

To install and manage Kudu using Cloudera Manager, first download the Custom Service Descriptor (CSD) file for Kudu and upload it to /opt/cloudera/csd/ on the Cloudera Manager server. Restart the Cloudera Manager server using the following operating system command.
$ sudo service cloudera-scm-server restart
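
The upload step can be scripted on the Cloudera Manager server. The sketch below only copies an already downloaded CSD file into place. The function name and the file mode are assumptions made for illustration, and the download location is deliberately left out because it is not given here.

```shell
# Hypothetical helper: copy a downloaded CSD file into the CSD directory.
# $1: path to the downloaded CSD file
# $2: target directory (defaults to /opt/cloudera/csd, as stated above)
install_kudu_csd() {
  csd_file="$1"
  csd_dir="${2:-/opt/cloudera/csd}"
  [ -f "$csd_file" ] || { echo "no such file: $csd_file" >&2; return 1; }
  [ -d "$csd_dir" ]  || { echo "no such directory: $csd_dir" >&2; return 1; }
  # Mode 644 is an assumption: world-readable so the server process can load it.
  install -m 644 "$csd_file" "$csd_dir/"
}
```

Run the function as a user with write access to the CSD directory, then restart the Cloudera Manager server with the command shown above.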

Next, follow the instructions in either Install Kudu Using Parcels or Install Kudu Using Packages.

Install Kudu Using Parcels

After uploading the CSD file for Kudu and restarting the Cloudera Manager server, follow these steps to install Kudu using parcels.
  1. In Cloudera Manager, go to Hosts > Parcels. Find KUDU in the list, and click Download.
  2. When the download is complete, select your cluster from the Locations selector, and click Distribute. If you only have one cluster, it is selected automatically.
  3. When distribution is complete, click Activate to activate the parcel. Restart the cluster when prompted. This may take several minutes.
  4. Install the Kudu service on your cluster. Go to the cluster where you want to install Kudu. Click Actions > Add a Service. Select Kudu from the list, and click Continue.
  5. Select a host for the master role and one or more hosts for the tablet server roles. A host can act as both a master and a tablet server, but this may cause performance problems on a large cluster. The Kudu master process is not resource-intensive and can be collocated with other similar processes such as the HDFS NameNode or YARN ResourceManager. After selecting hosts, click Continue.
  6. Configure the storage locations for Kudu data and write-ahead log (WAL) files on masters and tablet servers. Cloudera Manager will create the directories.
    • You can use the same directory to store data and WALs.
    • You cannot store WALs in a subdirectory of the data directory.
    • If any host is both a master and tablet server, configure different directories for master and tablet server. For instance, /data/kudu/master and /data/kudu/tserver.
    • If you choose a filesystem that does not support hole punching, service start-up will fail. Only if service start-up fails for this reason, exit the wizard by clicking the Cloudera logo at the top left, and enable the file block manager. This is not appropriate for production. See Enabling the File Block Manager.
  7. If your filesystem supports hole punching, do not exit the wizard. Click Continue, and the Kudu masters and tablet servers are started. If you exited the wizard to enable the file block manager, go to the Kudu service and click Actions > Start.
  8. Verify that services are running using one of the following methods:
    • Examine the output of the ps command on each server to verify that the kudu-master process, the kudu-tserver process, or both are running.
    • Access the master or tablet server web UI by opening the URL in your web browser. The URL is http://<host_name>:8051/ for masters or http://<host_name>:8050/ for tablet servers.
  9. Restart the Service Monitor to begin generating health checks and charts for Kudu. Go to the Cloudera Manager service and click Service Monitor. Choose Actions > Restart.
  10. To manage roles, go to the Kudu service and use the Actions menu to stop, start, restart, or otherwise manage the service.
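
The two verification methods in step 8 can also be run from a shell. The sketch below uses the process names and ports given in that step. Both function names are hypothetical.

```shell
# Build the web UI URL for a Kudu role, using the ports listed in step 8.
# $1: "master" or "tserver"
# $2: hostname
kudu_web_ui_url() {
  case "$1" in
    master)  echo "http://$2:8051/" ;;
    tserver) echo "http://$2:8050/" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Succeed if a process with exactly this command name is running locally.
# $1: process name, for example kudu-master or kudu-tserver
kudu_process_running() {
  ps -e -o comm= | grep -qx "$1"
}
```

On a master host, `kudu_process_running kudu-master && echo running` covers the first method. For the second, the URL can be passed to any HTTP client, for example `curl -sf "$(kudu_web_ui_url master "$(hostname)")" > /dev/null`.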

Enabling the File Block Manager

If your underlying filesystem does not support hole punching, Kudu will not start unless you enable the file block manager. This is not appropriate for production systems. If your filesystem does support hole punching, there is no reason to use the file block manager.
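
Before choosing storage directories, it can help to probe whether a filesystem accepts a hole-punch request. The sketch below is not Kudu's own start-up test. It is an approximation that assumes the fallocate utility from util-linux is installed, and the function name is hypothetical.

```shell
# Hypothetical helper: probe whether the filesystem holding a directory
# accepts a hole-punch request.
# $1: an existing, writable directory on the filesystem to test
# Returns 0 if punching succeeded, 1 if it failed, 2 if no probe file
# could be created.
supports_hole_punch() {
  dir="$1"
  probe="$(mktemp "$dir/holepunch-probe.XXXXXX")" || return 2
  # Write 8 KiB so that there is allocated data to punch out.
  dd if=/dev/zero of="$probe" bs=4096 count=2 2>/dev/null
  if fallocate --punch-hole --offset 0 --length 4096 "$probe" 2>/dev/null; then
    rc=0
  else
    rc=1
  fi
  rm -f "$probe"
  return "$rc"
}
```

A result of 1 also occurs when the fallocate utility itself is missing, so treat a failure as a prompt to investigate rather than as proof that the filesystem lacks support.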

  1. If you are still in the Cloudera configuration wizard, exit the configuration wizard by clicking the Cloudera logo at the top of the Cloudera Manager interface.
  2. Go to the Kudu service.
  3. Go to Configuration and search for the Kudu Service Advanced Configuration Snippet (Safety Valve) for gflagfile configuration option.
  4. Add the following line to it, and save your changes:
    --block_manager=file
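
On hosts that are not managed by Cloudera Manager there is no safety valve, so the flag would have to be added to a flag file by hand. The sketch below appends a flag only if it is not already present. The function name is hypothetical, and the flag file path is an assumption that you must supply for your installation.

```shell
# Hypothetical helper: add a flag line to a flag file exactly once.
# $1: path to the flag file (supplied by the caller; not a known default)
# $2: the flag line to add, for example --block_manager=file
ensure_flag() {
  flag_file="$1"
  flag="$2"
  touch "$flag_file" || return 1
  # -x matches whole lines, -F disables regex, -- stops option parsing
  # so that a flag beginning with "--" is treated as the pattern.
  grep -qxF -- "$flag" "$flag_file" || printf '%s\n' "$flag" >> "$flag_file"
}
```

Calling the function twice with the same arguments leaves a single copy of the line, which keeps repeated provisioning runs from accumulating duplicates.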

Install Kudu Using Packages

If you use packages with Cloudera Manager, follow these instructions after uploading the CSD file for Kudu and restarting the Cloudera Manager server.

Kudu Repository and Package Links

  Operating System | Repository Package | Individual Packages
  RHEL             | RHEL 6 or RHEL 7   | RHEL 6
  Ubuntu           | Trusty, Xenial     | Trusty, Xenial
  SLES             | SLES 12            | SLES 12
  Debian           | Jessie             | Jessie

  1. Cloudera recommends installing the Kudu repositories for your operating system. Use the links in Kudu Repository and Package Links to download the appropriate repository installer. Save the repository installer to /etc/yum.repos.d/ for RHEL, /etc/apt/sources.list.d/ for Ubuntu/Debian, or /etc/zypp/repos.d for SLES.
    • If you use Cloudera Manager, you only need to install the kudu package, using the appropriate command for your operating system:
      $ sudo yum install kudu        # RHEL / CentOS
      $ sudo apt-get install kudu    # Ubuntu / Debian
    • If you need the C++ client development libraries or the Kudu SDK, install kudu-client and kudu-client-devel packages for RHEL, or libkuduclient0 and libkuduclient-dev packages for Ubuntu.
    • If you use Cloudera Manager, do not install the kudu-master and kudu-tserver packages, which provide operating system startup scripts for using Kudu without Cloudera Manager.
  2. Install the Kudu service on your cluster. Go to the cluster where you want to install Kudu. Click Actions > Add a Service. Select Kudu from the list, and click Continue.
  3. Select a host for the master role and one or more hosts for the tablet server roles. A host can act as both a master and a tablet server, but this may cause performance problems on a large cluster. The Kudu master process is not resource-intensive and can be collocated with other similar processes such as the HDFS NameNode or YARN ResourceManager. After selecting hosts, click Continue.
  4. Configure the storage locations for Kudu data and write-ahead log (WAL) files on masters and tablet servers. Cloudera Manager will create the directories.
    • You can use the same directory to store data and WALs.
    • You cannot store WALs in a subdirectory of the data directory.
    • If any host is both a master and tablet server, configure different directories for master and tablet server. For instance, /data/kudu/master and /data/kudu/tserver.
    • If you choose a filesystem that does not support hole punching, service start-up will fail. Only if service start-up fails for this reason, exit the wizard by clicking the Cloudera logo at the top left, and enable the file block manager. This is not appropriate for production. See Enabling the File Block Manager.
  5. If your filesystem supports hole punching, do not exit the wizard. Click Continue, and the Kudu masters and tablet servers are started. If you exited the wizard to enable the file block manager, go to the Kudu service and click Actions > Start.
  6. Verify that services are running using one of the following methods:
    • Examine the output of the ps command on each server to verify that the kudu-master process, the kudu-tserver process, or both are running.
    • Access the master or tablet server web UI by opening the URL in your web browser. The URL is http://<host_name>:8051/ for masters or http://<host_name>:8050/ for tablet servers.
  7. To manage roles, go to the Kudu service and use the Actions menu to stop, start, restart, or otherwise manage the service.
  8. Restart the Service Monitor to begin generating health checks and charts for Kudu. Go to the Cloudera Manager service and click Service Monitor. Choose Actions > Restart.
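
Step 1 of this procedure saves the repository installer to a different directory for each operating system family. The sketch below maps an operating system ID to the directory named in that step. The function name is hypothetical, and the ID values are assumed to match the ID field in /etc/os-release.

```shell
# Hypothetical helper: print the repository directory for an OS ID.
# $1: OS ID, assumed to match the ID field of /etc/os-release
kudu_repo_dir() {
  case "$1" in
    rhel|centos|ol) echo "/etc/yum.repos.d/" ;;
    ubuntu|debian)  echo "/etc/apt/sources.list.d/" ;;
    sles)           echo "/etc/zypp/repos.d" ;;
    *) echo "unsupported OS ID: $1" >&2; return 1 ;;
  esac
}
```

On a host that provides /etc/os-release, `. /etc/os-release && kudu_repo_dir "$ID"` prints the directory for the local system.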

Install Kudu Using the Command Line

Follow these steps on each node that will participate in your Kudu cluster.

  1. Cloudera recommends installing the Kudu repositories for your operating system. Use the links in Kudu Repository and Package Links to download the appropriate repository installer.
    • Install the kudu package, using the appropriate command for your operating system:
      $ sudo yum install kudu        # RHEL / CentOS
      $ sudo apt-get install kudu    # Ubuntu / Debian
    • If you need the C++ client development libraries or the Kudu SDK, install kudu-client and kudu-client-devel packages for RHEL, or libkuduclient0 and libkuduclient-dev packages for Ubuntu.
    • Install the kudu-master and kudu-tserver packages, which provide operating system start-up scripts for the Kudu master and tablet servers.
  2. The packages create a kudu-conf entry in the operating system's alternatives database and ship the built-in conf.dist alternative. To adjust your configuration, either edit the files in /etc/kudu/conf/ directly or create a new alternative using the operating system utilities. If you create a new alternative, make sure that the /etc/kudu/conf/ symbolic link points to its directory, and create your custom configuration files there. Some settings are also kept in the /etc/default/kudu-master and /etc/default/kudu-tserver files. Include or duplicate these settings if you create custom configuration files.

    Review the configuration, including the default WAL and data directory locations, and adjust them according to your requirements.

  3. Start Kudu services using the following commands on the appropriate nodes:
    $ sudo service kudu-master start
    $ sudo service kudu-tserver start
  4. To stop Kudu services, use the following commands:
    $ sudo service kudu-master stop 
    $ sudo service kudu-tserver stop
  5. Configure the Kudu services to start automatically when the server starts, by adding them to the default runlevel.
    $ sudo chkconfig kudu-master on            # RHEL / CentOS 
    $ sudo chkconfig kudu-tserver on           # RHEL / CentOS 
    
    $ sudo update-rc.d kudu-master defaults    # Ubuntu 
    $ sudo update-rc.d kudu-tserver defaults   # Ubuntu
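
Steps 3 and 5 differ only by role and operating system, so they can be generated rather than typed. The sketch below prints the commands instead of running them, which allows a review before anything changes on the host. The function name is hypothetical, and only the operating systems shown in step 5 are covered.

```shell
# Hypothetical helper: print the commands that start a Kudu role now and
# enable it at boot. Nothing is executed.
# $1: kudu-master or kudu-tserver
# $2: OS ID (rhel, centos, or ubuntu)
kudu_enable_cmds() {
  role="$1"
  os_id="$2"
  case "$role" in
    kudu-master|kudu-tserver) ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
  case "$os_id" in
    rhel|centos) boot_cmd="sudo chkconfig $role on" ;;
    ubuntu)      boot_cmd="sudo update-rc.d $role defaults" ;;
    *) echo "unsupported OS ID: $os_id" >&2; return 1 ;;
  esac
  echo "sudo service $role start"
  echo "$boot_cmd"
}
```

After reviewing the output, the commands can be executed by piping them to a shell, for example `kudu_enable_cmds kudu-tserver ubuntu | sh`.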

Upgrading Kudu

Before upgrading Kudu, make sure to read the Release Notes for the version you are upgrading to.

Upgrading Kudu Using Parcels

If you use Cloudera Manager with parcels, use the following instructions to upgrade Kudu. If you installed Kudu from packages, see the instructions for upgrading Kudu using packages.

  1. First, download the Custom Service Descriptor (CSD) file for Kudu and upload it to /opt/cloudera/csd/ on the Cloudera Manager server. Restart the Cloudera Manager server using the following operating system command.
    $ sudo service cloudera-scm-server restart
  2. Go to Hosts. Click Parcels.
  3. Click Check For New Parcels.
  4. Find the new version of Kudu in the list of parcels. Download, distribute, and activate it on your cluster.

Upgrading Kudu Using Packages

If you use Cloudera Manager and installed Kudu from packages, use the following instructions for your operating system. If you installed Kudu from parcels, see the instructions for upgrading Kudu using parcels.

Using RHEL:

  1. First, download the Custom Service Descriptor (CSD) file for Kudu and upload it to /opt/cloudera/csd/ on the Cloudera Manager server. Restart the Cloudera Manager server using the following operating system command.
    $ sudo service cloudera-scm-server restart
  2. Stop the Kudu service in Cloudera Manager. Go to the Kudu service and select Actions > Stop.
  3. Issue the following commands at the command line on each Kudu host:
    $ sudo yum -y clean all
    $ sudo yum -y upgrade kudu
  4. Start the Kudu service in Cloudera Manager. Go to the Kudu service and select Actions > Start.

Using Ubuntu:

  1. First, download the Custom Service Descriptor (CSD) file for Kudu and upload it to /opt/cloudera/csd/ on the Cloudera Manager server. Restart the Cloudera Manager server using the following operating system command.
    $ sudo service cloudera-scm-server restart
  2. If you use a repository, re-download the repository list file to ensure that you have the latest information. See Kudu Repository and Package Links.
  3. Stop the Kudu service in Cloudera Manager. Go to the Kudu service and select Actions > Stop.
  4. Issue the following commands at the command line on each Kudu host:
    $ sudo apt-get update
    $ sudo apt-get install kudu
  5. Start the Kudu service in Cloudera Manager. Go to the Kudu service and select Actions > Start.
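
The package commands in both procedures must be issued on each Kudu host. The sketch below prints one ssh command per host for review. It is a dry run under stated assumptions: the hosts file, the ssh access, and the function name are all hypothetical, and the remote commands are the ones listed in the steps above.

```shell
# Hypothetical helper: print one ssh command per Kudu host. Nothing is run.
# $1: file with one hostname per line (blank lines and # comments skipped)
# $2: package manager family: "yum" (RHEL) or "apt" (Ubuntu)
print_upgrade_cmds() {
  hosts_file="$1"
  family="$2"
  case "$family" in
    yum) remote='sudo yum -y clean all && sudo yum -y upgrade kudu' ;;
    apt) remote='sudo apt-get update && sudo apt-get install kudu' ;;
    *) echo "unknown family: $family" >&2; return 1 ;;
  esac
  # The "|| [ -n ... ]" clause handles a last line without a newline.
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in ''|'#'*) continue ;; esac
    printf 'ssh %s "%s"\n' "$host" "$remote"
  done < "$hosts_file"
}
```

Stop the Kudu service in Cloudera Manager before running the printed commands, and start it again afterwards, as described in the steps above.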

Next Steps

Read about Using Apache Impala (incubating) with Kudu.

For more information about using Kudu, go to the Kudu project page, where you can find official documentation, links to the GitHub repository and examples, and other resources.

For a reading list and other helpful links, refer to More Resources for Apache Kudu.