Installing Kafka

Kafka is distributed as a parcel, separate from the CDH parcel. It is also distributed as a package.

Cloudera Manager 5.4 or higher includes the Kafka Custom Service Descriptor (CSD). To install Kafka, download the Kafka parcel using Cloudera Manager, distribute it to the cluster, activate it, and then add the Kafka service to the cluster. For a list of available parcels and packages, see Cloudera Distribution of Apache Kafka Version and Packaging Information.

To install Kafka on a cluster managed by Cloudera Manager 5.2 or 5.3, first install the Kafka CSD, and then use the CSD to install the Kafka parcel. See Installing the Kafka CSD (Cloudera Manager 5.2 and 5.3).

Kafka is supported only on Cloudera Manager 5.2 and higher. Do not use it with lower versions of Cloudera Manager or CDH.

If you installed a Cloudera Labs version of Kafka, you must download a new version. The Cloudera Labs CSD cannot install the GA Kafka parcel.

Cloudera recommends that you deploy Kafka on dedicated hosts that are not used for other cluster roles.

Installing the Kafka CSD (Cloudera Manager 5.2 and 5.3)

Cloudera Manager 5.4 and higher includes the Kafka CSD. Use the built-in CSD. Do not download a different version when using Cloudera Manager 5.4.

If you are using Cloudera Manager 5.2 or 5.3, download the CSD, and then add a new parcel repository to your Cloudera Manager configuration:
  1. Download the Kafka CSD here.
  2. Install the Kafka CSD into Cloudera Manager. See Custom Service Descriptor Files.

Installing Kafka from a Parcel

  1. In Cloudera Manager, download the Kafka parcel, distribute the parcel to the hosts in your cluster, and then activate the parcel. See Managing Parcels. After you activate the Kafka parcel, Cloudera Manager prompts you to restart the cluster. You do not need to restart the cluster after installing Kafka. Click Close to ignore this prompt.
  2. Add the Kafka service to your cluster. See Adding a Service.

Installing Kafka from a Package

You install the Kafka package from the command line.

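  1. Navigate to the repository configuration directory for your operating system: /etc/yum.repos.d on RHEL-compatible systems, /etc/zypp/repos.d on SLES, or /etc/apt/sources.list.d on Ubuntu or Debian.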
  2. Use wget to download the Kafka repository. See Cloudera Distribution of Apache Kafka Version and Packaging Information.
  3. Install Kafka using the appropriate commands for your operating system.
    Operating System     Commands

    RHEL-compatible      $ sudo yum clean all
                         $ sudo yum install kafka
                         $ sudo yum install kafka-server

    SLES                 $ sudo zypper clean --all
                         $ sudo zypper install kafka
                         $ sudo zypper install kafka-server

    Ubuntu or Debian     $ sudo apt-get update
                         $ sudo apt-get install kafka
                         $ sudo apt-get install kafka-server

  4. Edit /etc/kafka/conf/server.properties to ensure that broker.id is unique for each broker in the Kafka cluster, and that zookeeper.connect points to the same ZooKeeper for all brokers.
  5. Start the Kafka server with the following command:

    $ sudo service kafka-server start
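
Step 4 can be scripted when there are many brokers. The following is a minimal sketch, not part of the Kafka packages: set_property is a hypothetical helper, and the broker ID 2 and the host name zk1.example.com are placeholder values. It assumes entries are written as key=value with no spaces around the equals sign, and it runs against a scratch file so that it can be tried without touching /etc/kafka/conf/server.properties.

```shell
# Hypothetical helper: set KEY=VALUE in a properties file.
# Replaces an existing uncommented "KEY=..." line, or appends one if absent.
set_property() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "="; found = 0 }
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp"
  # Write back in place so the original file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Try it on a scratch copy rather than the live configuration file.
conf=$(mktemp)
printf 'broker.id=0\nlog.dirs=/var/lib/kafka\n' > "$conf"

set_property "$conf" broker.id 2                                # placeholder ID
set_property "$conf" zookeeper.connect zk1.example.com:2181     # placeholder host

cat "$conf"
```

To apply the same idea to a real broker, point the helper at /etc/kafka/conf/server.properties, run it with sufficient privileges, and give each broker a different broker.id value.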

To verify that all nodes are registered with the same ZooKeeper, connect to ZooKeeper using zookeeper-client, and then list the broker IDs at the zookeeper-client prompt.

$ zookeeper-client
$ ls /brokers/ids

You should be able to see all of the IDs for the brokers you have registered in your Kafka cluster.
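
If you want to compare the listed IDs with the ones you configured, the check can be sketched in shell. This is an illustration only: normalize_ids is a hypothetical helper, and it assumes that the ls output is a bracketed, comma-separated list such as [3, 1, 2], which may differ between ZooKeeper versions.

```shell
# Hypothetical helper: turn a list such as "[3, 1, 2]" into "1 2 3".
normalize_ids() {
  printf '%s\n' "$1" \
    | sed 's/[][ ]//g' \
    | tr ',' '\n' \
    | sort -n \
    | paste -s -d ' ' -
}

expected="1 2 3"                       # the broker.id values you configured
actual=$(normalize_ids "[3, 1, 2]")    # paste the ls output here

if [ "$actual" = "$expected" ]; then
  echo "all brokers registered"
else
  echo "mismatch: expected '$expected', found '$actual'"
fi
```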

To discover which node a particular ID is assigned to, use the following command at the zookeeper-client prompt.

$ get /brokers/ids/<ID>

The output of this command includes the host name of the node that is assigned the ID you specify.
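
If the value stored for a broker ID is a JSON record, the host name can be pulled out with a small filter. The sketch below rests on that assumption: broker_host is a hypothetical helper, and the sample record, including its field names and the host kafka1.example.com, is illustrative rather than captured from a real cluster.

```shell
# Hypothetical helper: read a broker record on stdin and print its "host" field.
# Assumes compact JSON with the form "host":"<name>"; prints nothing if absent.
broker_host() {
  sed -n 's/.*"host":"\([^"]*\)".*/\1/p'
}

# Illustrative record only; paste the real output of the get command instead.
sample='{"jmx_port":-1,"host":"kafka1.example.com","version":1,"port":9092}'

printf '%s\n' "$sample" | broker_host
```

A JSON-aware tool is a safer choice if the record format varies; the sed pattern is only meant to show where the host name sits in the output.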