Installing Kafka

Kafka is distributed in a parcel that is independent of the CDH parcel and integrates with Cloudera Manager using a Custom Service Descriptor (CSD).

To install Apache Kafka:

  1. Download the Kafka CSD here.
  2. Install the CSD into Cloudera Manager as instructed in Custom Service Descriptor Files. This adds a new parcel repository to your Cloudera Manager configuration. The CSD can only be installed on parcel-deployed clusters.
  3. Download, distribute, and activate the Kafka parcel, following the instructions in Managing Parcels. After you activate the Kafka parcel, Cloudera Manager prompts you to restart the cluster. Click the Close button to ignore this prompt. You do not need to restart the cluster after installing Kafka.
  4. Add the Kafka service to your cluster, following the instructions in Adding a Service.

Cloudera strongly recommends that you deploy Kafka on dedicated hosts that are not used for other cluster roles.
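
After the Kafka service is added and started, you can confirm that the brokers registered correctly before moving on. The following is a minimal sketch, assuming the CDH zookeeper-client wrapper is available on the host and using the example ZooKeeper host from the rest of this guide; adjust the host to your own quorum:

# List the IDs of the brokers currently registered in ZooKeeper
# (zookeeper-client is assumed to be the CDH wrapper for the standard ZooKeeper CLI)
$ /usr/bin/zookeeper-client -server zk01.example.com:2181 ls /brokers/ids

# List the topics the cluster knows about (none yet on a fresh installation)
$ /usr/bin/kafka-topics --list --zookeeper zk01.example.com:2181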

Kafka Command-line Tools

Important Kafka command-line tools are located in /usr/bin; a short end-to-end example follows this list:
  • kafka-topics
    Create, alter, list, and describe topics. For example:
    $ /usr/bin/kafka-topics --list --zookeeper zk01.example.com:2181
    sink1
    t1
    t2
  • kafka-console-consumer
    Read data from a Kafka topic and write it to standard output. For example:
    $ /usr/bin/kafka-console-consumer --zookeeper zk01.example.com:2181 --topic t1
  • kafka-console-producer
    Read data from standard input and write it to a Kafka topic. For example:
    $ /usr/bin/kafka-console-producer --broker-list kafka02.example.com:9092,kafka03.example.com:9092 --topic t1
  • kafka-consumer-offset-checker
    Check the number of messages read and written, as well as the lag for each consumer in a specific consumer group. For example:
    $ /usr/bin/kafka-consumer-offset-checker --group flume --topic t1 --zookeeper zk01.example.com:2181
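
Together, these tools support a quick end-to-end check. The sketch below uses the example host names from above to create a topic, publish a message to it, and read it back; the topic name, partition count, and replication factor are illustrative:

# Create a topic (the replication factor must not exceed the number of brokers)
$ /usr/bin/kafka-topics --create --zookeeper zk01.example.com:2181 --replication-factor 1 --partitions 1 --topic test

# Publish messages: each line read from standard input becomes one message
$ echo "hello kafka" | /usr/bin/kafka-console-producer --broker-list kafka02.example.com:9092 --topic test

# Read the topic from the beginning; press Ctrl+C to stop
$ /usr/bin/kafka-console-consumer --zookeeper zk01.example.com:2181 --topic test --from-beginning
hello kafka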

Logs

By default, the Kafka parcel is configured to write all Kafka log messages to a single file, /var/log/kafka/server.log. You can view, filter, and search this log using Cloudera Manager.

For debugging purposes, you can create a separate file with TRACE-level logs for a specific component (such as the controller) or for state changes.

To do so, use the Kafka broker Logging Advanced Configuration Snippet (Safety Valve) field in Cloudera Manager (Kafka Service > Configuration > Kafka broker Default Group > Advanced) to add new appenders to log4j. For example, to restore the default Apache Kafka log4j configuration, copy the following into the safety valve:
log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.kafkaAppender.File=${log.dir}/kafka_server.log
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.stateChangeAppender.File=${log.dir}/state-change.log
log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.requestAppender.File=${log.dir}/kafka-request.log
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.cleanerAppender.File=${log.dir}/log-cleaner.log
log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.controllerAppender.File=${log.dir}/controller.log
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Turn on all our debugging info
#log4j.logger.kafka.producer.async.DefaultEventHandler=DEBUG, kafkaAppender
#log4j.logger.kafka.client.ClientUtils=DEBUG, kafkaAppender
#log4j.logger.kafka.perf=DEBUG, kafkaAppender
#log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
#log4j.logger.org.I0Itec.zkclient.ZkClient=DEBUG
log4j.logger.kafka=INFO, kafkaAppender

log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender
log4j.additivity.kafka.network.RequestChannel$=false

#log4j.logger.kafka.network.Processor=TRACE, requestAppender
#log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender
#log4j.additivity.kafka.server.KafkaApis=false
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false

log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false

log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
log4j.additivity.kafka.log.LogCleaner=false

log4j.logger.state.change.logger=TRACE, stateChangeAppender
log4j.additivity.state.change.logger=false

Alternatively, you can add only the appenders you need.
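
For example, if you only need TRACE logging for the controller, the corresponding subset of the configuration above is enough:

log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.controllerAppender.File=${log.dir}/controller.log
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false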

More Information

For more information, see the official Kafka documentation.

When using Kafka, consider the following:
  • Use Cloudera Manager to start and stop Kafka and ZooKeeper services. Do not use the kafka-server-start, kafka-server-stop, zookeeper-server-start, and zookeeper-server-stop commands.
  • All Kafka command-line tools are located in /opt/cloudera/parcels/KAFKA/lib/kafka/bin/.
  • Set the JAVA_HOME environment variable to your JDK installation directory before using the command-line tools. For example:
    export JAVA_HOME=/usr/java/jdk1.7.0_55-cloudera
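
For example, a complete invocation that sets JAVA_HOME before calling one of the tools described earlier might look like the following (the JDK path and host name are illustrative):

$ export JAVA_HOME=/usr/java/jdk1.7.0_55-cloudera
$ /usr/bin/kafka-topics --describe --zookeeper zk01.example.com:2181 --topic t1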