CDH 6 includes Apache Kafka as part of the core package. The documentation includes improved content on how to set up, install, and administer your Kafka ecosystem. For more information, see the Cloudera Enterprise 6.0.x Apache Kafka Guide. We look forward to your feedback on both the existing and new documentation.

Known Issues in CDK Powered By Apache Kafka

Unsupported features

  • Kafka Connect is included with CDK 2.0.0 and higher Powered By Apache Kafka, but is not supported at this time. Instead, we recommend Flume and Sqoop as proven solutions for batch and real-time data loading that complement Kafka's message broker capability. See Flafka: Apache Flume Meets Apache Kafka for Event Processing for more information.

    In addition, Spark and Spark Streaming can be used to get the functionality of "Kafka Streaming" and have a fully functional ETL or stream processing pipeline. See Using Apache Kafka with Apache Spark Streaming for more information.

  • The Kafka default authorizer is included with CDK 2.0.0 and higher Powered By Apache Kafka, but is not supported at this time. This includes setting ACLs and all related APIs, broker functionality, and command-line tools.

Kafka does not work with Apache Sentry HA

For CDK 3.0.0 and earlier Powered By Apache Kafka, you cannot use Sentry high availability with Kafka. If Sentry HA is enabled, Kafka might intermittently lose its connection to Sentry and be unable to authorize users.

Affected Versions: All versions of CDK Powered By Apache Kafka with CDH 5.13.x and 5.14.x

Fixed Versions: CDK 3.1.0

Cloudera JIRA: CDH-56519

Topics created with the kafka-topics tool may not be secured

Topics that are created and deleted via Kafka are secured (for example, auto-created topics). However, most topic creation and deletion is done via the kafka-topics tool, which talks directly to ZooKeeper, or via some other third-party tool that talks directly to ZooKeeper. Because this access is governed by ZooKeeper authentication and authorization, Kafka cannot prevent users from making ZooKeeper changes. Anyone with access to ZooKeeper can create and delete topics. Note that they will not be able to describe, read, or write to the topics even if they can create them.

The following commands talk directly to ZooKeeper and therefore are not secured via Kafka:
  • kafka-topics.sh
  • kafka-configs.sh
  • kafka-preferred-replica-election.sh
  • kafka-reassign-partitions.sh
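For illustration, a typical invocation of one of these tools (host, port, and topic name are placeholders); because the tool connects to ZooKeeper rather than to a broker, Kafka-level authorization is never consulted:

```shell
# Illustrative only: creates a topic by talking directly to ZooKeeper,
# bypassing Kafka authorization entirely (host/port/topic are placeholders).
kafka-topics.sh --create \
  --zookeeper zk-host:2181 \
  --topic test-topic \
  --partitions 3 \
  --replication-factor 3
```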

Replication Factor in Kafka Streams is set to 1 by Default

In Kafka Streams, the replication.factor configuration parameter is set to 1 by default. In other words, the internal topics that a Streams application creates are not replicated by default. Without replication, even a single broker failure can prevent progress of the stream processing application, which in turn can lead to data loss.

Workaround: Set the replication.factor Streams configuration parameter to a value higher than 1. Cloudera recommends that the replication factor set for the Streams application be identical to the replication factor of the source topics. For more information, see Configuring a Streams Application.
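A minimal sketch of the relevant entries in a Streams application's configuration properties (the application ID, broker address, and value 3 are illustrative; the value should match the replication factor of your source topics):

```
# Streams configuration properties (illustrative values)
application.id=my-streams-app        # hypothetical application ID
bootstrap.servers=broker1:9092       # placeholder broker
replication.factor=3                 # match the source topics' replication factor
```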

Affected Versions: CDK 4.0.0 or higher

Fixed Versions: N/A

Cloudera Issue: N/A

Apache Issue: N/A

Topic-level metrics do not display in Cloudera Manager for topics that contain a period (.) in the topic name

If you have Kafka topics that contain a period (.) in the topic name, Cloudera Manager might not display the topic-level metrics for those topics in the Charts Library. Only topic-level metrics are affected.

Affected Versions: CDK 3.0.0 Powered By Apache Kafka

Fixed Versions: CDK 3.1.0

Cloudera JIRA: CDH-64370

offsets.topic.replication.factor must be less than or equal to the number of live brokers (CDK 3.0.0 Powered By Apache Kafka)

In CDK 3.0.0 Powered By Apache Kafka, the offsets.topic.replication.factor broker config is now enforced upon auto topic creation. Internal auto topic creation will fail with a GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets this replication factor requirement.
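For example, on a cluster with three or more live brokers, the following server.properties entry satisfies the constraint (the value 3 is illustrative):

```
# server.properties (illustrative)
# Must be <= the number of live brokers, or automatic creation of the
# internal offsets topic fails with GROUP_COORDINATOR_NOT_AVAILABLE.
offsets.topic.replication.factor=3
```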

Kafka stuck with under-replicated partitions after ZooKeeper session expires

This problem might occur when your Kafka cluster includes a large number of under-replicated Kafka partitions. One or more broker logs include messages such as the following:

[2016-01-17 03:36:00,888] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Shrinking ISR for partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 (kafka.cluster.Partition)
[2016-01-17 03:36:00,891] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Cached zkVersion [66] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)

There will also be an indication of the ZooKeeper session expiring in one or more Kafka broker logs around the same time as the previous errors:

INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)

The log is typically in /var/log/kafka on each host where a Kafka broker is running. The location is set by the property kafka.log4j.dir in Cloudera Manager. The log name is kafka-broker-hostname.log. In diagnostic bundles, the log is under logs/hostname-ip-address/.

Affected Versions: CDK 1.4.x, 2.0.x, 2.1.x, 2.2.x Powered By Apache Kafka

Fixed Versions:
  • Full Fix: CDK 4.0.0 and higher Powered By Apache Kafka
  • Partial Fix: CDK 3.0.0 and higher Powered By Apache Kafka are less likely to encounter this issue.

Workaround: To recover after seeing this problem, restart the affected Kafka brokers. You can restart individual brokers from the Instances tab on the Kafka service page in Cloudera Manager.

To reduce the chances of this issue happening again, take steps to ensure that ZooKeeper sessions do not expire:

  • Reduce the potential for long garbage collection pauses by brokers:
    • Use a better garbage collection mechanism in the JVM, such as G1GC, by adding -XX:+UseG1GC to broker_java_opts.
    • Increase the broker heap size if it is too small (broker_max_heap_size), taking care not to choose a heap size that can cause out-of-memory problems given the other services running on the node.
  • Increase the ZooKeeper session timeout configuration on brokers (zookeeper.session.timeout.ms) to reduce the likelihood that sessions expire.
  • Ensure ZooKeeper itself is well resourced and not overwhelmed, so it can respond promptly. For example, it is highly recommended to place the ZooKeeper log directory on its own disk.
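The tuning steps above can be sketched as broker-side settings (the timeout value is illustrative, not a recommendation):

```
# Broker JVM options (broker_java_opts in Cloudera Manager) -- illustrative
-XX:+UseG1GC

# Broker configuration -- illustrative value; raise from the default if
# ZooKeeper sessions expire during long garbage-collection pauses
zookeeper.session.timeout.ms=18000
```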

Cloudera JIRA: CDH-42514

Apache JIRA: KAFKA-2729

Kafka client jars included in CDH might not match the newest Kafka parcel jar

The Kafka client jars included in CDH may not match the newest Kafka parcel jar that is released. This is done to maintain compatibility across CDH 5.7 and higher for integrations such as Spark and Flume.

The Flume and Spark connectors to Kafka shipped with CDH 5.7 and higher only work with CDK 2.x Powered By Apache Kafka

Use CDK 2.x and higher Powered By Apache Kafka to be compatible with the Flume and Spark connectors included with CDH 5.7.x.

Only new Java clients support authentication and authorization

The legacy Scala clients (producer and consumer) under the kafka.producer.* and kafka.consumer.* packages do not support authentication.

Workaround: Migrate to the new Java producer and consumer APIs.
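A minimal sketch of the new Java producer API that replaces the legacy kafka.producer.* client (the broker address and topic name are placeholders; assumes the kafka-clients library is on the classpath):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NewProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");      // placeholder broker
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        // Security settings (for example, security.protocol) go here;
        // the legacy Scala clients have no equivalent support.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
```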

Requests fail when sending to a nonexistent topic with auto.create.topics.enable set to true

The first few produce requests fail when sending to a nonexistent topic with auto.create.topics.enable set to true.

Affected Versions: All

Workaround: Increase the number of retries in the Producer configuration settings.
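The workaround amounts to raising the producer's retries setting so that sends failing during automatic topic creation are retried; an illustrative producer configuration fragment (values are examples):

```
# Producer configuration (illustrative values)
retries=10              # retry sends that fail while the topic is auto-created
retry.backoff.ms=100    # time to wait between retries
```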

Custom Kerberos principal names must not be used for Kerberized ZooKeeper and Kafka instances

When using ZooKeeper authentication and a custom Kerberos principal, Kerberos-enabled Kafka does not start.

Affected Versions: CDK 2.0.0 and higher Powered By Apache Kafka

Workaround: None. You must disable ZooKeeper authentication for Kafka or use the default Kerberos principals for ZooKeeper and Kafka.

Performance degradation when SSL is enabled

Significant performance degradation can occur when SSL is enabled. The impact varies, depending on your CPU type and JVM version. The reduction is generally in the range 20-50%. Consumers are typically more affected than producers.

Affected Versions: CDK 2.0.0 and higher Powered By Apache Kafka

Workaround for CDK 2.1.0 and higher Powered By Apache Kafka: Configure brokers and clients with ssl.secure.random.implementation = SHA1PRNG. This often reduces the degradation drastically, but its effect is CPU and JVM dependent.

AdminUtils is not binary-compatible between CDK 1.x and 2.x Powered By Apache Kafka

The AdminUtils APIs have changed between CDK 1.x and 2.x Powered By Apache Kafka. If your application uses AdminUtils APIs, you must modify your application code to use the new APIs before you compile your application against CDK 2.x Powered By Apache Kafka.

Source cluster not definable in CDK 1.x Powered By Apache Kafka

In CDK 1.x Powered By Apache Kafka, the source cluster is assumed to be the cluster that MirrorMaker is running on. In CDK 2.0 Powered By Apache Kafka, you can define a custom source and target cluster.

Monitoring is not supported in Cloudera Manager 5.4

If you use CDK 1.2 Powered By Apache Kafka with Cloudera Manager 5.4, you must disable monitoring.

Authenticated Kafka clients may impersonate other users

Authenticated Kafka clients may impersonate any other user via a manually crafted protocol message with SASL/PLAIN or SASL/SCRAM authentication when using the built-in PLAIN or SCRAM server implementations in Apache Kafka.

Note that the SASL authentication mechanisms affected by this issue are neither recommended nor supported by Cloudera. In Cloudera Manager (CM) there are four choices: PLAINTEXT, SSL, SASL_PLAINTEXT, and SASL_SSL. The SASL/PLAIN mechanism described in this issue is not the same as the SASL_PLAINTEXT option in CM; that option uses Kerberos and is not affected. As a result, it is highly unlikely that Kafka is susceptible to this issue when managed by CM, unless the authentication protocol has been overridden by an Advanced Configuration Snippet (Safety Valve).

Products affected: CDK Powered by Apache Kafka

Releases affected: CDK 2.1.0 to 2.2.0, CDK 3.0

Users affected: All users

Detected by: Rajini Sivaram (rsivaram@apache.org)

Severity (Low/Medium/High): 8.3 (High) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:H)

Impact: Privilege escalation.

CVE: CVE-2017-12610

Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.

Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher

Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication

Authenticated clients may interfere with data replication

Authenticated Kafka users may perform an action reserved for the broker via a manually created fetch request, interfering with data replication and resulting in data loss.

Products affected: CDK Powered by Apache Kafka

Releases affected: CDK 2.0.0 to 2.2.0, CDK 3.0.0

Users affected: All users

Detected by: Rajini Sivaram (rsivaram@apache.org)

Severity (Low/Medium/High): 6.3 (Medium) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L)

Impact: Potential data loss due to improper replication.

CVE: CVE-2018-1288

Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.

Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher

Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication