CDH 6 includes Apache Kafka as part of the core package. The documentation includes improved content on how to set up, install, and administer your Kafka ecosystem. For more information, see the Cloudera Enterprise 6.0.x Apache Kafka Guide. We look forward to your feedback on both the existing and new documentation.

What's New in CDK Powered By Apache Kafka?

New Features in CDK 3.1.0 Powered By Apache Kafka

  • Rebase on Kafka 1.0.1

    CDK 3.1.0 Powered By Apache Kafka is a minor release based on Apache Kafka 1.0.1.

    For upstream release notes, see Apache Kafka version 1.0.0 and 1.0.1 release notes.

  • Kafka uses HA-capable Sentry client

    This functionality enables automatic failover in the event that the primary Sentry host goes down or is unavailable.

  • Wildcard usage for Kafka-Sentry components

    You can specify an asterisk (*) in a Kafka-Sentry command for the TOPIC component of a privilege to refer to any topic in the privilege. Supported with CDH 5.14.2.

    You can also use an asterisk (*) in a Kafka-Sentry command for the CONSUMERGROUPS component of a privilege to refer to any consumer group in the privilege. This is useful with Spark Streaming, where a generated group.id may be needed. Supported with CDH 5.14.2.

  • Health Tests in Cloudera Manager

    Two new Kafka Broker Health Tests have been added to Cloudera Manager:

    • Kafka Broker Swap Memory Usage
    • Kafka Broker Unexpected Exits

    These health tests are available when Kafka is managed by Cloudera Manager version 5.14 and later. For details, see Kafka Broker Health Tests.
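
The wildcard behavior described above for Kafka-Sentry privileges can be illustrated with a small matching sketch. This is not Sentry's implementation. The `Key=value->Key=value` privilege layout, the component names, and the functions `parse_privilege` and `privilege_allows` are assumptions made for illustration only.

```python
# Illustrative sketch only: this is NOT how Sentry is implemented.
# The "Key=value->Key=value" layout and every name below are assumptions.

def parse_privilege(privilege: str) -> dict:
    """Split a string such as 'Host=*->Topic=*->action=read' into a dict
    with lowercase component names as keys."""
    parts = {}
    for component in privilege.split("->"):
        key, _, value = component.partition("=")
        parts[key.strip().lower()] = value.strip()
    return parts


def privilege_allows(granted: str, requested: str) -> bool:
    """Return True if both privileges name the same components and every
    granted component is either '*' or equal to the requested value."""
    granted_parts = parse_privilege(granted)
    requested_parts = parse_privilege(requested)
    if granted_parts.keys() != requested_parts.keys():
        return False
    return all(
        value == "*" or value == requested_parts[key]
        for key, value in granted_parts.items()
    )
```

Under these assumptions, a grant of `Host=*->ConsumerGroup=*->action=read` would cover a request from any generated consumer group, which is the Spark Streaming case mentioned above.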

New Features in CDK 3.0.0 Powered By Apache Kafka

  • Rebase on Kafka 0.11.0.0

    CDK 3.0.0 Powered By Apache Kafka is a major release based on Apache Kafka 0.11.0.0. See https://archive.apache.org/dist/kafka/0.11.0.0/RELEASE_NOTES.html.

  • Health test for offline and lagging partitions

    New health tests set the controller broker's health to BAD if the broker hosts at least one offline partition, and set the leader broker's health to CONCERNING if it hosts any lagging partitions. Supported with Cloudera Manager 5.14.0.
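
The rule for the offline and lagging partition health tests can be written out as a short decision function. This is a sketch of the rule as stated above, not Cloudera Manager code. The function name, its parameters, and the `GOOD` fallback value are assumptions.

```python
# Sketch of the health rule as described in the release note.
# Function name, parameters, and the "GOOD" default are assumptions.

def broker_health(is_controller: bool,
                  offline_partitions: int,
                  lagging_led_partitions: int) -> str:
    """Classify one broker.

    offline_partitions: offline partitions hosted by this broker.
    lagging_led_partitions: lagging partitions this broker leads.
    """
    # The controller is marked BAD as soon as one partition is offline.
    if is_controller and offline_partitions > 0:
        return "BAD"
    # A leader with any lagging partition is marked CONCERNING.
    if lagging_led_partitions > 0:
        return "CONCERNING"
    return "GOOD"
```

The BAD check runs first so that a controller with both offline and lagging partitions reports the more severe state. That ordering is a design choice of this sketch.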

New Features in CDK 2.2.0 Powered By Apache Kafka

New Features in CDK 2.1.0 Powered By Apache Kafka

New Features in CDK 2.0.0 Powered By Apache Kafka

  • Rebase on Kafka 0.9

    CDK 2.0.0 Powered By Apache Kafka is rebased on Apache Kafka 0.9. See https://archive.apache.org/dist/kafka/0.9.0.0/RELEASE_NOTES.html.

  • Kerberos

    CDK 2.0.0 Powered By Apache Kafka supports Kerberos authentication of connections from clients and other brokers, including to ZooKeeper.

  • SSL

    CDK 2.0.0 Powered By Apache Kafka supports wire encryption of communications from clients and other brokers using SSL.

  • New Consumer API

    CDK 2.0.0 Powered By Apache Kafka includes a new Java API for consumers.

  • MirrorMaker

    MirrorMaker is enhanced to help prevent data loss and improve reliability of cross-data center replication.

  • Quotas

    You can use per-user quotas to throttle producer and consumer throughput in a multitenant cluster. See Quotas.
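
To make the throttling idea behind quotas concrete, the sketch below computes how long a client would need to pause so that its average throughput falls back to its quota. It is a simplified model, and the broker's actual calculation may differ. The function name and all parameters are hypothetical.

```python
# Simplified throttling model; the broker's real calculation may differ.
# All names here are hypothetical.

def throttle_delay_seconds(bytes_in_window: float,
                           window_seconds: float,
                           quota_bytes_per_second: float) -> float:
    """Return the pause needed so that the average rate over
    (window + pause) equals the quota. Returns 0.0 when under quota."""
    if window_seconds <= 0 or quota_bytes_per_second <= 0:
        raise ValueError("window and quota must be positive")
    observed_rate = bytes_in_window / window_seconds
    if observed_rate <= quota_bytes_per_second:
        return 0.0
    excess_rate = observed_rate - quota_bytes_per_second
    # Equivalent to bytes_in_window / quota - window_seconds.
    return excess_rate / quota_bytes_per_second * window_seconds
```

For example, a client that sends twice its quota within a one-second window would be delayed by about one second in this model, while a client under its quota is not delayed at all.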

New Features in CDK 1.4.0 Powered By Apache Kafka

New Features in CDK 1.3.2 Powered By Apache Kafka

New Features in CDK 1.3.0 Powered By Apache Kafka

  • Metrics Reporter

    Cloudera Manager now displays Kafka metrics. Use the values to identify current performance issues and plan enhancements to handle anticipated changes in workload. See Viewing Apache Kafka Metrics.

  • MirrorMaker configuration

    Cloudera Manager allows you to configure the Kafka MirrorMaker cross-cluster replication service. You can add a MirrorMaker role and use it to replicate to a machine in another cluster. See Kafka MirrorMaker.

New Features in CDK 1.1.0 Powered By Apache Kafka

  • New producer

    The producer added in CDK 1.1.0 Powered By Apache Kafka combines features of the existing synchronous and asynchronous producers. Send requests are batched, allowing the new producer to perform as well as the asynchronous producer under load. Every send request returns a response object that can be used to retrieve status and exceptions.

  • Ability to delete topics

    You can now delete topics using the kafka-topics --delete command.

  • Offset management

    In previous versions, consumers that wanted to keep track of which messages were consumed did so by updating the offset of the last consumed message in ZooKeeper. With this new feature, Kafka itself tracks the offsets. Using offset management can significantly improve consumer performance.

  • Automatic leader rebalancing

    Each partition starts with a randomly selected leader replica that handles requests for that partition. When a cluster first starts, the leaders are evenly balanced among hosts. When a broker restarts, leaders from that broker are distributed to other brokers, which results in an unbalanced distribution. With this feature enabled, leadership is reassigned to the original replica after the restart.

  • Connection quotas

    Kafka administrators can limit the number of connections allowed from a single IP address. By default, this limit is 10 connections per IP address. This prevents misconfigured or malicious clients from destabilizing a Kafka broker by opening a large number of connections and using all available file handles.
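
The per-IP connection limit can be pictured as a counter keyed by client address. The sketch below is a toy model and not broker code. The class `ConnectionLimiter` and its methods are hypothetical, and the default of 10 is taken from the text above.

```python
# Toy model of a per-IP connection limit; not broker code.
# ConnectionLimiter and its methods are hypothetical names.

class ConnectionLimiter:
    def __init__(self, max_per_ip: int = 10):
        self.max_per_ip = max_per_ip
        self._open = {}  # IP address -> number of open connections

    def try_accept(self, ip: str) -> bool:
        """Admit a new connection unless the IP is already at its limit."""
        current = self._open.get(ip, 0)
        if current >= self.max_per_ip:
            return False
        self._open[ip] = current + 1
        return True

    def release(self, ip: str) -> None:
        """Record that one connection from this IP has closed."""
        current = self._open.get(ip, 0)
        if current <= 1:
            self._open.pop(ip, None)
        else:
            self._open[ip] = current - 1
```

Because the count is kept per address, a single misbehaving client exhausts only its own allowance and connections from other addresses are still accepted.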