CDH 6 includes Apache Kafka as part of the core package. The documentation includes improved content on how to set up, install, and administer your Kafka ecosystem. For more information, see the Cloudera Enterprise 6.0.x Apache Kafka Guide. We look forward to your feedback on both the existing and new documentation.

CDK Powered By Apache Kafka Incompatible Changes and Limitations

Incompatible Changes and Limitations in CDK 4.1.0 Powered By Apache Kafka

Default Consumer Group ID Change

The default consumer group ID has changed from the empty string ("") to null. Consumers that use the new default group ID cannot subscribe to topics, fetch offsets, or commit offsets. The empty string is deprecated as a consumer group ID but will be supported until a future major release. Old clients that rely on the empty string group ID must now set it explicitly in their consumer configuration. For more information, see KIP-289.
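For a client that depended on the old default, the explicit setting might look like the following minimal sketch. It uses only java.util.Properties; the class name, method name, and broker address are hypothetical, while group.id, bootstrap.servers, and the deserializer settings are standard consumer configuration keys.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer settings for a legacy client that
// relied on the empty-string group ID being the default.
final class LegacyGroupIdConfig {

    static Properties consumerConfig(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The default is now null, so the deprecated empty-string group ID
        // has to be set explicitly to keep the old behaviour.
        props.setProperty("group.id", "");
        return props;
    }

    public static void main(String[] args) {
        // "broker1.example.com:9092" is a placeholder address.
        Properties props = consumerConfig("broker1.example.com:9092");
        System.out.println("group.id=[" + props.getProperty("group.id") + "]");
    }
}
```

The resulting Properties object would typically be passed to the Java consumer's constructor. Because the empty string is deprecated, assigning a real group name is likely the better long-term choice where the application allows it.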

Incompatible Changes and Limitations in CDK 4.0.0 Powered By Apache Kafka

Scala-based Clients API Removed

Scala-based clients were deprecated in a previous release and are removed as of CDK 4.0.0. The following Scala-based client implementations from package kafka.* (known as 'old clients') are affected:
  • kafka.consumer.*
  • kafka.producer.*
  • kafka.admin.*
Client applications that use these implementations must be migrated to the corresponding Java clients available in the org.apache.kafka.* package (the 'new clients'). Existing command line options and tools now use the new clients package.
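Part of such a migration is renaming configuration keys, because the old Scala producer and the Java producer use different property names. The sketch below translates a small, illustrative subset (metadata.broker.list to bootstrap.servers, request.required.acks to acks). The class and method names are hypothetical, the mapping is not exhaustive, and the String serializers are an assumption about the payload type.

```java
import java.util.Properties;

// Hypothetical helper: converts a few old Scala producer settings into
// their Java producer equivalents. Only a subset of keys is handled.
final class OldToNewProducerConfig {

    static Properties migrate(Properties oldConfig) {
        Properties props = new Properties();

        // Old clients listed brokers under metadata.broker.list.
        String brokers = oldConfig.getProperty("metadata.broker.list");
        if (brokers != null) {
            props.setProperty("bootstrap.servers", brokers);
        }

        // request.required.acks becomes acks; -1 is spelled "all" here
        // for readability.
        String acks = oldConfig.getProperty("request.required.acks");
        if (acks != null) {
            props.setProperty("acks", "-1".equals(acks) ? "all" : acks);
        }

        // Assumption: keys and values are Strings. The Java producer needs
        // explicit key and value serializers.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```

Application code that calls the kafka.producer.* or kafka.consumer.* classes directly still has to be rewritten against the Java client classes; the sketch covers configuration only.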

Properties for Exactly Once Semantics Not Available in Cloudera Manager

The configuration properties related to the idempotent and transactional capabilities of the producer are not available for configuration through Cloudera Manager. These properties must be set through the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties. For more information on configuration using safety valves, see Custom Configuration.

The following are the configuration properties related to the idempotent and transactional capabilities of the producer:
  • Broker Properties
    • transactional.id.expiration.ms
    • transaction.max.timeout.ms
    • transaction.state.log.replication.factor
    • transaction.state.log.num.partitions
    • transaction.state.log.min.isr
    • transaction.state.log.segment.bytes
  • Producer Properties
    • enable.idempotence
    • transaction.timeout.ms
    • transactional.id
  • Consumer Properties
    • isolation.level

For more information, see the upstream Apache Kafka documentation.
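On the client side, the producer and consumer properties listed above are ordinary client configuration keys. The following is a minimal sketch of how an application might set them, assuming String payloads. The class name, method names, and the 60000 ms timeout are illustrative assumptions.

```java
import java.util.Properties;

// Hypothetical helper: client-side settings for the idempotent and
// transactional producer and for a consumer that reads committed data only.
final class ExactlyOnceClientConfig {

    static Properties transactionalProducer(String bootstrapServers,
                                            String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("enable.idempotence", "true");
        props.setProperty("transactional.id", transactionalId);
        // Illustrative value; it must not exceed the broker's
        // transaction.max.timeout.ms.
        props.setProperty("transaction.timeout.ms", "60000");
        return props;
    }

    static Properties readCommittedConsumer(String bootstrapServers,
                                            String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Only return messages from committed transactions.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }
}
```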

Default Behaviour Changes in CDK 4.0.0 Powered By Apache Kafka

CDK 4.0.0 Powered By Apache Kafka introduces the following default behaviour changes:
  • Unclean leader election is automatically enabled by the controller when the unclean.leader.election.enable configuration is dynamically updated through a per-topic configuration override.
  • The default value for request.timeout.ms is decreased to 30 seconds. In addition, new logic has been added that makes JoinGroup requests ignore this timeout.
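Clients that depended on a longer request timeout can set request.timeout.ms explicitly instead of relying on the default. The following minimal sketch copies an existing client configuration and overrides that one key. The class and method names are hypothetical, and any timeout passed in is the caller's choice rather than a recommended value.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical helper: returns a copy of a client configuration with an
// explicit request.timeout.ms, leaving the original untouched.
final class RequestTimeoutConfig {

    static Properties withRequestTimeout(Properties base, Duration timeout) {
        Properties props = new Properties();
        props.putAll(base);
        // The property is expressed in milliseconds.
        props.setProperty("request.timeout.ms",
                Long.toString(timeout.toMillis()));
        return props;
    }
}
```

For example, withRequestTimeout(config, Duration.ofSeconds(60)) yields a configuration whose request.timeout.ms is 60000.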

Incompatible Changes and Limitations in CDK 3.1.0 Powered By Apache Kafka

Scala-based Clients API Deprecated

Scala-based clients are deprecated in this release and will be removed in an upcoming release. The following Scala-based client implementations from package kafka.* (known as 'old clients') are affected:
  • kafka.consumer.*
  • kafka.producer.*
  • kafka.admin.*
Client applications that use these implementations must be migrated to the corresponding Java clients available in the org.apache.kafka.* package (the 'new clients'). Existing command line options and tools now use the new clients package.

Incompatible Changes and Limitations in CDK 3.0.0 Powered By Apache Kafka

CDK 3.0 Requires CDH 5.13 when Co-located

Using version 3.0 and later of CDK Powered by Apache Kafka requires a newer version of Cloudera Manager and/or CDH when Kafka and CDH are in the same logical cluster in Cloudera Manager. For more information on compatibilities among versions, see Product Compatibility Matrix for CDK Powered By Apache Kafka.

Incompatible Changes and Limitations in CDK 2.0.0 Powered By Apache Kafka

Flume shipped with CDH 5.7 and lower can only send data to CDK 2.0 and higher Powered By Apache Kafka via unsecured transport.

Security additions to CDK 2.0 Powered By Apache Kafka are not supported by Flume in CDH 5.7 (or lower versions).

Topic Blacklist Removed

The MirrorMaker Topic blacklist setting has been removed in CDK 2.0 and higher Powered By Apache Kafka.

Avoid Data Loss Option Removed

The Avoid Data Loss option from earlier releases has been removed in CDK 2.0 Powered By Apache Kafka in favor of automatically setting the following properties:

  1. Producer settings
    • acks=all
    • retries=max integer
    • max.block.ms=max long
  2. Consumer setting
    • auto.commit.enable=false
  3. MirrorMaker setting
    • abort.on.send.failure=true
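These values are applied automatically, so no action is needed. For reference, the sketch below spells out what "max integer" and "max long" mean as literal values in Java. The class and method names are hypothetical, and only the producer and consumer settings from the list above are shown.

```java
import java.util.Properties;

// Hypothetical illustration of the producer and consumer settings that
// replace the Avoid Data Loss option.
final class AvoidDataLossSettings {

    static Properties producer() {
        Properties props = new Properties();
        props.setProperty("acks", "all");
        // "max integer" corresponds to Integer.MAX_VALUE.
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        // "max long" corresponds to Long.MAX_VALUE.
        props.setProperty("max.block.ms", Long.toString(Long.MAX_VALUE));
        return props;
    }

    static Properties consumer() {
        Properties props = new Properties();
        props.setProperty("auto.commit.enable", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println("retries=" + producer().getProperty("retries"));
        System.out.println("max.block.ms=" + producer().getProperty("max.block.ms"));
    }
}
```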