Apache Kafka is a distributed publish-subscribe messaging system designed to be fast, scalable, and durable. Its key strength is the ability to make high-volume data available as a real-time stream for consumption by systems with very different requirements. Kafka's flexibility makes it well suited to a wide variety of use cases, from replacing traditional message brokers to collecting user activity data and aggregating logs, operational application metrics, and device instrumentation. This reference paper provides an overview of general best practices for deploying and running Kafka as a component of Cloudera's Enterprise Data Hub.