This is the first installment in a short series of blog posts about security in Apache Kafka. In this article we will explain how to configure clients to authenticate with clusters using different authentication mechanisms.
Secured Apache Kafka clusters can be configured to enforce authentication using several different methods, Kerberos being one of them.
In this article we will start looking into Kerberos authentication and will focus on the client-side configuration required to authenticate with clusters configured to use Kerberos. The other authentication mechanisms will be covered in subsequent articles in this series.
We will not cover the server-side configuration in this article but will add some references to it when required to make the examples clearer.
The examples shown here will highlight the authentication-related properties in bold font to differentiate them from other required security properties, as in the example below. TLS is assumed to be enabled for the Apache Kafka cluster, as it should be for every secure cluster.
security.protocol=SASL_SSL
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks
We use the kafka-console-consumer for all the examples below. All the concepts and configurations apply to other applications as well.
Kerberos is by far the most common option we see being used in the field to secure Kafka clusters. It enables users to use their corporate identities, stored in services like Active Directory, RedHat IPA, and FreeIPA, which simplifies identity management. A kerberized Kafka cluster also makes it easier to integrate with other services in a Big Data ecosystem, which typically use Kerberos for strong authentication.
Kafka implements Kerberos authentication through the Simple Authentication and Security Layer (SASL) framework. SASL is an authentication framework, and a standard IETF protocol defined by RFC 4422. It supports multiple different authentication mechanisms and the one that implements Kerberos authentication is called GSSAPI.
The basic Kafka client properties that must be set to configure the Kafka client to authenticate via Kerberos are shown below:
# Uses SASL/GSSAPI over a TLS encrypted connection
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
# TLS truststore
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks
The configuration above uses Kerberos (SASL/GSSAPI) for authentication. TLS (SSL) is used for data encryption over the wire only.
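Applications that build their configuration in code rather than reading a properties file can put the same settings into a java.util.Properties object. The following is a minimal sketch under that assumption: the class and method names (KerberosClientProps, baseProperties) are hypothetical and not part of Kafka, and the truststore path is the one used in this article's examples. The resulting object would then be handed to a Kafka client, which is not shown here because it requires the kafka-clients library.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: collects the base SASL/GSSAPI settings.
final class KerberosClientProps {

    static Properties baseProperties(String truststorePath) {
        Properties props = new Properties();
        // SASL/GSSAPI (Kerberos) authentication over a TLS-encrypted connection
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "GSSAPI");
        // Service name part of the Kerberos principal the brokers run as
        props.setProperty("sasl.kerberos.service.name", "kafka");
        // Truststore used to validate the brokers' TLS certificates
        props.setProperty("ssl.truststore.location", truststorePath);
        return props;
    }

    public static void main(String[] args) {
        Properties props = baseProperties("/opt/cloudera/security/jks/truststore.jks");
        // Print the settings in key=value form (iteration order is not guaranteed)
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

As with the properties file, these settings alone do not carry any credentials.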
The properties above, though, don’t provide the client with the credentials it needs to authenticate with the Kafka cluster. We need some more information.
When using Kerberos, we can provide the credentials to the client application in two ways: either as a valid Kerberos ticket stored in a ticket cache, or as a keytab file, which the application can use to obtain a Kerberos ticket.
The handling of the Kerberos credentials in a Kafka client is done by the Java Authentication and Authorization Service (JAAS) library. So we need to configure the client with the necessary information so that JAAS knows where to get the credentials from.
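There are two ways to set those properties for the Kafka client: in a separate JAAS configuration file that the client is pointed to, or inline through the sasl.jaas.config Kafka client property.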
In this section we show how to use both methods. The examples in this article will use the sasl.jaas.config method for simplicity.
If you are using a JAAS configuration file you need to tell the Kafka Java client where to find it. This is done by setting the following Java property in the command line:
... -Djava.security.auth.login.config=/path/to/jaas.conf ...
If you’re using Kafka command-line tools in the Cloudera Data Platform (CDP) this can be achieved by setting the following environment variable:
$ export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/jaas.conf"
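If the client is embedded in your own Java application and changing the JVM command line is inconvenient, the same system property can also be set in code. The sketch below assumes this is done before the first Kafka client is created, since the JAAS configuration is typically loaded once, the first time it is needed. The class and method names are hypothetical, and the path is the same placeholder used above.

```java
// Hypothetical example class; only System.setProperty/getProperty are standard Java API.
final class JaasFileSetup {

    static final String JAAS_PROPERTY = "java.security.auth.login.config";

    // Points the JVM at a JAAS configuration file.
    // Assumption: this runs before any Kafka client is constructed.
    static void useJaasFile(String path) {
        System.setProperty(JAAS_PROPERTY, path);
    }

    public static void main(String[] args) {
        useJaasFile("/path/to/jaas.conf");
        System.out.println(JAAS_PROPERTY + "=" + System.getProperty(JAAS_PROPERTY));
    }
}
```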
The contents of the configuration file depend on where the credentials are being sourced from. To use a Kerberos ticket stored in the user’s ticket cache, use the following jaas.conf file:
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true;
};
To use a keytab, use the following instead:
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/alice.keytab"
  principal="alice@EXAMPLE.COM";
};
Instead of using a separate JAAS configuration file, I usually prefer setting the JAAS configuration for the client using the sasl.jaas.config Kafka property. This is usually simpler and gets rid of the additional configuration file (jaas.conf). The configurations below are equivalent to the jaas.conf configurations above.
Note: the settings below must be written in a single line. The semicolon at the end of the line is required.
To use a Kerberos ticket stored in a ticket cache:
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;
To use a keytab, use the following instead:
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/etc/security/keytabs/alice.keytab" principal="alice@EXAMPLE.COM";
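When the keytab path or principal differs per environment, it can be convenient to assemble the sasl.jaas.config value in code. The following is a minimal sketch of that idea: the class and method names are hypothetical, and it simply reproduces the two single-line values shown above, including the required trailing semicolon.

```java
// Hypothetical helper, not part of Kafka: builds values for the sasl.jaas.config property.
final class JaasConfigStrings {

    private static final String MODULE = "com.sun.security.auth.module.Krb5LoginModule";

    // Value that makes the client use a ticket from the user's ticket cache
    static String ticketCache() {
        return MODULE + " required useTicketCache=true;";
    }

    // Value that makes the client obtain a ticket using a keytab file
    static String keytab(String keytabPath, String principal) {
        return MODULE + " required useKeyTab=true"
                + " keyTab=\"" + keytabPath + "\""
                + " principal=\"" + principal + "\";";
    }

    public static void main(String[] args) {
        System.out.println("sasl.jaas.config=" + ticketCache());
        System.out.println("sasl.jaas.config="
                + keytab("/etc/security/keytabs/alice.keytab", "alice@EXAMPLE.COM"));
    }
}
```

The sketch does no escaping, so it assumes the path and principal contain no double quotes.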
The following is an example using the Kafka console consumer to read from a topic using Kerberos authentication and connecting directly to the broker (without using a load balancer):
# Complete configuration file for Kerberos auth using the ticket cache
$ cat krb-client.properties
security.protocol=SASL_SSL
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;
ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks

# Authenticate with Kerberos to get a valid ticket
$ kinit alice
Password for alice@REALM:

# Connect to Kafka using the ticket in the ticket cache
$ kafka-console-consumer \
    --bootstrap-server host-1.example.com:9093 \
    --topic test \
    --consumer.config /path/to/krb-client.properties
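Since the same configuration applies to other applications, a Java program can read a properties file like the one above with the standard java.util.Properties loader. The sketch below is only an illustration: the class name is hypothetical, and it parses an in-memory sample instead of a real file so that it runs anywhere. Note that the whole sasl.jaas.config value, spaces and inner equals sign included, is read as a single property value.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of Kafka: reads client settings in .properties format.
final class ClientPropsLoader {

    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the contents of krb-client.properties
        String sample = String.join("\n",
                "security.protocol=SASL_SSL",
                "sasl.mechanism=GSSAPI",
                "sasl.kerberos.service.name=kafka",
                "sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;",
                "ssl.truststore.location=/opt/cloudera/security/jks/truststore.jks");
        Properties props = load(new StringReader(sample));
        System.out.println(props.getProperty("sasl.jaas.config"));
    }
}
```

In a real application, a java.io.FileReader for the properties file would take the place of the StringReader.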
A central component of Kerberos is the Key Distribution Center (KDC). The KDC is the service that handles all the Kerberos authentication requests initiated by the client. For Kerberos authentication to work, both the Kafka cluster and the clients must have connectivity to the KDC.
In a corporate environment, this is easily achievable and it is usually the case. In some deployments, though, the KDC may be placed behind a firewall, making it impossible for the clients to reach it to get a valid ticket.
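A quick way to find out whether a firewall stands between a client host and the KDC is to attempt a plain TCP connection to it. The sketch below assumes the KDC listens on the standard Kerberos port 88 and uses kdc.example.com as a placeholder host name; the class and method names are hypothetical. A successful connection only shows that the port is reachable over TCP, not that authentication will succeed, and Kerberos clients may also use UDP.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; uses only standard java.net classes.
final class KdcReachability {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolvable host names
            return false;
        }
    }

    public static void main(String[] args) {
        // kdc.example.com is a placeholder; pass your KDC host as the first argument
        String host = args.length > 0 ? args[0] : "kdc.example.com";
        System.out.println("KDC reachable over TCP: " + canConnect(host, 88, 3000));
    }
}
```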
Cloud and hybrid deployments (cloud + on-prem) can make it a challenge for clients to use Kerberos authentication, as the on-prem KDC is usually not integrated into the cloud-deployed services. However, since Kafka supports other authentication mechanisms, clients have other alternatives at their disposal, as we’ll explore in the next article.
In the meantime, if you are interested in understanding Cloudera’s Kafka offering, download this white paper.