Enabling Replication Between Clusters with Kerberos Authentication

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

To enable replication between clusters, additional setup steps are required to ensure that the source and destination clusters can communicate.

Ports

When Kerberos authentication is enabled, Backup and Disaster Recovery (BDR) requires all the ports listed on the following page: Port Requirements for Backup and Disaster Recovery.

Additionally, the port used for the Kerberos KDC Server and KRB5 services must be open to all hosts on the destination cluster. By default, this is port 88.
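Before configuring replication, it can help to confirm that the KDC port is reachable from a destination cluster host. The following is a minimal sketch, not a Cloudera tool: the function name tcp_port_open is hypothetical, and it only checks TCP. Kerberos clients may also use UDP on the same port, which this check does not cover.

```python
import socket


def tcp_port_open(host, port=88, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run on a destination cluster host, a call such as tcp_port_open("kdc01.src.example.com", 88) returns False if a firewall blocks the port or the host name does not resolve. The host name here is the example one used later in this topic.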

Considerations for Realm Names

If the source and destination clusters each use Kerberos for authentication, use one of the following configurations to prevent conflicts when running replication jobs:
  • If the clusters do not use the same KDC (Kerberos Key Distribution Center), Cloudera recommends that you use a different realm name for each cluster. Additionally, if you are replicating across clusters in two different realms, follow the steps in HDFS, Hive, and Impala Replication and in Hive and Impala Replication in Cloudera Manager 5.11 and Lower, later in this topic, to set up trust between those clusters.
  • You can use the same realm name if the clusters use the same KDC or different KDCs that are part of a unified realm, for example, where one KDC is the master and the other is a slave KDC.

HDFS, Hive, and Impala Replication

  1. On the hosts in the destination cluster, ensure that the krb5.conf file (typically located at /etc/krb5.conf) on each host has the following information:
    • The KDC information for the source cluster's Kerberos realm. For example:
      [realms]
       SRC.EXAMPLE.COM = {
        kdc = kdc01.src.example.com:88
        admin_server = kdc01.src.example.com:749
        default_domain = src.example.com
       }
       DST.EXAMPLE.COM = {
        kdc = kdc01.dst.example.com:88
        admin_server = kdc01.dst.example.com:749
        default_domain = dst.example.com
       }
    • Realm mapping for the source cluster domain. You configure these mappings in the [domain_realm] section. For example:
      [domain_realm]
       .dst.example.com = DST.EXAMPLE.COM
       dst.example.com = DST.EXAMPLE.COM
       .src.example.com = SRC.EXAMPLE.COM
       src.example.com = SRC.EXAMPLE.COM
  2. On the destination cluster, use Cloudera Manager to add the realm of the source cluster to the Trusted Kerberos Realms configuration property:
    1. Go to the HDFS service.
    2. Click the Configuration tab.
    3. In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
    4. Click the plus sign icon, and then enter the source cluster realm.
    5. Click Save Changes to commit the changes.
  3. Go to Administration > Settings.
  4. In the search field, type domain name.
  5. In the Domain Name(s) field, enter any domain or host names you want to map to the destination cluster KDC. Use the plus sign icon to add as many entries as you need. The entries in this property are used to generate the domain_realm section in krb5.conf.
  6. If domain_realm is configured in the Advanced Configuration Snippet (Safety Valve) for remaining krb5.conf, remove the entries for it.
  7. Click Save Changes to commit the changes.
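To double-check step 1 on a host, you could scan the krb5.conf text for the expected realm and domain_realm entries. The following is a rough sketch under stated assumptions: parse_krb5 and missing_for_realm are hypothetical helper names, SAMPLE is a shortened version of the example above, and the parser handles only the simple layout shown in this topic (no include directives or other krb5.conf features).

```python
import re

# Shortened sample based on the example krb5.conf entries in this topic.
SAMPLE = """\
[realms]
 SRC.EXAMPLE.COM = {
  kdc = kdc01.src.example.com:88
 }
 DST.EXAMPLE.COM = {
  kdc = kdc01.dst.example.com:88
 }
[domain_realm]
 .src.example.com = SRC.EXAMPLE.COM
 src.example.com = SRC.EXAMPLE.COM
"""


def parse_krb5(text):
    """Return (realm names in [realms], mappings in [domain_realm])."""
    realms, domain_realm = set(), {}
    section, depth = None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":  # skip blank lines and comments
            continue
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header and depth == 0:
            section = header.group(1)
            continue
        if section == "realms":
            # A realm entry opens a brace block at the top level of the section.
            if depth == 0 and line.endswith("{"):
                realms.add(line.split("=", 1)[0].strip())
            depth += line.count("{") - line.count("}")
        elif section == "domain_realm" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            domain_realm[key] = value
    return realms, domain_realm


def missing_for_realm(text, realm, domain):
    """List what krb5.conf text lacks for the given realm and domain."""
    realms, mapping = parse_krb5(text)
    problems = []
    if realm not in realms:
        problems.append(f"[realms] has no entry for {realm}")
    for key in (domain, "." + domain):
        if mapping.get(key) != realm:
            problems.append(f"[domain_realm] does not map {key} to {realm}")
    return problems
```

For the sample text, missing_for_realm(SAMPLE, "SRC.EXAMPLE.COM", "src.example.com") returns an empty list, while the same call for DST.EXAMPLE.COM and dst.example.com reports the two missing domain_realm mappings.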

Hive and Impala Replication in Cloudera Manager 5.11 and Lower

  1. Perform the procedure described in the previous section.
  2. On the hosts in the source cluster, ensure that the krb5.conf file on each host has the following information:
    • The KDC information for the destination cluster's Kerberos realm.
    • Domain/host-to-realm mapping for the destination cluster NameNode hosts.
  3. On the source cluster, use Cloudera Manager to add the realm of the destination cluster to the Trusted Kerberos Realms configuration property.
    1. Go to the HDFS service.
    2. Click the Configuration tab.
    3. In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
    4. Enter the destination cluster realm.
    5. Click Save Changes to commit the changes.

    It is not necessary to restart any services on the source cluster.

Kerberos Connectivity Test

As part of Test Connectivity, Cloudera Manager tests for properly configured Kerberos authentication on the source and destination clusters that run the replication. Test Connectivity runs automatically when you add a peer for replication, or you can manually initiate Test Connectivity from the Actions menu.

This feature is available when the source and destination clusters run Cloudera Manager 5.12 or later. You can disable the Kerberos connectivity test by setting feature_flag_test_kerberos_connectivity to false with the Cloudera Manager API: api/<version>/cm/config.
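As an illustration, the request that sets the flag could be built as follows. This is a sketch only: the function name build_flag_request is hypothetical, and the PUT method, the JSON body shape (a list of name/value items), and basic authentication are assumptions that you should verify against the Cloudera Manager API documentation for your version. The code builds the request but does not send it.

```python
import base64
import json
import urllib.request


def build_flag_request(base_url, api_version, user, password):
    """Build (but do not send) a request that sets the feature flag to false."""
    url = f"{base_url.rstrip('/')}/api/{api_version}/cm/config"
    # Assumed body shape: a config list with one name/value item.
    body = json.dumps({
        "items": [
            {"name": "feature_flag_test_kerberos_connectivity", "value": "false"}
        ]
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

For example, build_flag_request("https://cm.example.com:7183", "v19", "admin", "secret") targets https://cm.example.com:7183/api/v19/cm/config. The host, port, API version, and credentials are placeholders. Passing the result to urllib.request.urlopen would send the request.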

If the test detects any issues with the Kerberos configuration, Cloudera Manager provides resolution steps based on whether Cloudera Manager manages the Kerberos configuration file.

Cloudera Manager tests the following scenarios:
  • Whether both clusters have Kerberos enabled. If one cluster uses Kerberos but the other does not, replication is not supported.
  • Whether both clusters are in the same Kerberos realm. Clusters in the same realm must share the same KDC or the KDCs must be in a unified realm.
  • Whether clusters are in different Kerberos realms. If the clusters are in different realms, the destination cluster must be configured according to the following criteria:
    • Destination HDFS services must have the correct Trusted Kerberos Realms setting.
    • The krb5.conf file has the correct domain_realm mapping on all the hosts.
    • The krb5.conf file has the correct realms information on all the hosts.
  • Whether the local and peer KDC are running on an available port. The default port is 88.
After running the tests, Cloudera Manager makes recommendations to resolve any Kerberos configuration issues that it found.
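The scenarios above can be read as a small decision procedure. The following sketch models that reading. It is not Cloudera Manager's implementation: the ClusterKerberos class, its fields, and check_replication are hypothetical names introduced for illustration, and the shared-KDC requirement for clusters in the same realm is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterKerberos:
    """Hypothetical summary of one cluster's Kerberos state."""
    kerberos_enabled: bool
    realm: str = ""
    trusted_realms: set = field(default_factory=set)  # HDFS Trusted Kerberos Realms
    krb5_realms: set = field(default_factory=set)     # realms listed in krb5.conf
    domain_realm_ok: bool = True                      # domain_realm mapping is correct
    kdc_reachable: bool = True                        # KDC answers on its port


def check_replication(source, destination):
    """Return a list of findings; an empty list means no issue was detected."""
    if source.kerberos_enabled != destination.kerberos_enabled:
        return ["Kerberos is enabled on only one cluster; replication is not supported."]
    if not source.kerberos_enabled:
        return []
    findings = []
    if not (source.kdc_reachable and destination.kdc_reachable):
        findings.append("A KDC is not reachable on its configured port (default 88).")
    if source.realm == destination.realm:
        return findings  # same realm: no cross-realm settings to check
    # Different realms: the destination cluster must trust and know the source realm.
    if source.realm not in destination.trusted_realms:
        findings.append("Destination HDFS Trusted Kerberos Realms lacks the source realm.")
    if source.realm not in destination.krb5_realms:
        findings.append("Destination krb5.conf lacks realms information for the source realm.")
    if not destination.domain_realm_ok:
        findings.append("Destination krb5.conf has an incorrect domain_realm mapping.")
    return findings
```

With this model, a destination cluster in a different realm that has neither the trusted realm nor the krb5.conf realms entry for the source produces two findings, which correspond to the first and third criteria in the list above.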

Kerberos Recommendations

If Cloudera Manager manages the Kerberos configuration file, Cloudera Manager configures Kerberos correctly for you and then provides the set of commands that you must manually run to finish configuring the clusters. The following screen shots show the prompts that Cloudera Manager provides in cases of improper configuration:

Configuration changes:


Steps to complete configuration:


If Cloudera Manager does not manage the Kerberos configuration file, Cloudera Manager provides the manual steps required to correct the issue. For example, the following screen shot shows the steps required to properly configure Kerberos: