Step 4: Create and Deploy the Kerberos Principals and Keytab Files

A Kerberos principal represents a unique identity in a Kerberos-secured system. Kerberos assigns tickets to principals so that they can access Kerberos-secured Hadoop services. For Hadoop, principals should have the format username/fully.qualified.domain.name@YOUR-REALM.COM. In this guide, username in that format refers to the username of an existing Unix account, such as hdfs or mapred.
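
As an illustration of the principal format, the following is a minimal sketch that assembles a principal name from its three parts. The function name build_principal and the example host and realm (host1.example.com, EXAMPLE.COM) are hypothetical and are not part of Kerberos or Hadoop.

```shell
# Sketch: build a principal of the form username/fqdn@REALM.
# build_principal is a hypothetical helper used only for illustration.
build_principal() (
    user=$1
    fqdn=$2
    realm=$3
    if [ -z "$user" ] || [ -z "$fqdn" ] || [ -z "$realm" ]; then
        echo "usage: build_principal USER FQDN REALM" >&2
        exit 1
    fi
    printf '%s/%s@%s\n' "$user" "$fqdn" "$realm"
)

# Example with placeholder values:
build_principal hdfs host1.example.com EXAMPLE.COM
# prints: hdfs/host1.example.com@EXAMPLE.COM
```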

A keytab is a file containing pairs of Kerberos principals and encrypted copies of those principals' keys. Keytab files are unique to each host because the principal names they contain include the hostname. A keytab is used to authenticate a principal on a host to Kerberos without human interaction and without storing a password in a plain text file. Because access to a principal's keytab file allows anyone to act as that principal, access to keytab files should be tightly restricted: they should be readable by a minimal set of users, stored on local disk, and excluded from machine backups unless access to those backups is as secure as access to the local machine.

When to Use kadmin.local and kadmin

When creating the Kerberos principals and keytabs, you can use kadmin.local or kadmin depending on your access and account:

  • If you have root access to the KDC machine, but you do not have a Kerberos admin account, use kadmin.local.
  • If you do not have root access to the KDC machine, but you do have a Kerberos admin account, use kadmin.
  • If you have both root access to the KDC machine and a Kerberos admin account, you can use either one.

To start kadmin.local (on the KDC machine) or kadmin (from any machine), run one of these commands:

$ sudo kadmin.local

OR:

$ kadmin
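
The choice between the two tools can be written as a small decision rule. This is only a sketch: choose_kadmin is a hypothetical helper, its "yes"/"no" arguments are supplied by you rather than detected, and it arbitrarily prefers kadmin.local when both options are available.

```shell
# Sketch: pick the admin tool from the two access conditions.
# choose_kadmin HAS_ROOT_ON_KDC HAS_ADMIN_ACCOUNT (each "yes" or "no").
# choose_kadmin is a hypothetical helper, not a Kerberos command.
choose_kadmin() (
    has_root=$1
    has_admin=$2
    if [ "$has_root" = yes ]; then
        # Root on the KDC is enough; a Kerberos admin account is not required.
        echo "sudo kadmin.local"
    elif [ "$has_admin" = yes ]; then
        # No root on the KDC, so authenticate as a Kerberos admin instead.
        echo "kadmin"
    else
        echo "need root on the KDC or a Kerberos admin account" >&2
        exit 1
    fi
)

choose_kadmin yes no   # prints: sudo kadmin.local
choose_kadmin no yes   # prints: kadmin
```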

To create the Kerberos principals

Perform the following steps for every host in your cluster. Run the commands in the kadmin.local or kadmin shell, replacing fully.qualified.domain.name with the fully qualified domain name of each host and YOUR-REALM.COM with the name of the Kerberos realm your Hadoop cluster is in.

  1. In the kadmin.local or kadmin shell, create the hdfs principal. This principal is used for the NameNode, Secondary NameNode, and DataNodes.
    kadmin:  addprinc -randkey hdfs/fully.qualified.domain.name@YOUR-REALM.COM
  2. Create the mapred principal. If you are using MRv1, the mapred principal is used for the JobTracker and TaskTrackers. If you are using YARN, the mapred principal is used for the MapReduce Job History Server.
    kadmin:  addprinc -randkey mapred/fully.qualified.domain.name@YOUR-REALM.COM
  3. YARN only: Create the yarn principal. This principal is used for the ResourceManager and NodeManager.
    kadmin:  addprinc -randkey yarn/fully.qualified.domain.name@YOUR-REALM.COM
  4. Create the HTTP principal.
    kadmin:  addprinc -randkey HTTP/fully.qualified.domain.name@YOUR-REALM.COM
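
Because these commands are repeated for every host, it can help to generate them. The following is a sketch under stated assumptions: gen_addprinc is a hypothetical helper, the host names and realm are placeholders, and the function only prints the addprinc commands without contacting the KDC.

```shell
# Sketch: print the addprinc commands for each host in a list.
# gen_addprinc REALM USE_YARN HOST...   (USE_YARN is "yes" or "no")
# gen_addprinc is a hypothetical helper; it does not run kadmin itself.
gen_addprinc() (
    realm=$1
    use_yarn=$2
    shift 2
    services="hdfs mapred"
    if [ "$use_yarn" = yes ]; then
        services="$services yarn"   # the yarn principal is needed only with YARN
    fi
    services="$services HTTP"
    for host in "$@"; do
        for svc in $services; do
            printf 'addprinc -randkey %s/%s@%s\n' "$svc" "$host" "$realm"
        done
    done
)

# Example with placeholder hosts and realm:
gen_addprinc EXAMPLE.COM yes host1.example.com host2.example.com

# Each printed line could then be passed to the admin tool's -q option, for example:
#   gen_addprinc EXAMPLE.COM yes host1.example.com |
#       while IFS= read -r cmd; do sudo kadmin.local -q "$cmd"; done
```

Printing the commands first makes it possible to review them before anything is changed on the KDC.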

To create the Kerberos keytab files

Perform the following steps for every host in your cluster. Run the commands in the kadmin.local or kadmin shell, replacing fully.qualified.domain.name with the fully qualified domain name of each host:

  1. Create the hdfs keytab file that will contain the hdfs principal and HTTP principal. This keytab file is used for the NameNode, Secondary NameNode, and DataNodes.
    kadmin:  xst -norandkey -k hdfs.keytab hdfs/fully.qualified.domain.name HTTP/fully.qualified.domain.name
  2. Create the mapred keytab file that will contain the mapred principal and HTTP principal. If you are using MRv1, the mapred keytab file is used for the JobTracker and TaskTrackers. If you are using YARN, the mapred keytab file is used for the MapReduce Job History Server.
    kadmin:  xst -norandkey -k mapred.keytab mapred/fully.qualified.domain.name HTTP/fully.qualified.domain.name
  3. YARN only: Create the yarn keytab file that will contain the yarn principal and HTTP principal. This keytab file is used for the ResourceManager and NodeManager.
    kadmin:  xst -norandkey -k yarn.keytab yarn/fully.qualified.domain.name HTTP/fully.qualified.domain.name
  4. Use klist to display the keytab file entries. A correctly created hdfs keytab file should look something like this:
    $ klist -e -k -t hdfs.keytab
    Keytab name: WRFILE:hdfs.keytab
    slot KVNO Principal
    ---- ---- ---------------------------------------------------------------------
       1    7    HTTP/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)
       2    7    HTTP/fully.qualified.domain.name@YOUR-REALM.COM (Triple DES cbc mode with HMAC/sha1)
       3    7    hdfs/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)
       4    7    hdfs/fully.qualified.domain.name@YOUR-REALM.COM (Triple DES cbc mode with HMAC/sha1)
  5. Continue with the next section, To deploy the Kerberos keytab files.
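
The klist check can also be scripted. The sketch below reads a klist -k listing on standard input and reports any expected principal that does not appear in it. keytab_has_principals is a hypothetical helper, and it uses a plain substring match, so it confirms only that each name is present, not the key version or encryption type.

```shell
# Sketch: verify that a keytab listing contains the expected principals.
# keytab_has_principals PRINCIPAL...   (reads `klist -k` output on stdin)
# keytab_has_principals is a hypothetical helper, not a Kerberos command.
keytab_has_principals() (
    listing=$(cat)
    status=0
    for principal in "$@"; do
        # -F treats the principal as a fixed string rather than a regular expression.
        if ! printf '%s\n' "$listing" | grep -qF -- "$principal"; then
            echo "missing: $principal" >&2
            status=1
        fi
    done
    exit "$status"
)

# Example usage with placeholder names (requires klist and an existing keytab):
#   klist -k hdfs.keytab | keytab_has_principals \
#       hdfs/host1.example.com@EXAMPLE.COM HTTP/host1.example.com@EXAMPLE.COM
```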

To deploy the Kerberos keytab files

On every node in the cluster, repeat the following steps to deploy the hdfs.keytab and mapred.keytab files. If you are using YARN, also deploy the yarn.keytab file.

  1. On the host machine, copy or move the keytab files to a directory that Hadoop can access, such as /etc/hadoop/conf.
    1. If you are using MRv1:

      $ sudo mv hdfs.keytab mapred.keytab /etc/hadoop/conf/

      If you are using YARN:

      $ sudo mv hdfs.keytab mapred.keytab yarn.keytab /etc/hadoop/conf/
    2. Make sure that the hdfs.keytab file is readable only by the hdfs user, and that the mapred.keytab file is readable only by the mapred user.
      $ sudo chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab
      $ sudo chown mapred:hadoop /etc/hadoop/conf/mapred.keytab
      $ sudo chmod 400 /etc/hadoop/conf/*.keytab
    3. YARN only: Make sure that the yarn.keytab file is readable only by the yarn user.
      $ sudo chown yarn:hadoop /etc/hadoop/conf/yarn.keytab
      $ sudo chmod 400 /etc/hadoop/conf/yarn.keytab
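
After deployment, the ownership and mode of each keytab can be verified. This is a minimal sketch that assumes GNU stat (the -c option); BSD and macOS versions of stat use different flags. check_keytab is a hypothetical helper name.

```shell
# Sketch: confirm that a keytab has mode 400 and the expected owner.
# check_keytab FILE OWNER
# Assumes GNU stat; check_keytab is a hypothetical helper.
check_keytab() (
    file=$1
    owner=$2
    # %a is the octal permission mode, %U is the owner's user name.
    actual=$(stat -c '%a %U' "$file") || exit 1
    if [ "$actual" != "400 $owner" ]; then
        echo "$file: expected '400 $owner', found '$actual'" >&2
        exit 1
    fi
)

# Example usage on a cluster node:
#   check_keytab /etc/hadoop/conf/hdfs.keytab hdfs
#   check_keytab /etc/hadoop/conf/mapred.keytab mapred
```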