Introduction to Kerberos Principals and Keytabs
Hadoop security uses Kerberos principals and keytabs to perform user authentication on all remote procedure calls (RPCs). A Kerberos principal represents a unique identity in a Kerberos-secured system. Kerberos issues tickets to principals to enable them to access Kerberos-secured Hadoop services. For the Hadoop daemon principals, the principal names should be of the format username/fully.qualified.domain.name@YOUR-REALM.COM. In this guide, the term username in the username/fully.qualified.domain.name@YOUR-REALM.COM principal refers to the username of an existing Unix account that is used by Hadoop daemons, such as hdfs or mapred. Human users who want to access the Hadoop cluster also need to have Kerberos principals; in this case, username refers to the username of the user's Unix account, such as joe or jane. Single-component principal names (such as joe@YOUR-REALM.COM) are acceptable for client user accounts. Hadoop does not support principal names with more than two components.
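The naming rules above can be illustrated with a small sketch. This is not Hadoop's own parsing code; the Principal type and the parse_principal function are hypothetical names introduced only for illustration. The sketch splits a principal name into username, optional hostname, and realm, and rejects names with more than two components.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    """Hypothetical container for the parts of a principal name."""
    username: str
    hostname: Optional[str]  # None for single-component names such as joe@REALM
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'username[/hostname]@REALM' into its parts.

    Raises ValueError if the realm is missing or if the name has more
    than two components before the '@'.
    """
    # Split on the last '@' so the realm is everything after it.
    components_part, sep, realm = name.rpartition("@")
    if not sep or not components_part or not realm:
        raise ValueError(f"missing username or realm: {name!r}")

    components = components_part.split("/")
    if len(components) > 2 or any(not c for c in components):
        raise ValueError(f"expected one or two components: {name!r}")

    username = components[0]
    hostname = components[1] if len(components) == 2 else None
    return Principal(username, hostname, realm)
```

For example, parse_principal("hdfs/fully.qualified.domain.name@YOUR-REALM.COM") yields a daemon-style principal with a hostname, while parse_principal("joe@YOUR-REALM.COM") yields a client principal whose hostname is None.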
A keytab is a file containing pairs, each consisting of a Kerberos principal and an encrypted copy of that principal's key. A keytab file for a Hadoop daemon is unique to each host because the principal names include the hostname. This file is used to authenticate a principal on a host to Kerberos without human interaction and without storing a password in a plain text file. Because anyone with access to the keytab file for a principal can act as that principal, access to keytab files should be tightly secured. Keytab files should be readable by a minimal set of users, should be stored on local disk, and should not be included in machine backups unless access to those backups is as secure as access to the local machine.
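One way to audit the access restriction described above is to inspect the file mode of each keytab. The following is a minimal sketch, not a tool shipped with CDH. The function name keytab_permission_problems is hypothetical, and the sketch assumes the strictest reading of "a minimal set of users", namely that only the file owner has any access; sites that deliberately grant group read access would need to relax the check.

```python
import os
import stat
from typing import List


def keytab_permission_problems(path: str) -> List[str]:
    """Return a list of permission problems found for the file at path.

    Assumption for this sketch: a keytab should be a regular file with
    no permission bits set for group or other.
    """
    st = os.stat(path)
    problems = []

    if not stat.S_ISREG(st.st_mode):
        problems.append("not a regular file")

    # Any read, write, or execute bit for group is flagged.
    if st.st_mode & stat.S_IRWXG:
        problems.append("group has access")

    # Any read, write, or execute bit for other is flagged.
    if st.st_mode & stat.S_IRWXO:
        problems.append("other users have access")

    return problems
```

An empty list means the file passed these checks; for example, a keytab with mode 0600 passes, while one with mode 0644 is reported because group and other users can read it.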
For more details about the security features in CDH3, see the "Introduction to Hadoop Security" section of the CDH3 Security Guide. For more details about the security features in CDH4, see the "Introduction to Hadoop Security" section of the CDH4 Security Guide.