Step 6: Enable Hadoop Security

Cloudera recommends that each Hadoop configuration file have the same contents on every machine in the cluster.

To enable Hadoop security, add the following properties to the core-site.xml file on every machine in the cluster:

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value> <!-- A value of "simple" would disable security. -->
</property>

<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

Enabling Service-Level Authorization for Hadoop Services

Service-level authorization controls access to a cluster at a coarse-grained level: it determines which users may connect to a given Hadoop service at all. For example, when Authorized Users and Authorized Groups are set up properly, an unauthorized user cannot use the hdfs shell to list the contents of HDFS. This also limits the exposure of world-readable files to an explicit set of users instead of all authenticated users, which could be, for example, every user in Active Directory.

The hadoop-policy.xml file maintains access control lists (ACLs) for Hadoop services. Each ACL consists of a comma-separated list of users and a comma-separated list of groups, with the two lists separated by a single space. For example:
user_a,user_b group_a,group_b

To specify only a set of users, add a comma-separated list of users followed by a blank space. Similarly, to specify only authorized groups, put a blank space before the comma-separated list of groups. A value of * gives access to all users.
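
The users-only and groups-only forms described above can be sketched as follows. This is an illustration only: the properties security.client.protocol.acl and security.namenode.protocol.acl were picked as examples of entries in hadoop-policy.xml, and ann, bob, group_a, and group_b are placeholder names rather than values your cluster needs.

```xml
<!-- Users only (assumed example): a single space follows the last user name. -->
<property>
    <name>security.client.protocol.acl</name>
    <value>ann,bob </value>
</property>

<!-- Groups only (assumed example): a single space precedes the first group name. -->
<property>
    <name>security.namenode.protocol.acl</name>
    <value> group_a,group_b</value>
</property>
```

Going by the format described above, the leading space in the groups-only form is what distinguishes it from a list of users, so it is worth checking that an editor or formatter has not removed it.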

For example, to give the users ann and bob and the groups group_a and group_b access to Hadoop's DatanodeProtocol service, modify the security.datanode.protocol.acl property in hadoop-policy.xml. Similarly, to give all users access to the InterTrackerProtocol service, modify security.inter.tracker.protocol.acl as follows:
<property>
    <name>security.datanode.protocol.acl</name>
    <value>ann,bob group_a,group_b</value>
    <description>ACL for DatanodeProtocol, which is used by datanodes to 
    communicate with the namenode.</description>
</property>

<property>
    <name>security.inter.tracker.protocol.acl</name>
    <value>*</value>
    <description>ACL for InterTrackerProtocol, which is used by tasktrackers to 
    communicate with the jobtracker.</description>
</property>

For more details, see Service-Level Authorization in Hadoop.