This is the documentation for Cloudera 5.3.x.
Documentation for other versions is available at Cloudera Documentation.

Configuring LDAP Group Mappings

  Important: Cloudera strongly recommends against using Hadoop's LdapGroupsMapping provider. LdapGroupsMapping should only be used in cases where OS-level integration is not possible. Production clusters require an identity provider that works well with all applications, not just Hadoop. Hence, often the preferred mechanism is to use tools such as SSSD, VAS or Centrify to replicate LDAP groups.
When configuring LDAP for group mappings in Hadoop, you must create the users and groups for your Hadoop services in LDAP. When using the default shell-based group mapping provider (org.apache.hadoop.security.ShellBasedUnixGroupsMapping), the requisite user and group relationships already exist because they are created during the installation procedure. When you switch to LDAP as the group mapping provider, you must re-create these relationships within LDAP.

Not that if you have modified the System User or System Group setting within Cloudera Manager for any service, you must use those custom values to provision the users and groups in LDAP.

The table below lists users and their group members for CDH services:
  Note: Cloudera Manager 5.3 introduces a new single user mode. In single user mode, the Cloudera Manager Agent and all the processes run by services managed by Cloudera Manager are started as a single configured user and group. See Single User Mode Requirements for more information.
Table 1. Users and Groups

Component (Version)

Unix User ID Groups Notes
Cloudera Manager (all versions) cloudera-scm cloudera-scm Cloudera Manager processes such as the Cloudera Manager Server and the monitoring roles run as this user.
The Cloudera Manager keytab file must be named cmf.keytab since that name is hard-coded in Cloudera Manager.
  Note: Applicable to clusters managed by Cloudera Manager only.
Apache Accumulo (Accumulo 1.4.3 and higher) accumulo accumulo Accumulo processes run as this user.
Apache Avro   No special users.
Apache Flume (CDH 4, CDH 5) flume flume The sink that writes to HDFS as this user must have write privileges.
Apache HBase (CDH 4, CDH 5) hbase hbase The Master and the RegionServer processes run as this user.
HDFS (CDH 4, CDH 5) hdfs hdfs, hadoop The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.
Apache Hive (CDH 4, CDH 5) hive hive

The HiveServer2 process and the Hive Metastore processes run as this user.

A user must be defined for Hive access to its Metastore DB (e.g. MySQL or Postgres) but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml.

Apache HCatalog (CDH 4.2 and higher, CDH 5) hive hive

The WebHCat service (for REST access to Hive functionality) runs as the hive user.

HttpFS (CDH 4, CDH 5) httpfs httpfs

The HttpFS service runs as this user. See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file.

Hue (CDH 4, CDH 5) hue hue

Hue services run as this user.

Cloudera Impala (CDH 4.1 and higher, CDH 5) impala impala, hadoop, hdfs, hive Impala services run as this user.
Apache Kafka (Cloudera Labs 1.0.0) kafka kafka Kafka services run as this user.
  Important: Cloudera does not currently recommend this feature for production use and it is not supported.
KMS (File) (CDH 5.2.1 and higher) kms kms The KMS (File) service runs as this user.
KMS (Navigator Key Trustee) (CDH 5.3 and higher) kms kms The KMS (Navigator Key Trustee) service runs as this user.
Llama (CDH 5) llama llama Llama runs as this user.
Apache Mahout   No special users.
MapReduce (CDH 4, CDH 5) mapred mapred, hadoop Without Kerberos, the JobTracker and tasks run as this user. The LinuxTaskController binary is owned by this user for Kerberos.
Apache Oozie (CDH 4, CDH 5) oozie oozie The Oozie service runs as this user.
Parquet   No special users.
Apache Pig   No special users.
Cloudera Search (CDH 4.3 and higher, CDH 5) solr solr The Solr processes run as this user.
Apache Spark (CDH 5) spark spark The Spark History Server process runs as this user.
Apache Sentry (incubating) (CDH 5.1 and higher) sentry sentry The Sentry service runs as this user.
Apache Sqoop (CDH 4, CDH 5) sqoop sqoop This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.
Apache Sqoop2 (CDH 4.2 and higher, CDH 5) sqoop2 sqoop, sqoop2 The Sqoop2 service runs as this user.
Apache Whirr   No special users.
YARN (CDH 4, CDH 5) yarn yarn, hadoop Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos.
Apache ZooKeeper (CDH 4, CDH 5) zookeeper zookeeper The ZooKeeper processes run as this user. It is not configurable.
  Important:
  • You can use either Cloudera Manager or the following command-line instructions to complete this configuration.
  • This information applies specifically to CDH 5.3.x. If you use an earlier version of CDH, see the documentation for that version located at Cloudera Documentation.

Using Cloudera Manager

Required Role:

Make the following changes to the HDFS service's security configuration:
  1. Open the Cloudera Manager Admin Console and navigate to the HDFS service.
  2. Click the Configuration tab.
  3. Modify the following configuration properties under the Service-Wide > Security section. The table below lists the properties and the value to be set for each property.
    Configuration Property Value
    Hadoop User Group Mapping Implementation org.apache.hadoop.security.LdapGroupsMapping
    Hadoop User Group Mapping LDAP URL ldap://<server>
    Hadoop User Group Mapping LDAP Bind User Administrator@example.com
    Hadoop User Group Mapping LDAP Bind User Password ***
    Hadoop User Group Mapping Search Base dc=example,dc=com
Although the above changes are sufficient to configure group mappings for Active Directory, some changes to the remaining default configurations might be required for OpenLDAP.

Using the Command Line

Add the following properties to the core-site.xml on the NameNode:
<property>    
<name>hadoop.security.group.mapping</name>
<value>org.apache.hadoop.security.LdapGroupsMapping</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.url</name>
<value>ldap://server</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.bind.user</name>
<value>Administrator@example.com</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.bind.password</name>    
<value>****</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.base</name>
<value>dc=example,dc=com</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.filter.user</name>
<value>(&amp;(objectClass=user)(sAMAccountName={0}))</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.filter.group</name>  
<value>(objectClass=group)</value>  
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.attr.member</name>    
<value>member</value>
</property>  

<property>
<name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>    
<value>cn</value>
</property>
  Note: In addition:
  • If you are using Sentry with Hive, you will also need to add these properties on the HiveServer2 node.
  • If you are using Sentry with Impala, add these properties on all hosts
See Users and Groups in Sentry for more information.