Configuring LDAP Group Mappings
Each host that serves as a node in a Cloudera cluster runs an operating system, such as CentOS or Oracle Linux. At the OS level, user:group accounts created during installation map to the services running on that node of the cluster. The default shell-based group mapping provider, org.apache.hadoop.security.ShellBasedUnixGroupsMapping, handles the mapping from the local host system (the OS) to the specific cluster service, such as HDFS. The hosts authenticate using these local OS accounts before processes are allowed to run on the node.
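As a rough illustration of what a local, OS-level group lookup involves, the following Python sketch resolves a user's groups from passwd-format and group-format text. It does not reproduce how ShellBasedUnixGroupsMapping is implemented; the function name `groups_for_user`, the sample entries, and the numeric IDs are hypothetical.

```python
def groups_for_user(user, passwd_text, group_text):
    """Return the user's primary group followed by its supplementary groups."""
    primary_gid = None
    for line in passwd_text.strip().splitlines():
        # passwd format: name:password:uid:gid:gecos:home:shell
        fields = line.split(":")
        if fields[0] == user:
            primary_gid = fields[3]
            break
    if primary_gid is None:
        raise KeyError(f"unknown user: {user}")

    primary, supplementary = None, []
    for line in group_text.strip().splitlines():
        # group format: name:password:gid:member1,member2,...
        name, _password, gid, members = line.split(":")
        if gid == primary_gid:
            primary = name
        elif user in members.split(","):
            supplementary.append(name)
    return ([primary] if primary else []) + supplementary


# Hypothetical sample data; the uid and gid values are made up.
SAMPLE_PASSWD = "hdfs:x:996:993:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash\n"
SAMPLE_GROUP = "hdfs:x:993:\nhadoop:x:992:hdfs,mapred\nmapred:x:991:\n"

print(groups_for_user("hdfs", SAMPLE_PASSWD, SAMPLE_GROUP))
```

With the sample data, the `hdfs` user resolves to the `hdfs` and `hadoop` groups.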
For clusters integrated with Kerberos for authentication, the hosts must also provide Kerberos tickets before processes can run on the node. The cluster can use the organization's LDAP directory service to provide the login credentials, including Kerberos tickets, to authenticate transparently while the system runs. That means that the local user:group accounts on each host must be mapped to LDAP accounts. To map local user:group accounts to an LDAP service:
- Use tools such as SSSD (System Security Services Daemon) or Centrify Server Suite (see Identity and Access Management for Cloudera).
- Use the Hadoop LdapGroupsMapping group mapping mechanism. LdapGroupsMapping may not be robust enough for large organizations in terms of scalability and manageability, especially organizations that manage identity across multiple systems rather than exclusively for Hadoop clusters. Support for LdapGroupsMapping is not consistent across all operating systems.
- Do not use Winbind to map Linux user:group accounts to Active Directory. It cannot scale, impedes cluster performance, and is not supported.
- Use the same user:group mappings across all cluster nodes, for ease of management.
- Use either Cloudera Manager or the command-line process detailed below.
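The recommendation to keep user:group mappings identical across nodes can be checked with a small comparison. The Python sketch below assumes the mappings have already been collected per host by some other means; the function name `find_inconsistent_users` and the host names are hypothetical.

```python
def find_inconsistent_users(groups_by_host):
    """Return users whose group membership differs between hosts.

    groups_by_host maps host name -> {user name -> list of group names}.
    A user missing from a host counts as an inconsistency.
    """
    by_user = {}
    for host, user_groups in groups_by_host.items():
        for user, groups in user_groups.items():
            by_user.setdefault(user, {})[host] = frozenset(groups)

    all_hosts = sorted(groups_by_host)
    inconsistent = {}
    for user, per_host in by_user.items():
        missing_somewhere = len(per_host) < len(all_hosts)
        differs = len(set(per_host.values())) > 1
        if missing_somewhere or differs:
            inconsistent[user] = {
                host: sorted(per_host.get(host, ())) for host in all_hosts
            }
    return inconsistent


# Hypothetical inventory: node2 lacks the hadoop group for the hdfs user.
inventory = {
    "node1.example.com": {"hdfs": ["hdfs", "hadoop"], "mapred": ["mapred", "hadoop"]},
    "node2.example.com": {"hdfs": ["hdfs"], "mapred": ["hadoop", "mapred"]},
}
print(find_inconsistent_users(inventory))
```

Group order is ignored in the comparison, so only the `hdfs` user is reported for the sample inventory.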
Group mapping in Hadoop requires that the local user:group accounts be mapped to LDAP, so you must create the users and groups for your Hadoop services in LDAP.
To integrate the cluster with an LDAP service, the LDAP directory must contain the user:group relationships. An administrator must create the user accounts and define the groups for the user:group relationships on each host.
The user and group names listed in the table are the default user:group values for CDH services.
|Cloudera Product or Component||User||Group|
|Apache Avro||(No default)||(No default)|
|Apache Flume (sink writing to HDFS needs write privileges)||flume||flume|
|Apache HBase (Master, RegionServer processes)||hbase||hbase|
|Apache HCatalog, WebHCat service||hive||hive|
|Apache Hive (HiveServer2, Hive Metastore)||hive||hive|
|Apache Mahout||(No default)||(No default)|
|Apache Pig||(No default)||(No default)|
|Apache Sqoop2||sqoop2||sqoop, sqoop2|
|Apache Whirr||(No default)||(No default)|
|HDFS (NameNode, DataNodes)||hdfs||hdfs, hadoop|
|Hue Load Balancer (needs apache2 package)||apache||apache|
|Java KeyStore KMS||kms||kms|
|Key Trustee KMS||kms||kms|
|Key Trustee Server||keytrustee||keytrustee|
|MapReduce (JobTracker, TaskTracker)||mapred||mapred, hadoop|
|Parquet||(No default)||(No default)|
- For Sentry with Hive, add these properties on the HiveServer2 node.
- For Sentry with Impala, add these properties to all hosts.
Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Open the Cloudera Manager Admin Console and go to the HDFS service.
- Click the Configuration tab.
- Select .
- Modify the following configuration properties using values from the table below:
|Configuration Property||Value|
|Hadoop User Group Mapping Implementation||org.apache.hadoop.security.LdapGroupsMapping|
|Hadoop User Group Mapping LDAP URL||ldap://<server>|
|Hadoop User Group Mapping LDAP Bind User||Administrator@example.com|
|Hadoop User Group Mapping LDAP Bind User Password||***|
|Hadoop User Group Mapping Search Base||dc=example,dc=com|
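For reference, the property values above can also be expressed as Hadoop configuration properties. The Python sketch below renders them as a `<configuration>` fragment of the kind found in core-site.xml. The `hadoop.security.group.mapping*` keys are the standard Hadoop property names; pairing each one with a Cloudera Manager display name is an assumption made here, and the helper name `render_properties` is hypothetical. The placeholder values (`ldap://<server>`, `***`) are kept as-is and must be replaced with real ones.

```python
import xml.etree.ElementTree as ET

# Assumed pairing of Cloudera Manager display names with Hadoop property keys.
LDAP_GROUP_MAPPING = {
    "hadoop.security.group.mapping": "org.apache.hadoop.security.LdapGroupsMapping",
    "hadoop.security.group.mapping.ldap.url": "ldap://<server>",
    "hadoop.security.group.mapping.ldap.bind.user": "Administrator@example.com",
    "hadoop.security.group.mapping.ldap.bind.password": "***",
    "hadoop.security.group.mapping.ldap.base": "dc=example,dc=com",
}


def render_properties(props):
    """Render a dict of properties as a Hadoop-style <configuration> element."""
    config = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(config)
    return ET.tostring(config, encoding="unicode")


print(render_properties(LDAP_GROUP_MAPPING))
```

Generating the fragment from a single dictionary keeps the values in one place, which can help when the same settings must be applied to several hosts.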