Hadoop Users in Cloudera Manager and CDH

A number of special users are created by default when you install and use CDH and Cloudera Manager. The tables below list these users and groups as of the latest release, together with the corresponding Kerberos principals and keytab files that should be created when you configure Kerberos security on your cluster.

Users and Groups

| Component (Version) | Unix User ID | Groups | Notes |
| --- | --- | --- | --- |
| Cloudera Manager (all versions) | cloudera-scm | cloudera-scm | Cloudera Manager processes such as the Cloudera Manager Server and the monitoring roles run as this user. The Cloudera Manager keytab file must be named cmf.keytab since that name is hard-coded in Cloudera Manager. |
| Apache Accumulo (Accumulo 1.4.3 and higher) | accumulo | accumulo | Accumulo processes run as this user. |
| Apache Avro | | | No special users. |
| Apache Flume (CDH 4, CDH 5) | flume | flume | The sink that writes to HDFS does so as this user, so the user must have write privileges. |
| Apache HBase (CDH 4, CDH 5) | hbase | hbase | The Master and the RegionServer processes run as this user. |
| HDFS (CDH 4, CDH 5) | hdfs | hdfs, hadoop | The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it. |
| Apache Hive (CDH 4, CDH 5) | hive | hive | The HiveServer2 process and the Hive Metastore processes run as this user. A user must also be defined for Hive to access its Metastore database (for example, MySQL or PostgreSQL); it can be any identifier and does not correspond to a Unix uid. It is set with javax.jdo.option.ConnectionUserName in hive-site.xml. |
| Apache HCatalog (CDH 4.2 and higher, CDH 5) | hive | hive | The WebHCat service (for REST access to Hive functionality) runs as the hive user. |
| HttpFS (CDH 4, CDH 5) | httpfs | httpfs | The HttpFS service runs as this user. See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file. |
| Hue (CDH 4, CDH 5) | hue | hue | Hue services run as this user. |
| Hue Load Balancer (Cloudera Manager 5.5 and higher) | apache | apache | The Hue Load Balancer depends on the apache2 package, which uses the apache user name. Cloudera Manager does not run processes using this user ID. |
| Cloudera Impala (CDH 4.1 and higher, CDH 5) | impala | impala, hive | Impala services run as this user. |
| Apache Kafka (Cloudera Distribution of Kafka 1.2.0) | kafka | kafka | Kafka services run as this user. |
| Java KeyStore KMS (CDH 5.2.1 and higher) | kms | kms | The Java KeyStore KMS service runs as this user. |
| Key Trustee KMS (CDH 5.3 and higher) | kms | kms | The Key Trustee KMS service runs as this user. |
| Key Trustee Server (CDH 5.4 and higher) | keytrustee | keytrustee | The Key Trustee Server service runs as this user. |
| Kudu | kudu | kudu | Kudu services run as this user. |
| Llama (CDH 5) | llama | llama | Llama runs as this user. |
| Apache Mahout | | | No special users. |
| MapReduce (CDH 4, CDH 5) | mapred | mapred, hadoop | Without Kerberos, the JobTracker and tasks run as this user. With Kerberos, the LinuxTaskController binary is owned by this user. |
| Apache Oozie (CDH 4, CDH 5) | oozie | oozie | The Oozie service runs as this user. |
| Parquet | | | No special users. |
| Apache Pig | | | No special users. |
| Cloudera Search (CDH 4.3 and higher, CDH 5) | solr | solr | The Solr processes run as this user. |
| Apache Spark (CDH 5) | spark | spark | The Spark History Server process runs as this user. |
| Apache Sentry (incubating) (CDH 5.1 and higher) | sentry | sentry | The Sentry service runs as this user. |
| Apache Sqoop (CDH 4, CDH 5) | sqoop | sqoop | This user is only for the Sqoop1 Metastore, a configuration option that is not recommended. |
| Apache Sqoop2 (CDH 4.2 and higher, CDH 5) | sqoop2 | sqoop, sqoop2 | The Sqoop2 service runs as this user. |
| Apache Whirr | | | No special users. |
| YARN (CDH 4, CDH 5) | yarn | yarn, hadoop | Without Kerberos, all YARN services and applications run as this user. With Kerberos, the LinuxContainerExecutor binary is owned by this user. |
| Apache ZooKeeper (CDH 4, CDH 5) | zookeeper | zookeeper | The ZooKeeper processes run as this user. The user name is not configurable. |
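
To check that the expected accounts exist on a host, the user and group columns above can be compared with the host's account database. The following is a minimal sketch, assuming a Unix host where Python's standard pwd and grp modules can see the accounts. The EXPECTED_GROUPS dictionary copies only a few rows from the table, and the helper names groups_of, missing_groups, and audit are hypothetical; they are not part of CDH or Cloudera Manager.

```python
import grp
import pwd

# A few rows copied from the table above: Unix user -> expected groups.
# Extend this (illustrative) mapping with the components you actually run.
EXPECTED_GROUPS = {
    "hdfs": {"hdfs", "hadoop"},
    "mapred": {"mapred", "hadoop"},
    "yarn": {"yarn", "hadoop"},
    "impala": {"impala", "hive"},
    "sqoop2": {"sqoop", "sqoop2"},
}


def groups_of(user):
    """Return the set of group names for a user, or None if the user is absent."""
    try:
        entry = pwd.getpwnam(user)
    except KeyError:
        return None
    # Supplementary groups: every group that lists the user as a member.
    names = {g.gr_name for g in grp.getgrall() if user in g.gr_mem}
    # Primary group: referenced by gid in the passwd entry.
    try:
        names.add(grp.getgrgid(entry.pw_gid).gr_name)
    except KeyError:
        pass
    return names


def missing_groups(expected, actual):
    """Return the expected groups that are not present in actual."""
    if actual is None:  # user does not exist, so every group is missing
        return set(expected)
    return set(expected) - set(actual)


def audit(expected_groups=EXPECTED_GROUPS, lookup=groups_of):
    """Map each user with a problem to the sorted list of groups it lacks."""
    report = {}
    for user, expected in expected_groups.items():
        missing = missing_groups(expected, lookup(user))
        if missing:
            report[user] = sorted(missing)
    return report


if __name__ == "__main__":
    for user, missing in audit().items():
        print(f"{user}: missing or not a member of {', '.join(missing)}")
```

The lookup parameter exists only so that the comparison can be exercised without real accounts; on a cluster host the default groups_of is used. A user that is absent is reported as lacking all of its expected groups.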

Keytabs and Keytab File Permissions

Clusters Managed by Cloudera Manager

| Component (Unix User ID) | Service | Kerberos Principals | Filename (*.keytab) | Keytab File Owner | Keytab File Group | File Permission (octal) |
| --- | --- | --- | --- | --- | --- | --- |
| Cloudera Manager (cloudera-scm) | NA | cloudera-scm | cmf | cloudera-scm | cloudera-scm | 600 |
| Cloudera Management Service (cloudera-scm) | cloudera-mgmt-REPORTSMANAGER | cloudera-scm | hdfs | cloudera-scm | cloudera-scm | 600 |
| | cloudera-mgmt-ACTIVITYMONITOR | | | | | |
| | cloudera-mgmt-SERVICEMONITOR | | | | | |
| | cloudera-mgmt-HOSTMONITOR | | | | | |
| Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MASTER | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |
| | accumulo16-ACCUMULO16_TRACER | | | | | |
| | accumulo16-ACCUMULO16_MONITOR | | | | | |
| | accumulo16-ACCUMULO16_GC | | | | | |
| | accumulo16-ACCUMULO16_TSERVER | | | | | |
| Flume (flume) | flume-AGENT | flume | flume | cloudera-scm | cloudera-scm | 600 |
| HBase (hbase) | hbase-HBASETHRIFTSERVER | HTTP | HTTP | cloudera-scm | cloudera-scm | 600 |
| | hbase-REGIONSERVER | hbase | hbase | | | |
| | hbase-HBASERESTSERVER | | | | | |
| | hbase-MASTER | | | | | |
| HDFS (hdfs) | hdfs-NAMENODE | hdfs, HTTP | hdfs | cloudera-scm | cloudera-scm | 600 |
| | hdfs-DATANODE | | | | | |
| | hdfs-SECONDARYNAMENODE | | | | | |
| Hive (hive) | hive-HIVESERVER2 | hive | hive | cloudera-scm | cloudera-scm | 600 |
| | hive-WEBHCAT | HTTP | HTTP | | | |
| | hive-HIVEMETASTORE | hive | hive | | | |
| HttpFS (httpfs) | hdfs-HTTPFS | httpfs | httpfs | cloudera-scm | cloudera-scm | 600 |
| Hue (hue) | hue-KT_RENEWER | hue | hue | cloudera-scm | cloudera-scm | 600 |
| Impala (impala) | impala-STATESTORE | impala | impala | cloudera-scm | cloudera-scm | 600 |
| | impala-CATALOGSERVER | | | | | |
| | impala-IMPALAD | | | | | |
| Java KeyStore KMS (kms) | kms-KMS | HTTP | kms | cloudera-scm | cloudera-scm | 600 |
| Apache Kafka (kafka) | kafka-KAFKA_BROKER | kafka | kafka | kafka | kafka | 600 |
| Key Trustee KMS (kms) | keytrustee-KMS_KEYTRUSTEE | HTTP | keytrustee | cloudera-scm | cloudera-scm | 600 |
| Llama (llama) | impala-LLAMA | llama, HTTP | llama | cloudera-scm | cloudera-scm | 600 |
| MapReduce (mapred) | mapreduce-JOBTRACKER | mapred, HTTP | mapred | cloudera-scm | cloudera-scm | 600 |
| | mapreduce-TASKTRACKER | | | | | |
| Oozie (oozie) | oozie-OOZIE_SERVER | oozie, HTTP | oozie | cloudera-scm | cloudera-scm | 600 |
| Search (solr) | solr-SOLR_SERVER | solr, HTTP | solr | cloudera-scm | cloudera-scm | 600 |
| Sentry (sentry) | sentry-SENTRY_SERVER | sentry | sentry | cloudera-scm | cloudera-scm | 600 |
| Spark (spark) | spark_on_yarn-SPARK_YARN_HISTORY_SERVER | spark | spark | cloudera-scm | cloudera-scm | 600 |
| YARN (yarn) | yarn-NODEMANAGER | yarn, HTTP | yarn | cloudera-scm | cloudera-scm | 644 |
| | yarn-RESOURCEMANAGER | | | | | 600 |
| | yarn-JOBHISTORY | | | | | 600 |
| ZooKeeper (zookeeper) | zookeeper-server | zookeeper | zookeeper | cloudera-scm | cloudera-scm | 600 |
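
The Kerberos Principals column above gives only the short (primary) name of each principal. As an illustration of how those short names relate to full principals, the sketch below expands them using the usual Kerberos service principal form primary/instance@REALM. It assumes the instance is the fully qualified host name, which this page does not state; the host and realm values in the usage line are placeholders, and SHORT_NAMES and full_principals are hypothetical names introduced for this example.

```python
# Short principal names copied from a few rows of the table above.
SHORT_NAMES = {
    "hdfs": ["hdfs", "HTTP"],
    "mapred": ["mapred", "HTTP"],
    "yarn": ["yarn", "HTTP"],
    "oozie": ["oozie", "HTTP"],
    "zookeeper": ["zookeeper"],
}


def full_principals(component, host, realm):
    """Expand a component's short names to primary/instance@REALM strings.

    Assumption: the instance part is the host's fully qualified domain name.
    """
    if not host or not realm:
        raise ValueError("host and realm must be non-empty")
    return [f"{name}/{host}@{realm}" for name in SHORT_NAMES[component]]


if __name__ == "__main__":
    # Placeholder host and realm; substitute your own values.
    for principal in full_principals("hdfs", "node1.example.com", "EXAMPLE.COM"):
        print(principal)
```

A component that lists two short names, such as HDFS with hdfs and HTTP, yields two principals per host, both of which would be stored in the single keytab file named in the Filename column.
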

CDH Clusters Not Managed by Cloudera Manager

| Component (Unix User ID) | Service | Kerberos Principals | Filename (*.keytab) | Keytab File Owner | Keytab File Group | File Permission (octal) |
| --- | --- | --- | --- | --- | --- | --- |
| Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MASTER | accumulo | accumulo16 | accumulo | accumulo | 600 |
| | accumulo16-ACCUMULO16_TRACER | | | | | |
| | accumulo16-ACCUMULO16_MONITOR | | | | | |
| | accumulo16-ACCUMULO16_GC | | | | | |
| | accumulo16-ACCUMULO16_TSERVER | | | | | |
| Flume (flume) | flume-AGENT | flume | flume | flume | flume | 600 |
| HBase (hbase) | hbase-HBASETHRIFTSERVER | HTTP | HTTP | hbase | hbase | 600 |
| | hbase-REGIONSERVER | hbase | hbase | | | |
| | hbase-HBASERESTSERVER | | | | | |
| | hbase-MASTER | | | | | |
| HDFS (hdfs) | hdfs-NAMENODE | hdfs, HTTP | hdfs | hdfs | hdfs | 600 |
| | hdfs-DATANODE | | | | | |
| | hdfs-SECONDARYNAMENODE | | | | | |
| Hive (hive) | hive-HIVESERVER2 | hive | hive | hive | hive | 600 |
| | hive-WEBHCAT | HTTP | HTTP | | | |
| | hive-HIVEMETASTORE | hive | hive | | | |
| HttpFS (httpfs) | hdfs-HTTPFS | httpfs | httpfs | httpfs | httpfs | 600 |
| Hue (hue) | hue-KT_RENEWER | hue | hue | hue | hue | 600 |
| Impala (impala) | impala-STATESTORE | impala | impala | impala | impala | 600 |
| | impala-CATALOGSERVER | | | | | |
| | impala-IMPALAD | | | | | |
| Llama (llama) | impala-LLAMA | llama, HTTP | llama | llama | llama | 600 |
| Java KeyStore KMS (kms) | kms-KMS | HTTP | kms | kms | kms | 600 |
| Apache Kafka (kafka) | kafka-KAFKA_BROKER | kafka | kafka | kafka | kafka | 600 |
| Key Trustee KMS (kms) | kms-KEYTRUSTEE | HTTP | kms | kms | kms | 600 |
| MapReduce (mapred) | mapreduce-JOBTRACKER | mapred, HTTP | mapred | mapred | hadoop | 600 |
| | mapreduce-TASKTRACKER | | | | | |
| Oozie (oozie) | oozie-OOZIE_SERVER | oozie, HTTP | oozie | oozie | oozie | 600 |
| Search (solr) | solr-SOLR_SERVER | solr, HTTP | solr | solr | solr | 600 |
| Sentry (sentry) | sentry-SENTRY_SERVER | sentry | sentry | sentry | sentry | 600 |
| Spark (spark) | spark_on_yarn-SPARK_YARN_HISTORY_SERVER | spark | spark | spark | spark | 600 |
| YARN (yarn) | yarn-NODEMANAGER | yarn, HTTP | yarn | yarn | hadoop | 644 |
| | yarn-RESOURCEMANAGER | | | | | 600 |
| | yarn-JOBHISTORY | | | | | 600 |
| ZooKeeper (zookeeper) | zookeeper-server | zookeeper | zookeeper | zookeeper | zookeeper | 600 |
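
The owner, group, and permission columns above can be verified on disk with a short script. The following is a minimal sketch, assuming a Unix host and Python's standard os, stat, pwd, and grp modules. The function names keytab_status and check_keytab are hypothetical, and the path in the usage line is a placeholder, since this page does not say where keytab files are stored.

```python
import grp
import os
import pwd
import stat


def keytab_status(path):
    """Return (owner name, group name, permission bits) for a file."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:  # uid has no passwd entry; fall back to the number
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:  # gid has no group entry; fall back to the number
        group = str(st.st_gid)
    return owner, group, stat.S_IMODE(st.st_mode)


def check_keytab(path, owner, group, mode="600"):
    """Return a list of mismatches against the expected table values.

    mode is the octal string from the File Permission column.
    An empty list means the file matches on all three counts.
    """
    actual_owner, actual_group, actual_mode = keytab_status(path)
    problems = []
    if actual_owner != owner:
        problems.append(f"owner is {actual_owner}, expected {owner}")
    if actual_group != group:
        problems.append(f"group is {actual_group}, expected {group}")
    if actual_mode != int(mode, 8):
        problems.append(f"mode is {actual_mode:03o}, expected {mode}")
    return problems


if __name__ == "__main__":
    # Placeholder path; values taken from the HDFS row of the table above.
    for problem in check_keytab("/path/to/hdfs.keytab", "hdfs", "hdfs", "600"):
        print(problem)
```

Note that the YARN NodeManager row is the one case that expects 644 rather than 600, so the mode argument should be passed per row instead of being hard-coded.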