Setting Up Hive Authorization with Sentry
Requirements for Using Sentry for Hive and Impala Authorization
- CDH 4.3.0 or later and, if using Impala, Impala 1.1 or later. Auditing of authentication failures is supported only with CDH 4.4 and Impala 1.1.1 or later.
- HiveServer2 running with strong authentication (Kerberos or LDAP).
- A secure Hadoop cluster.
- The Hive warehouse directory (/user/hive/warehouse or the path you have specified as hive.metastore.warehouse.dir in your Hive configuration) must be owned by the Hive user and group.
- Permissions on the warehouse directory must be set as follows:
  - 770 on the directory itself (for example, /user/hive/warehouse)
  - 770 on all subdirectories (for example, /user/hive/warehouse/mysubdir)
  - All files and directories should be owned by hive:hive
$ sudo -u hdfs hdfs dfs -chmod -R 770 /user/hive/warehouse
$ sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
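One way to confirm the result is to inspect a listing of the warehouse directory. The following is a minimal sketch, not part of Hive or Sentry: the helper name check_warehouse_entry is hypothetical, and it assumes the usual column layout of `hdfs dfs -ls` output (permissions, replication, owner, group, size, date, time, path). It reports whether a single listing line shows mode 770 and hive:hive ownership.

```shell
# Hypothetical helper, not part of Hive or Sentry.
# Checks one line of `hdfs dfs -ls` output against the requirements above.
# Assumed columns: permissions replication owner group size date time path
check_warehouse_entry() {
  local perms repl owner group rest
  # Split the line on whitespace into its leading fields.
  read -r perms repl owner group rest <<EOF
$1
EOF
  # The first character is the entry type (d or -); the next nine are the mode.
  case "$perms" in
    ?rwxrwx---*) ;;
    *) echo "MISMATCH"; return 0 ;;
  esac
  if [ "$owner" = "hive" ] && [ "$group" = "hive" ]; then
    echo "OK"
  else
    echo "MISMATCH"
  fi
}

# Sample listing line (illustrative values only, not real cluster output).
sample='drwxrwx---   - hive hive          0 2013-09-10 12:00 /user/hive/warehouse'
check_warehouse_entry "$sample"
```

On a cluster, the helper could be fed real listing lines, for example the output of `sudo -u hdfs hdfs dfs -ls -d /user/hive/warehouse`, provided that command prints a single entry line in the layout assumed here.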
Configuring Sentry Authorization for Hive
The following instructions assume that the Sentry parcel or package has been installed.
Sentry authorization is not set up automatically by the Cloudera Manager installation or upgrade wizards. To enable authorization, do the following:
- In the Cloudera Manager Admin console, go to the HiveServer2 role configuration, and disable impersonation.
  - From the Admin console, select the Hive service.
  - Under the Configuration menu, select View and Edit.
  - Under the HiveServer2 role group, uncheck the HiveServer2 Enable Impersonation property, and Save Changes.
- Create the policy file sentry-provider.ini as an HDFS file.
For details on the file format, see the Policy file section of the Sentry Guide. The file must be owned by the hive user and the hive group, with permissions set to 640.
By default, Cloudera Manager assumes the file is in /user/hive/sentry. The path is configurable in the Configuration settings for the Hive service: under the Service-Wide category, select Sentry and modify the path in the Sentry Global Policy File property.
The following is an example of a simple policy file:
[groups]
ann=default_admin
bob=sample_reader
joe=admin_role

[roles]
# can read both sample tables
sample_reader = server=server1->db=default->table=sample_07->action=select, \
server=server1->db=default->table=sample_08->action=select
# implies everything on server1, default db
default_admin = server=server1->db=default
# implies everything on server1
admin_role = server=server1
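Once the policy file exists locally, it has to be placed in HDFS with the ownership and permissions described above. The following is a minimal sketch under stated assumptions, not a procedure taken from the Sentry Guide: the helper name install_policy_file and the HDFS and POLICY_DIR variables are hypothetical, the target file name sentry-provider.ini inside /user/hive/sentry is assumed, and the `-mkdir -p` option is assumed to be available in your Hadoop version.

```shell
# Hypothetical helper, not part of Sentry or Cloudera Manager.
# HDFS is the command prefix used to run HDFS shell commands as the hdfs
# superuser; POLICY_DIR is the directory assumed to hold the policy file.
HDFS="${HDFS:-sudo -u hdfs hdfs}"
POLICY_DIR="${POLICY_DIR:-/user/hive/sentry}"

install_policy_file() {
  local src="$1"
  local dest="$POLICY_DIR/sentry-provider.ini"
  # $HDFS is intentionally unquoted so the prefix splits into separate words.
  # Each step runs only if the previous one succeeded.
  $HDFS dfs -mkdir -p "$POLICY_DIR" &&
  $HDFS dfs -put "$src" "$dest" &&
  $HDFS dfs -chown -R hive:hive "$POLICY_DIR" &&
  $HDFS dfs -chmod 640 "$dest"
}
```

With a local policy file in the current directory, a call such as `install_policy_file ./sentry-provider.ini` would run the four commands in order. The local file must be readable by the hdfs user for the upload to succeed, and the upload fails if a file already exists at the destination.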
- Make sure the Hive warehouse directory ownership and permissions are as described in the requirements section above.
- Under the MapReduce service, TaskTracker role group(s) and/or the YARN service NodeManager role group(s), set the Minimum User ID for Job Submission to 0. Note that you must do this for every TaskTracker or NodeManager role group for the MapReduce or YARN service that is associated with Hive, if more than one exists.
  - Select the MapReduce or YARN service and from the Configuration menu select View and Edit.
  - Under a TaskTracker or NodeManager role group go to the Security category.
  - Change the Minimum User ID for Job Submission to zero (the default is 1000) and Save Changes.
  - Do this for each TaskTracker role group or NodeManager role group. (Often there are different role groups for the TaskTracker or NodeManager roles colocated on the system with the JobTracker or ResourceManager roles, vs. TaskTracker or NodeManager roles running on slave nodes.)
  - Restart your MapReduce or YARN service.
- For your Hive service, under its configuration settings, go to the Service-Wide category, Sentry section, check Enable Sentry Authorization, then Save Changes.
- Restart the Hive service.
Enabling Sentry for Impala
To enable Sentry authorization for Impala after completing the configuration steps above:
- Go to the Impala service, and from the Configuration menu select View and Edit.
- Under the Service-Wide category, go to the Sentry section.
- Check Enable Sentry Authorization, then Save Changes.
- Restart the Impala service.