This is the documentation for Cloudera Manager 4.8.3.
Documentation for other versions is available at Cloudera Documentation.

Setting Up Hive Authorization with Sentry

Sentry enables role-based, fine-grained authorization for HiveServer2. It provides classic database-style authorization for Hive and Cloudera Impala. For detailed information about Sentry, see the CDH 4 Sentry Guide.

When using Sentry, you must use HiveServer2 or Impala to access Hive tables. You can also use Hue Beeswax if Beeswax is configured to use HiveServer2. You cannot use the Hive CLI or WebHCat with Sentry.

  Important: If you enable Sentry authorization, enable it for both Hive and Impala, not just one of them.

Requirements for Using Sentry for Hive and Impala Authorization

  Note: To use Sentry with CDH 4.3, you must install Sentry manually before you begin the following procedure; it is not included in the CDH 4.3 parcel or package. Sentry is included with CDH 4.4.0 or later.
  • CDH 4.3.0 or later and, if using Impala, Impala 1.1 or later. Auditing of authentication failures is supported only with CDH 4.4 and Impala 1.1.1 or later.
  • HiveServer2 running with strong authentication (Kerberos or LDAP).
  • A secure Hadoop cluster.
In addition, make sure that the following are true:
  • The Hive warehouse directory (/user/hive/warehouse or the path you have specified as hive.metastore.warehouse.dir in your Hive configuration) must be owned by the Hive user and group.
  • Permissions on the warehouse directory must be set as follows:
    • 770 on the directory itself (for example, /user/hive/warehouse)
    • 770 on all subdirectories (for example, /user/hive/warehouse/mysubdir)
    • All files and directories should be owned by hive:hive
    For example:
    $ sudo -u hdfs hdfs dfs -chmod -R 770 /user/hive/warehouse
    $ sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
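
To confirm the result, the directory listing can be checked with a small helper. The following is a minimal sketch, not part of the Cloudera tooling: the function name check_warehouse_listing is hypothetical, and it assumes the usual `hdfs dfs -ls` column layout (mode, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# Hypothetical helper: reads `hdfs dfs -ls` style lines on stdin and prints
# every entry that is not owned by hive:hive, plus every directory whose
# mode is not 770 (drwxrwx---). Exits non-zero if anything is reported.
check_warehouse_listing() {
  awk '
    BEGIN { bad = 0 }
    /^Found / { next }          # skip the "Found N items" header line
    NF >= 8 {
      mode = $1; owner = $3; group = $4; path = $8
      wrong_owner = (owner != "hive" || group != "hive")
      wrong_mode  = (substr(mode, 1, 1) == "d" && mode != "drwxrwx---")
      if (wrong_owner || wrong_mode) {
        print "BAD " path " (" mode " " owner ":" group ")"
        bad = 1
      }
    }
    END { exit bad }
  '
}

# Example use on a live cluster (assumes the default warehouse path):
#   sudo -u hdfs hdfs dfs -ls -d /user/hive/warehouse | check_warehouse_listing
#   sudo -u hdfs hdfs dfs -ls -R /user/hive/warehouse | check_warehouse_listing
```

No output and a zero exit status mean that every listed entry matches the ownership and directory permissions described above.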

Configuring Sentry Authorization for Hive

The following instructions assume that the Sentry parcel or package has been installed.

  Note: To enable Sentry for Impala, follow the steps below, then check Enable Sentry Authorization under the Impala configuration settings.
Sentry authorization is not set up automatically by the Cloudera Manager installation or upgrade wizards. To enable authorization, do the following:
  1. In the Cloudera Manager Admin console, go to the HiveServer2 role configuration, and disable impersonation.
    1. Go to the Hive service.
    2. Select Configuration > View and Edit.
    3. Under the HiveServer2 role group, uncheck the HiveServer2 Enable Impersonation property, and Save Changes.
  2. Create the policy file sentry-provider.ini as an HDFS file.

    Read the information in the Sentry Guide, specifically the section on the policy file. The file must be owned by the hive user and the hive group, with permissions set to 640.

    By default, Cloudera Manager assumes the file is in /user/hive/sentry. The path is configurable in the Configuration settings for the Hive service: under the Service-Wide category, select Sentry and modify the path in the Sentry Global Policy File property.

    The following is an example of a simple policy file:

    [groups]
    ann=default_admin
    bob=sample_reader
    joe=admin_role
    [roles]
    # can read both sample tables
    sample_reader = server=server1->db=default->table=sample_07->action=select, \
    server=server1->db=default->table=sample_08->action=select
    # implies everything on server1, default db
    default_admin = server=server1->db=default
    # implies everything on server1
    admin_role = server=server1
  3. Make sure the Hive warehouse directory ownership and permissions are as described in the requirements section above.
  4. Under the MapReduce service, TaskTracker role group(s) and/or the YARN service NodeManager role group(s), set the Minimum User ID for Job Submission to 0. Note that you must do this for every TaskTracker or NodeManager role group for the MapReduce or YARN service that is associated with Hive, if more than one exists.
    1. Select the MapReduce or YARN service and from the Configuration menu select View and Edit.
    2. Under a TaskTracker or NodeManager role group go to the Security category.
    3. Change the Minimum User ID for Job Submission to zero (the default is 1000) and Save Changes.
    4. Do this for each TaskTracker role group or NodeManager role group. (Often there are different role groups for the TaskTracker or NodeManager roles colocated on the system with the JobTracker or ResourceManager roles, vs. TaskTracker or NodeManager roles running on slave nodes.)
  5. Restart your MapReduce or YARN service.
  6. In the configuration settings for your Hive service, go to the Service-Wide category, Sentry section, check Enable Sentry Authorization, and Save Changes.
  7. Restart the Hive service.
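
Before uploading the policy file from step 2, it can help to check locally that every role referenced in [groups] is defined in [roles]. The following is a rough sketch under stated assumptions, not a Sentry tool: the function names undefined_roles and upload_policy are hypothetical, the parser handles only the simple layout shown in the example policy file (including backslash line continuations), and the upload helper assumes that /user/hive/sentry already exists and that the hdfs user can read the local file.

```shell
# Hypothetical helper: prints each role that is referenced in [groups]
# but has no definition in [roles]. No output means nothing is missing.
undefined_roles() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }            # comments and blank lines
    /^[ \t]*\[/ {                                # section header, e.g. [roles]
      sub(/^[ \t]*\[/, ""); sub(/\].*$/, "")
      section = $0; next
    }
    cont { cont = ($0 ~ /\\[ \t]*$/); next }     # continuation of a value
    {
      cont = ($0 ~ /\\[ \t]*$/)                  # does this line continue?
      eq = index($0, "=")
      if (eq == 0) next
      key = substr($0, 1, eq - 1); val = substr($0, eq + 1)
      gsub(/[ \t]/, "", key)
      if (section == "roles") defined[key] = 1
      if (section == "groups") {
        sub(/\\[ \t]*$/, "", val); gsub(/[ \t]/, "", val)
        n = split(val, r, ",")
        for (i = 1; i <= n; i++) if (r[i] != "") used[r[i]] = 1
      }
    }
    END { for (role in used) if (!(role in defined)) print role }
  ' "$1"
}

# Hypothetical helper: copies the local policy file to the default location
# and applies the ownership and permissions required in step 2.
# Needs a live cluster; it is defined here but not called.
upload_policy() {
  sudo -u hdfs hdfs dfs -put "$1" /user/hive/sentry/sentry-provider.ini
  sudo -u hdfs hdfs dfs -chown hive:hive /user/hive/sentry/sentry-provider.ini
  sudo -u hdfs hdfs dfs -chmod 640 /user/hive/sentry/sentry-provider.ini
}
```

A possible sequence is to run undefined_roles sentry-provider.ini, fix anything it prints, and then run upload_policy sentry-provider.ini. If the Sentry Global Policy File property points somewhere other than the default, the paths in upload_policy must change to match.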

Configuring Group Access to the Hive Metastore

You can configure the Hive Metastore to reject connections from users not listed in the Hive group proxy list. If you don't configure this override, the Hive Metastore will use the value in the core-site HDFS configuration. To configure the Hive group proxy list:
  1. Go to the Hive service.
  2. Select Configuration > View and Edit.
  3. Click the Proxy category.
  4. In the Hive Metastore Access Control and Proxy User Groups Override property, specify a list of groups whose users are allowed to access the Hive Metastore. Unless you specify "*" (wildcard), you will be warned if the list does not include hive and, if the Impala service is configured, impala.
  5. Click Save Changes.
  6. Restart the Hive service.
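
The group list from step 4 can be checked before it is entered, mirroring the warning described there. This is only an illustrative sketch: the function name missing_proxy_groups is hypothetical, and it assumes the property value is a comma-separated list of group names.

```shell
# Hypothetical helper: the first argument is the comma-separated group list
# intended for the override property; the remaining arguments are groups
# that must be present. Prints each required group that is missing.
# A "*" entry allows every group, so nothing is reported in that case.
missing_proxy_groups() {
  list=$(printf '%s' "$1" | tr -d ' \t')   # ignore spaces around commas
  shift
  case ",$list," in
    *",*,"*) return 0 ;;                   # wildcard present
  esac
  for g in "$@"; do
    case ",$list," in
      *",$g,"*) ;;                         # required group is listed
      *) echo "$g" ;;                      # required group is missing
    esac
  done
}

# Example: with the Impala service configured, both hive and impala are expected.
#   missing_proxy_groups 'hive, analysts' hive impala
```

On a cluster without the Impala service, pass only hive as the required group.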

Enabling Sentry for Impala

To enable Sentry authorization for Impala after completing the configuration steps above:
  1. Go to the Impala service.
  2. Select Configuration > View and Edit.
  3. Under the Service-Wide category, go to the Sentry section.
  4. Check Enable Sentry Authorization, then Save Changes.
  5. Restart the Impala service.