Enabling Sentry Authorization for Search using the Command Line

Sentry enables role-based, fine-grained authorization for Cloudera Search. Sentry can apply a range of restrictions to various tasks, such as accessing data, managing configurations through config objects, or creating collections. Restrictions are applied consistently, regardless of how users attempt to complete actions. For example, restricting access to data in a collection restricts that access whether queries come from the command line, a browser, Hue, or the admin console.

  • You can use either Cloudera Manager or the following command-line instructions to complete this configuration.
  • This information applies specifically to CDH 5.7.x. If you use an earlier version of CDH, see the documentation for that version located at Cloudera Documentation.

For information on enabling Sentry authorization using Cloudera Manager, see Configuring Sentry Policy File Authorization Using Cloudera Manager.

Follow the instructions below to configure Sentry under CDH 4.5 or higher or CDH 5. Sentry is included in the Search installation.

This document describes configuring Sentry for Cloudera Search. For information about alternate ways to configure Sentry or for information about installing Sentry for other services, see:

Using Roles and Privileges with Sentry

Sentry uses a role-based privilege model. A role is a set of rules for accessing a given Solr collection or Solr config. Access to each collection is governed by three privileges: Query, Update, and *. The wildcard (*) indicates all privileges. In contrast, access to each config is governed by a single privilege, *, meaning all privileges.

  • A rule for the Query privilege on a collection called logs would be formulated as follows:
    collection=logs->action=Query
  • A rule for the * privilege, meaning all privileges, on the config called myConfig would be formulated as follows:
    config=myConfig->action=*

    Omitting the action implies *, and * is the only valid action for a config. Because config objects only support *, the following config privilege is invalid:

    config=myConfig->action=Update
Note that config objects cannot be combined with collection objects in a single privilege. For example, the following combinations are illegal:
  • config=myConfig->collection=myCollection->action=*
  • collection=myCollection->config=myConfig
You may specify these privileges separately. For example:
myRole = collection=myCollection->action=QUERY, config=myConfig->action=*
A role can contain multiple such rules, separated by commas. For example, the engineer_role might contain the Query privilege for the hive_logs and hbase_logs collections, and the Update privilege for the current_bugs collection. You would specify this as follows:
engineer_role = collection=hive_logs->action=Query, collection=hbase_logs->action=Query, collection=current_bugs->action=Update
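
The privilege rules above can be sketched as a small validator. This is an illustrative Python sketch, not Sentry's own parser: the function names parse_privilege and parse_role are hypothetical, and treating action names case-insensitively is an assumption based on the QUERY and Query spellings that both appear in the examples.

```python
# Illustrative sketch only; Sentry's real parser may differ in edge cases.
VALID_COLLECTION_ACTIONS = {"query", "update", "*"}


def parse_privilege(rule):
    """Parse one privilege such as 'collection=logs->action=Query'.

    Returns (object_type, object_name, action) with the action lower-cased,
    or raises ValueError if the rule breaks one of the documented constraints.
    """
    parts = {}
    for segment in rule.split("->"):
        key, sep, value = segment.partition("=")
        key, value = key.strip().lower(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"malformed segment: {segment!r}")
        if key in parts:
            raise ValueError(f"duplicate key: {key!r}")
        parts[key] = value

    # config and collection objects cannot be combined in a single privilege.
    if "collection" in parts and "config" in parts:
        raise ValueError("config and collection cannot be combined")

    # Omitting the action implies '*'.
    action = parts.pop("action", "*").lower()
    if len(parts) != 1:
        raise ValueError("expected exactly one collection or config object")
    (object_type, name), = parts.items()

    if object_type == "config":
        if action != "*":
            raise ValueError("config objects only support the * action")
    elif object_type == "collection":
        if action not in VALID_COLLECTION_ACTIONS:
            raise ValueError(f"unknown collection action: {action!r}")
    else:
        raise ValueError(f"unknown object type: {object_type!r}")
    return object_type, name, action


def parse_role(line):
    """Parse 'role = privilege, privilege, ...' into (role, [privileges])."""
    role, sep, rules = line.partition("=")
    if not sep:
        raise ValueError(f"not a role definition: {line!r}")
    return role.strip(), [parse_privilege(r) for r in rules.split(",")]
```

Under these assumptions, parse_privilege("config=myConfig->action=Update") raises ValueError, while parse_role applied to the engineer_role line above yields three collection privileges.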

Using Users and Groups with Sentry

  • A user is an entity that is permitted by the Kerberos authentication system to access the Search service.
  • A group connects the authentication system with the authorization system. It is a set of one or more users who have been granted one or more authorization roles. Sentry allows a set of roles to be configured for a group.
  • A configured group provider determines a user’s affiliation with a group. The current release supports HDFS-backed groups and locally configured groups. Roles are assigned to a group in the policy file. For example:
    dev_ops = dev_role, ops_role

Here the group dev_ops is granted the roles dev_role and ops_role. The members of this group can complete searches that are allowed by these roles.
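
The chain from group to role to privilege can be modeled in a few lines of Python. This is a minimal sketch of the idea rather than Sentry's actual evaluation logic: the is_allowed function is hypothetical, and the privileges attached to dev_role and ops_role are invented for illustration, because the text above does not define them.

```python
# Group-to-role mapping taken from the example above.
GROUPS = {"dev_ops": ["dev_role", "ops_role"]}

# Hypothetical privileges for each role, as (object_type, name, action).
ROLES = {
    "dev_role": [("collection", "source_code", "*")],
    "ops_role": [("collection", "hive_logs", "query")],
}


def is_allowed(user_groups, collection, action, groups=GROUPS, roles=ROLES):
    """Return True if any role of any of the user's groups grants the action."""
    wanted = action.lower()
    for group in user_groups:
        for role in groups.get(group, []):
            for object_type, name, granted in roles.get(role, []):
                if (
                    object_type == "collection"
                    and name == collection
                    and granted in ("*", wanted)
                ):
                    return True
    return False
```

With these assumed privileges, a member of dev_ops may query hive_logs through ops_role but may not update it, and may do anything on source_code through dev_role.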

User to Group Mapping

You can configure Sentry to use either Hadoop groups or groups defined in the policy file.

To configure Hadoop groups:

Set the sentry.provider property in sentry-site.xml to org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider.

By default, this uses local shell groups. See the Group Mapping section of the HDFS Permissions Guide for more information.

In this case, Sentry uses the Hadoop configuration described in Configuring LDAP Group Mappings. Cloudera Manager automatically uses this configuration. In a deployment not managed by Cloudera Manager, manually set these configuration parameters in the hadoop-conf file that is passed to Solr.

OR

To configure local groups:

  1. Define local groups in a [users] section of the Sentry Policy file. For example:
    [users]
    user1 = group1, group2, group3
    user2 = group2, group3
  2. In sentry-site.xml, set sentry.provider as follows:
    <property>
        <name>sentry.provider</name>
        <value>org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider</value>
    </property>
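
To check what a [users] section maps to before deploying it, the section can be read with a few lines of Python. This is a sketch under stated assumptions: it uses Python's standard configparser as a stand-in, which is not the parser Sentry uses and may treat edge cases differently, and load_local_groups is a hypothetical helper name.

```python
import configparser

# The [users] section from step 1 above.
POLICY = """\
[users]
user1 = group1, group2, group3
user2 = group2, group3
"""


def load_local_groups(policy_text):
    """Return {user: [groups]} from the [users] section of a policy file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of user names as written
    parser.read_string(policy_text)
    if not parser.has_section("users"):
        return {}
    return {
        user: [g.strip() for g in groups.split(",") if g.strip()]
        for user, groups in parser.items("users")
    }
```

Calling load_local_groups(POLICY) returns user1 mapped to group1, group2, and group3, and user2 mapped to group2 and group3.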

Using Policy Files with Sentry

The sections that follow contain notes on creating and maintaining the policy file.

Storing the Policy File

Considerations for storing the policy file(s) include:

  1. Replication count - Because the file is read for each query, increase its HDFS replication count; 10 is a reasonable value.
  2. Updating the file - Updates to the file are only reflected when the Solr process is restarted.
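
One way to raise the replication count is the HDFS shell's setrep command. The Python sketch below only wraps that command; it assumes the hdfs client is on the PATH and that the caller has permission to change the file, and the helper names setrep_command and raise_policy_replication are hypothetical.

```python
import subprocess


def setrep_command(policy_path, replication=10):
    """Build the 'hdfs dfs -setrep' argument list for the policy file."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return ["hdfs", "dfs", "-setrep", str(replication), policy_path]


def raise_policy_replication(policy_path, replication=10):
    """Run the command; raises CalledProcessError if hdfs reports a failure."""
    subprocess.run(setrep_command(policy_path, replication), check=True)
```

For a policy file stored at /user/solr/sentry/sentry-provider.ini, setrep_command produces the equivalent of running hdfs dfs -setrep 10 /user/solr/sentry/sentry-provider.ini.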

Defining Roles

Keep in mind that role definitions are not cumulative; the newer definition replaces the older one. For example, the following results in role1 having privilege2, not privilege1 and privilege2.
role1 = privilege1
role1 = privilege2
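
Because a repeated role name silently discards the earlier definition, it can help to scan a policy file for duplicates before deploying it. The following Python is a minimal sketch of such a check; find_redefined_roles is a hypothetical helper, and the line-based parsing is a simplification rather than Sentry's own file handling.

```python
def find_redefined_roles(policy_text):
    """Return {role: [line numbers]} for roles defined more than once."""
    seen = {}
    section = None
    for lineno, raw in enumerate(policy_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "roles" and "=" in line:
            role = line.split("=", 1)[0].strip()
            seen.setdefault(role, []).append(lineno)
    return {role: lines for role, lines in seen.items() if len(lines) > 1}


POLICY = """\
[roles]
role1 = privilege1
role1 = privilege2
"""
```

Here find_redefined_roles(POLICY) reports role1 on lines 2 and 3. To give role1 both privileges, list them in one definition separated by commas: role1 = privilege1, privilege2.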

Sample Sentry Configuration

This section provides a sample configuration.

Policy File

The following is an example of a CDH Search policy file. The sentry-provider.ini would exist in an HDFS location such as hdfs://ha-nn-uri/user/solr/sentry/sentry-provider.ini. This location must be readable by Solr.

sentry-provider.ini

[groups]
# Assigns each Hadoop group to its set of roles
engineer = engineer_role
ops = ops_role
dev_ops = engineer_role, ops_role
hbase_admin = hbase_admin_role

[roles]
# The following grants all access to source_code.
# "collection = source_code" can also be used as syntactic
# sugar for "collection = source_code->action=*"
engineer_role = collection = source_code->action=*

# The following imply more restricted access.
ops_role = collection = hive_logs->action=Query
dev_ops_role = collection = hbase_logs->action=Query

# Give hbase_admin_role the ability to create/delete/modify the hbase_logs collection
# as well as to update the config for the hbase_logs collection, called hbase_logs_config.
hbase_admin_role = collection=admin->action=*, collection=hbase_logs->action=*, config=hbase_logs_config->action=*
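
A policy file like the sample above can be cross-checked for roles that are defined but never assigned to a group, and for groups that reference undefined roles. The Python below is an illustrative sketch: read_sections and check_role_references are hypothetical helpers, and the simple line-based parsing is an assumption, not Sentry's parser.

```python
# The sample policy above, with comments removed for brevity.
SAMPLE_POLICY = """\
[groups]
engineer = engineer_role
ops = ops_role
dev_ops = engineer_role, ops_role
hbase_admin = hbase_admin_role

[roles]
engineer_role = collection = source_code->action=*
ops_role = collection = hive_logs->action=Query
dev_ops_role = collection = hbase_logs->action=Query
hbase_admin_role = collection=admin->action=*, collection=hbase_logs->action=*, config=hbase_logs_config->action=*
"""


def read_sections(text):
    """Split a policy file into {section: {key: raw value}}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1].strip(), {})
            continue
        if current is None or "=" not in line:
            raise ValueError(f"unexpected line: {raw!r}")
        key, value = line.split("=", 1)  # split on the first '=' only
        current[key.strip()] = value.strip()
    return sections


def check_role_references(text):
    """Compare roles defined in [roles] with roles assigned in [groups]."""
    sections = read_sections(text)
    defined = set(sections.get("roles", {}))
    assigned = {
        role.strip()
        for roles in sections.get("groups", {}).values()
        for role in roles.split(",")
        if role.strip()
    }
    return {
        "unassigned": sorted(defined - assigned),
        "undefined": sorted(assigned - defined),
    }
```

Run against the sample, the check reports dev_ops_role as defined but not assigned to any group; the dev_ops group receives engineer_role and ops_role instead. Whether that is intended depends on the deployment.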

Sentry Configuration File

Sentry stores the configuration as well as privilege policies in files. The sentry-site.xml file contains configuration options such as privilege policy file location. The Policy File contains the privileges and groups. It has a .ini file format and should be stored on HDFS.

The following is an example of a sentry-site.xml file.

sentry-site.xml

<configuration>
  <property>
    <name>sentry.provider</name>
    <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
  </property>

  <property>
    <name>sentry.solr.provider.resource</name>
    <value>/path/to/authz-provider.ini</value>
    <!--
        If the HDFS configuration files (core-site.xml, hdfs-site.xml)
        pointed to by SOLR_HDFS_CONFIG in /etc/default/solr
        point to HDFS, the path will be in HDFS;
        alternatively you could specify a full path,
        e.g. hdfs://namenode:port/path/to/authz-provider.ini
    -->
  </property>