The Sentry Service

The Sentry service is an RPC server that stores authorization metadata in an underlying relational database and provides RPC interfaces to retrieve and manipulate privileges. It supports secure access to services using Kerberos. The service serves authorization metadata from the database-backed storage; it does not handle actual privilege validation. The Hive, Impala, and Solr services are clients of this service and enforce Sentry privileges when configured to use Sentry.

The motivation behind introducing a new Sentry service is to make user privileges easier to manage than with the existing policy file approach. Providing a service instead of a file allows you to use the more traditional GRANT/REVOKE statements to modify privileges.

The rest of this topic will walk you through the prerequisites for Sentry, basic terminology, and the Sentry privilege model.

Prerequisites

  • CDH 5.1.x (or higher) managed by Cloudera Manager 5.1.x (or higher). See the Cloudera Manager Administration Guide and Cloudera Installation for instructions.
  • HiveServer2 and the Hive Metastore running with strong authentication. For HiveServer2, strong authentication is either Kerberos or LDAP. For the Hive Metastore, only Kerberos is considered strong authentication (to override, see Securing the Hive Metastore).
  • Impala 1.4.0 (or higher) running with strong authentication. With Impala, either Kerberos or LDAP can be configured to achieve strong authentication.
  • Cloudera Search for CDH 5.1.0 or higher. Solr supports using Sentry beginning with CDH 5.1.0. Functionality was added in different releases:
    • Sentry with policy files was added in CDH 5.1.0.
    • Sentry with config support was added in CDH 5.5.0.
    • Sentry with the database-backed Sentry service was added in CDH 5.8.0.
  • Implement Kerberos authentication on your cluster. For instructions, see Enabling Kerberos Authentication Using the Wizard.

Terminology

  • An object is an entity protected by Sentry's authorization rules. The objects supported in the current release are server, database, table, URI, collection, and config.
  • A role is a collection of rules for accessing a given object.
  • A privilege is granted to a role to govern access to an object. With CDH 5.5, Sentry allows you to assign the SELECT privilege to columns (only for Hive and Impala). Supported privileges are:

    Valid privilege types and the objects they apply to

    Privilege | Object
    INSERT | DB, TABLE
    SELECT | SERVER, DB, TABLE, COLUMN
    UPDATE | COLLECTION, CONFIG
    QUERY | COLLECTION, CONFIG
    ALL | SERVER, TABLE, DB, URI, COLLECTION, CONFIG

  • A user is an entity that is permitted by the authentication subsystem to access the service. This entity can be a Kerberos principal, an LDAP userid, or an artifact of some other supported pluggable authentication system.
  • A group connects the authentication system with the authorization system. It is a collection of one or more users who have been granted one or more authorization roles. Sentry allows a set of roles to be configured for a group.
  • A configured group provider determines a user’s affiliation with a group. The current release supports HDFS-backed groups and locally configured groups.
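
The relationships among users, groups, roles, and privileges defined above can be sketched in a few lines of Python. This is an illustrative model only, not Sentry's API or storage schema: the dictionary names, the function privileges_for, and all of the sample data are hypothetical.

```python
# Illustrative model only: these dictionaries and the sample data are
# hypothetical and do not reflect Sentry's actual storage or API.

# Group provider: maps each user to the groups the user belongs to.
GROUPS_OF_USER = {
    "alice": {"analysts"},
    "bob": {"analysts", "etl"},
}

# Sentry configuration: a set of roles is assigned to each group.
ROLES_OF_GROUP = {
    "analysts": {"read_sales"},
    "etl": {"load_sales"},
}

# Each role holds privileges, written here as (privilege, object) pairs.
PRIVILEGES_OF_ROLE = {
    "read_sales": {("SELECT", "db=sales")},
    "load_sales": {("INSERT", "db=sales->table=orders")},
}


def privileges_for(user):
    """Resolve user -> groups -> roles -> privileges."""
    result = set()
    for group in GROUPS_OF_USER.get(user, set()):
        for role in ROLES_OF_GROUP.get(group, set()):
            result |= PRIVILEGES_OF_ROLE.get(role, set())
    return result
```

Note that privileges are never attached to a user directly in this model: a user acquires them only through group membership, which matches the role of the group as the link between authentication and authorization.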

Privilege Model

Sentry uses a role-based privilege model with the following characteristics.
  • Allows any user to execute SHOW FUNCTION, DESC FUNCTION, and SHOW LOCKS.
  • Allows the user to see only those tables, databases, collections, and configs for which the user has privileges.
  • Requires a user to have the necessary privileges on the URI to execute HiveQL operations that specify a location. Examples of such operations include LOAD, IMPORT, and EXPORT.
  • Privileges granted on URIs are recursively applied to all subdirectories. That is, privileges only need to be granted on the parent directory.
  • CDH 5.5 introduces column-level access control for tables in Hive and Impala. Previously, Sentry supported privilege granularity only down to a table. Hence, if you wanted to restrict access to a column of sensitive data, the workaround was to first create a view for a subset of columns, and then grant privileges on that view. To reduce the administrative overhead associated with such an approach, Sentry now allows you to assign the SELECT privilege on a subset of columns in a table.

For more information, see Authorization Privilege Model for Hive and Impala.
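
The recursive behavior of URI privileges described above can be illustrated with a small path check. This is a minimal sketch under stated assumptions, not Sentry's own matching logic: the function name uri_covers and the example host nn:8020 are hypothetical, and the real service may normalize URIs differently.

```python
import posixpath
from urllib.parse import urlparse


def uri_covers(granted_uri, requested_uri):
    """Return True if requested_uri is granted_uri or lies beneath it."""
    granted = urlparse(granted_uri)
    requested = urlparse(requested_uri)
    # Scheme and authority (for example, hdfs://nn:8020) must match.
    if (granted.scheme, granted.netloc) != (requested.scheme, requested.netloc):
        return False
    # Normalize so that trailing slashes and ".." segments cannot
    # make a path look like it sits under the granted directory.
    granted_path = posixpath.normpath(granted.path or "/")
    requested_path = posixpath.normpath(requested.path or "/")
    if granted_path == requested_path:
        return True
    # Compare whole path components so /data/sales does not cover /data/sales2.
    prefix = granted_path.rstrip("/") + "/"
    return requested_path.startswith(prefix)
```

With this check, a grant on hdfs://nn:8020/data/sales covers hdfs://nn:8020/data/sales/2016/01 but not hdfs://nn:8020/data/sales2.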

User to Group Mapping

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

Group mappings in Sentry can be summarized as in the figure below.


The Sentry service uses only HadoopUserGroup mappings. For details on configuring LDAP group mappings in Hadoop, see Configuring LDAP Group Mappings.

Authorization Privilege Model for Hive and Impala

Privileges can be granted on different objects in the Hive warehouse. Any privilege that can be granted is associated with a level in the object hierarchy. If a privilege is granted on a container object in the hierarchy, the base object automatically inherits it. For instance, if a user has ALL privileges on the database scope, then that user has ALL privileges on all of the base objects contained within that scope.

Object Hierarchy in Hive

Server
     URI
     Database
         Table
             Partition
             Columns
         View

Valid privilege types and objects they apply to

Privilege | Object
INSERT | DB, TABLE
SELECT | DB, TABLE, VIEW, COLUMN
ALL | SERVER, TABLE, DB, URI

Privilege hierarchy

Base Object | Granular privileges on object | Container object that contains the base object | Privileges on container object that imply privileges on the base object
DATABASE | ALL | SERVER | ALL
TABLE | INSERT | DATABASE | ALL
TABLE | SELECT | DATABASE | ALL
COLUMN | SELECT | DATABASE | ALL
VIEW | SELECT | DATABASE | ALL
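
The inheritance rule above, where a privilege on a container object carries down to the base objects inside it, can be sketched as a prefix test on the object path. This is a simplified illustration and not Sentry's evaluation code: the function name implies, the tuple representation, and the sample names server1, sales, and orders are all hypothetical, and URIs are ignored.

```python
def implies(grant, request):
    """Return True if `grant` covers `request`.

    Both arguments are (privilege, path) pairs, where path is a tuple that
    walks down the hierarchy, for example ("server1", "sales", "orders").
    """
    granted_priv, granted_path = grant
    wanted_priv, wanted_path = request
    # The granted object must be the requested object or one of its containers.
    is_container = wanted_path[:len(granted_path)] == granted_path
    # ALL covers every privilege; otherwise the privilege must match.
    priv_ok = granted_priv == "ALL" or granted_priv == wanted_priv
    return is_container and priv_ok
```

For example, a grant of ("ALL", ("server1", "sales")) covers a request for ("SELECT", ("server1", "sales", "orders")), while a table-level grant never covers a request on the enclosing database.
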

Privilege table for Hive & Impala operations

Operation | Scope | Privileges Required | URI
CREATE DATABASE | SERVER | ALL |
DROP DATABASE | DATABASE | ALL |
CREATE TABLE | DATABASE | ALL |
DROP TABLE | TABLE | ALL |
CREATE VIEW | DATABASE; SELECT on TABLE | ALL |
  - This operation is allowed if you have column-level SELECT access to the columns being used.
ALTER VIEW | VIEW/TABLE | ALL |
  - This operation is allowed if you have column-level SELECT access to the columns being used.
DROP VIEW | VIEW/TABLE | ALL |
ALTER TABLE .. ADD COLUMNS | TABLE | ALL |
ALTER TABLE .. REPLACE COLUMNS | TABLE | ALL |
ALTER TABLE .. CHANGE column | TABLE | ALL |
ALTER TABLE .. RENAME | TABLE | ALL |
ALTER TABLE .. SET TBLPROPERTIES | TABLE | ALL |
ALTER TABLE .. SET FILEFORMAT | TABLE | ALL |
ALTER TABLE .. SET LOCATION | TABLE | ALL | URI
ALTER TABLE .. ADD PARTITION | TABLE | ALL |
ALTER TABLE .. ADD PARTITION location | TABLE | ALL | URI
ALTER TABLE .. DROP PARTITION | TABLE | ALL |
ALTER TABLE .. PARTITION SET FILEFORMAT | TABLE | ALL |
SHOW CREATE TABLE | TABLE | SELECT/INSERT |
SHOW PARTITIONS | TABLE | SELECT/INSERT |
SHOW TABLES | TABLE | SELECT/INSERT |
  - Output includes all the tables for which the user has table-level privileges and all the tables for which the user has some column-level privileges.
SHOW GRANT ROLE | TABLE | SELECT/INSERT |
  - Output includes an additional field for any column-level privileges.
DESCRIBE TABLE | TABLE | SELECT/INSERT |
  - Output shows all columns if the user has table-level privileges or the SELECT privilege on at least one table column.
LOAD DATA | TABLE | INSERT | URI
SELECT | VIEW/TABLE; COLUMN | SELECT |
  - You can grant the SELECT privilege on a view to give users access to specific columns of a table they do not otherwise have access to.
  - See Column-level Authorization for details on allowed column-level operations.
INSERT OVERWRITE TABLE | TABLE | INSERT |
CREATE TABLE .. AS SELECT | DATABASE; SELECT on TABLE | ALL |
  - This operation is allowed if you have column-level SELECT access to the columns being used.
USE <dbName> | Any | |
CREATE FUNCTION | SERVER | ALL |
ALTER TABLE .. SET SERDEPROPERTIES | TABLE | ALL |
ALTER TABLE .. PARTITION SET SERDEPROPERTIES | TABLE | ALL |

Hive-Only Operations

Operation | Scope | Privileges Required | URI
INSERT OVERWRITE DIRECTORY | TABLE | INSERT | URI
ANALYZE TABLE | TABLE | SELECT + INSERT |
IMPORT TABLE | DATABASE | ALL | URI
EXPORT TABLE | TABLE | SELECT | URI
ALTER TABLE TOUCH | TABLE | ALL |
ALTER TABLE TOUCH PARTITION | TABLE | ALL |
ALTER TABLE .. CLUSTERED BY SORTED BY | TABLE | ALL |
ALTER TABLE .. ENABLE/DISABLE | TABLE | ALL |
ALTER TABLE .. PARTITION ENABLE/DISABLE | TABLE | ALL |
ALTER TABLE .. PARTITION.. RENAME TO PARTITION | TABLE | ALL |
MSCK REPAIR TABLE | TABLE | ALL |
ALTER DATABASE | DATABASE | ALL |
DESCRIBE DATABASE | DATABASE | SELECT/INSERT |
SHOW COLUMNS | TABLE | SELECT/INSERT |
  - Output for this operation filters columns to which the user does not have explicit SELECT access.
CREATE INDEX | TABLE | ALL |
DROP INDEX | TABLE | ALL |
SHOW INDEXES | TABLE | SELECT/INSERT |
GRANT PRIVILEGE | Allowed only for Sentry admin users | |
REVOKE PRIVILEGE | Allowed only for Sentry admin users | |
SHOW GRANT | Allowed only for Sentry admin users | |
SHOW TBLPROPERTIES | TABLE | SELECT/INSERT |
DESCRIBE TABLE .. PARTITION | TABLE | SELECT/INSERT |
ADD JAR | Not Allowed | |
ADD FILE | Not Allowed | |
DFS | Not Allowed | |

Impala-Only Operations

Operation | Scope | Privileges Required | URI
EXPLAIN | TABLE; COLUMN | SELECT |
INVALIDATE METADATA | SERVER | ALL |
INVALIDATE METADATA <table name> | TABLE | SELECT/INSERT |
REFRESH <table name> or REFRESH <table name> PARTITION (<partition_spec>) | TABLE | SELECT/INSERT |
DROP FUNCTION | SERVER | ALL |
COMPUTE STATS | TABLE | ALL |
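
To show how a row of the operations table is read (a scope, the privileges accepted on that scope, and whether a URI privilege is also needed), the sketch below encodes a handful of rows and checks a request against them. It rests on two assumptions that the table does not state outright: that "SELECT/INSERT" means either privilege is sufficient, and that ALL satisfies any requirement. The dictionary layout and the function name is_allowed are hypothetical, and inheritance from container objects is left out for brevity.

```python
# A few rows copied from the operations table. The tuple layout
# (scope, accepted privileges, URI privilege needed) and all names
# here are hypothetical, chosen only for illustration.
REQUIREMENTS = {
    "CREATE DATABASE": ("SERVER", {"ALL"}, False),
    "DROP TABLE": ("TABLE", {"ALL"}, False),
    "SHOW PARTITIONS": ("TABLE", {"SELECT", "INSERT"}, False),
    "LOAD DATA": ("TABLE", {"INSERT"}, True),
    "EXPORT TABLE": ("TABLE", {"SELECT"}, True),
}


def is_allowed(operation, held, has_uri_privilege=False):
    """Check one operation against the privileges held on its scope.

    `held` maps a scope name such as "TABLE" to the set of privileges the
    user holds on the object at that scope.
    """
    scope, accepted, needs_uri = REQUIREMENTS[operation]
    privileges = held.get(scope, set())
    # Assumption: ALL satisfies any requirement, and "SELECT/INSERT" in the
    # table means that either privilege is enough.
    has_privilege = "ALL" in privileges or bool(privileges & accepted)
    return has_privilege and (has_uri_privilege or not needs_uri)
```

Under these assumptions, a user holding only INSERT on a table can run LOAD DATA only when the URI privilege is also present, which mirrors the URI column in the table.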

Authorization Privilege Model for Solr

The tables below refer to the request handlers defined in the generated solrconfig.xml.secure. If you are not using this configuration file, the tables below may not apply.

admin is a special collection in Sentry used to represent administrative actions. A non-administrative request may only require privileges on the collection or config on which the request is being performed. This is called either collection1 or config1 in this appendix. An administrative request may require privileges on both the admin collection and collection1. This is denoted as admin, collection1 in the tables below.
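
The "admin, collection1" notation can be read as a requirement that the same privilege be held on every listed collection. The following minimal sketch illustrates that reading; the function names required and is_permitted and the sample data are hypothetical and are not part of Solr or Sentry.

```python
def required(privilege, collections):
    """Build the set of (collection, privilege) pairs a request needs."""
    return {(name, privilege) for name in collections}


def is_permitted(held, needed):
    """A request is permitted only if every needed pair is held."""
    return needed.issubset(held)


# A non-administrative update needs UPDATE on collection1 only.
update_request = required("UPDATE", ["collection1"])

# An administrative action such as reload needs UPDATE on both
# the special admin collection and collection1.
reload_request = required("UPDATE", ["admin", "collection1"])
```

A user who holds UPDATE on collection1 alone would pass the first check but fail the second until UPDATE on admin is granted as well.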

Privilege table for non-administrative request handlers

Request Handler | Required Collection Privilege | Collections that Require Privilege
select | QUERY | collection1
query | QUERY | collection1
get | QUERY | collection1
browse | QUERY | collection1
tvrh | QUERY | collection1
clustering | QUERY | collection1
terms | QUERY | collection1
elevate | QUERY | collection1
analysis/field | QUERY | collection1
analysis/document | QUERY | collection1
update | UPDATE | collection1
update/json | UPDATE | collection1
update/csv | UPDATE | collection1

Privilege table for collections admin actions

Collection Action | Required Collection Privilege | Collections that Require Privilege
create | UPDATE | admin, collection1
delete | UPDATE | admin, collection1
reload | UPDATE | admin, collection1
createAlias | UPDATE | admin, collection1
deleteAlias | UPDATE | admin, collection1
syncShard | UPDATE | admin, collection1
splitShard | UPDATE | admin, collection1
deleteShard | UPDATE | admin, collection1

Privilege table for core admin actions

Collection Action | Required Collection Privilege | Collections that Require Privilege
create | UPDATE | admin, collection1
rename | UPDATE | admin, collection1
load | UPDATE | admin, collection1
unload | UPDATE | admin, collection1
status | UPDATE | admin, collection1
persist | UPDATE | admin
reload | UPDATE | admin, collection1
swap | UPDATE | admin, collection1
mergeIndexes | UPDATE | admin, collection1
split | UPDATE | admin, collection1
prepRecover | UPDATE | admin, collection1
requestRecover | UPDATE | admin, collection1
requestSyncShard | UPDATE | admin, collection1
requestApplyUpdates | UPDATE | admin, collection1

Privilege table for Info and AdminHandlers

Request Handler | Required Collection Privilege | Collections that Require Privilege
LukeRequestHandler | QUERY | admin
SystemInfoHandler | QUERY | admin
SolrInfoMBeanHandler | QUERY | admin
PluginInfoHandler | QUERY | admin
ThreadDumpHandler | QUERY | admin
PropertiesRequestHandler | QUERY | admin
LoggingHandler | QUERY, UPDATE (or *) | admin
ShowFileRequestHandler | QUERY | admin

Privilege table for Config Admin actions

Config Action | Required Collection Privilege | Collections that Require Privilege | Required Config Privilege | Configs that Require Privilege
CREATE | UPDATE | admin | * | config1
DELETE | UPDATE | admin | * | config1