HiveServer2 Security Configuration
- Kerberos authentication
- LDAP authentication
Starting with CDH 5.7, clusters running LDAP-enabled HiveServer2 deployments also accept Kerberos authentication. This ensures that users are not forced to enter usernames/passwords manually, and are able to take advantage of the multiple authentication schemes SASL offers. In CDH 5.6 and lower, HiveServer2 stops accepting delegation tokens when any alternate authentication is enabled.
Kerberos authentication is supported between the Thrift client and HiveServer2, and between HiveServer2 and secure HDFS. LDAP authentication is supported only between the Thrift client and HiveServer2.
To configure HiveServer2 to use one of these authentication modes, set the hive.server2.authentication configuration property.
- Enabling Kerberos Authentication for HiveServer2
- Using LDAP Username/Password Authentication with HiveServer2
- Configuring LDAPS Authentication with HiveServer2
- Pluggable Authentication
- Trusted Delegation with HiveServer2
- HiveServer2 Impersonation
- Securing the Hive Metastore
- Disabling the Hive Security Configuration
Enabling Kerberos Authentication for HiveServer2
If you configure HiveServer2 to use Kerberos authentication, HiveServer2 acquires a Kerberos ticket during startup. HiveServer2 requires a principal and keytab file specified in the configuration. Client applications (for example, JDBC or Beeline) must have a valid Kerberos ticket before initiating a connection to HiveServer2.
Configuring HiveServer2 for Kerberos-Secured Clusters
To enable Kerberos Authentication for HiveServer2, add the following properties in the /etc/hive/conf/hive-site.xml file:
<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>hive/_HOST@YOUR-REALM.COM</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>/etc/hive/conf/hive.keytab</value>
</property>
- hive.server2.authentication is a client-facing property that controls the type of authentication HiveServer2 uses for connections to clients. In this case, HiveServer2 uses Kerberos to authenticate incoming clients.
- The hive/_HOST@YOUR-REALM.COM value in the example above is the Kerberos principal for the host where HiveServer2 is running. The string _HOST is replaced at run time by the fully qualified domain name (FQDN) of the host machine where the daemon is running. Reverse DNS must be working on all hosts configured this way. Replace YOUR-REALM.COM with the name of the Kerberos realm your Hadoop cluster is in.
- The /etc/hive/conf/hive.keytab value in the example above is a keytab file for that principal.
If you configure HiveServer2 to use both Kerberos authentication and secure impersonation, JDBC clients and Beeline can specify an alternate session user. If these clients have proxy user privileges, HiveServer2 impersonates the alternate user instead of the connecting user. The alternate user is specified in the JDBC connection string as proxyUser=userName.
Configuring JDBC Clients for Kerberos Authentication with HiveServer2 (Using the Apache Hive Driver in Beeline)
JDBC-based clients must include principal=<hive.server2.authentication.kerberos.principal> in the JDBC connection string. For example:
String url = "jdbc:hive2://node1:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM";
Connection con = DriverManager.getConnection(url);
where hive is the principal configured in hive-site.xml and HiveServer2Host is the host where HiveServer2 is running.
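The pieces of the connection string described above can be assembled programmatically. The following is a minimal sketch, not part of the Hive driver: the class name KerberosJdbcUrl, its build method, and the host node1.example.com are hypothetical and chosen only for illustration. It assumes the service principal has the form serviceName/host@REALM, as in the example above.

```java
/** Sketch: assembles a HiveServer2 JDBC URL for Kerberos authentication. */
final class KerberosJdbcUrl {

    // Produces jdbc:hive2://<host>:<port>/<database>;principal=<service>/<host>@<realm>
    static String build(String host, int port, String database,
                        String serviceName, String realm) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + serviceName + "/" + host + "@" + realm;
    }

    public static void main(String[] args) {
        // Hypothetical host and realm; replace with the values for your cluster.
        String url = build("node1.example.com", 10000, "default", "hive", "YOUR-REALM.COM");
        System.out.println(url);
        // With the Hive JDBC driver on the classpath and a valid Kerberos ticket,
        // this URL would be passed to java.sql.DriverManager.getConnection(url).
    }
}
```

Building the URL from its parts keeps the host in the principal consistent with the host being connected to, which is the usual source of mismatches when the string is typed by hand.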
Using Beeline to Connect to a Secure HiveServer2
Use the following command to start beeline and connect to a secure HiveServer2 process. In this example, the HiveServer2 process is running on localhost at port 10000:
$ /usr/lib/hive/bin/beeline
beeline> !connect jdbc:hive2://localhost:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM
0: jdbc:hive2://localhost:10000/default>
For more information about the Beeline CLI, see Using the Beeline CLI.
For instructions on encrypting communication with the ODBC/JDBC drivers, see Configuring Encrypted Communication Between HiveServer2 and Client Drivers.
Using LDAP Username/Password Authentication with HiveServer2
As an alternative to Kerberos authentication, you can configure HiveServer2 to use user and password validation backed by LDAP. The client sends a username and password during connection initiation. HiveServer2 validates these credentials using an external LDAP service.
Enabling LDAP Authentication with HiveServer2 using Active Directory
For managed clusters, use Cloudera Manager:
- In the Cloudera Manager Admin Console, click Hive in the list of components, and then select the Configuration tab.
- Type "ldap" in the Search text box to locate the LDAP configuration fields.
- Check Enable LDAP Authentication.
- Enter the LDAP URL in the format ldap[s]://<host>:<port>
- Enter the Active Directory Domain for your environment.
- Click Save Changes.
For unmanaged clusters, use the command line:
Add the following properties to hive-site.xml:
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>LDAP_URL</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.Domain</name>
  <value>AD_DOMAIN_ADDRESS</value>
</property>
The LDAP_URL value is the access URL for your LDAP server, for example, ldap[s]://<host>:<port>.
Enabling LDAP Authentication with HiveServer2 using OpenLDAP
To enable LDAP authentication using OpenLDAP, include the following properties in hive-site.xml:
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>LDAP_URL</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>LDAP_BaseDN</value>
</property>
- The LDAP_URL value is the access URL for your LDAP server.
- The LDAP_BaseDN value is the base LDAP DN for your LDAP server; for example, ou=People,dc=example,dc=com.
Configuring JDBC Clients for LDAP Authentication with HiveServer2
JDBC-based clients must include user=LDAP_Userid;password=LDAP_Password in the JDBC connection string. For example:
String url = "jdbc:hive2://node1:10000/default;user=LDAP_Userid;password=LDAP_Password";
Connection con = DriverManager.getConnection(url);
where the LDAP_Userid value is the user ID and LDAP_Password is the password of the client user.
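As an alternative to embedding the credentials in the URL, standard JDBC also lets a client pass them separately through DriverManager.getConnection(String, Properties), using the conventional "user" and "password" keys. The sketch below only builds that Properties object; the class name LdapJdbcCredentials is hypothetical, and it is an assumption here that your version of the Hive JDBC driver reads these keys the same way it reads the URL parameters.

```java
import java.util.Properties;

/** Sketch: keeps LDAP credentials out of the JDBC URL string. */
final class LdapJdbcCredentials {

    // Builds the Properties object accepted by DriverManager.getConnection(url, props).
    static Properties of(String userId, String password) {
        Properties props = new Properties();
        props.setProperty("user", userId);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) {
        String url = "jdbc:hive2://node1:10000/default";
        Properties props = of("LDAP_Userid", "LDAP_Password");
        // Print only the user name; avoid logging the password.
        System.out.println("Connecting to " + url + " as " + props.getProperty("user"));
        // With the Hive JDBC driver on the classpath:
        // java.sql.Connection con = java.sql.DriverManager.getConnection(url, props);
    }
}
```

Keeping the password out of the URL makes it less likely to end up in logs or error messages that print the connection string.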
For ODBC Clients, see Cloudera ODBC Driver for Apache Hive.
Enabling LDAP Authentication for HiveServer2 in Hue
|auth_username|LDAP username of Hue user to be authenticated.|
|auth_password|LDAP password of Hue user to be authenticated.|
Configuring LDAPS Authentication with HiveServer2
HiveServer2 supports LDAP username/password authentication for clients. Clients send LDAP credentials to HiveServer2 which in turn verifies them against the configured LDAP provider, such as OpenLDAP or Microsoft Active Directory. Most implementations now support LDAPS (LDAP over TLS/SSL), an authentication protocol that uses TLS/SSL to encrypt communication between the LDAP service and its client (in this case, HiveServer2) to avoid sending LDAP credentials in cleartext.
To configure the LDAPS service with HiveServer2:
- Import the LDAP server CA certificate or the server certificate into a truststore on the HiveServer2 host. If you import the CA certificate, HiveServer2 will trust any server with a certificate issued by the LDAP server's CA. If you only import the server certificate, HiveServer2 trusts only that server. See Creating Java Keystores and Truststores for more details.
- Make sure the truststore file is readable by the hive user.
- Set the hive.server2.authentication.ldap.url configuration property in hive-site.xml to the LDAPS URL. For example, ldaps://sample.myhost.com.
- If this is a managed cluster, in Cloudera Manager, go to the Hive service and select Configuration. Under Category, select Security. In the right panel, search for HiveServer2 TLS/SSL Certificate Trust Store File, and add the path to the truststore file that you created in step 1.
If you are using an unmanaged cluster, set the environment variable HADOOP_OPTS as follows:
- Restart HiveServer2.
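Before restarting, it can help to confirm that the configured URL really uses the ldaps scheme and to see which port it resolves to. The following is a small sketch using only java.net.URI; the class name LdapUrlCheck is hypothetical, and the fallback ports (636 for LDAPS, 389 for plain LDAP) are the conventional defaults, assumed here because the example URL above does not specify a port.

```java
import java.net.URI;

/** Sketch: sanity checks for the value of hive.server2.authentication.ldap.url. */
final class LdapUrlCheck {

    // True when the URL uses the ldaps scheme (LDAP over TLS/SSL).
    static boolean isLdaps(String ldapUrl) {
        return "ldaps".equalsIgnoreCase(URI.create(ldapUrl).getScheme());
    }

    // Returns the explicit port if present, otherwise the conventional default.
    static int effectivePort(String ldapUrl) {
        int port = URI.create(ldapUrl).getPort();
        if (port != -1) {
            return port;
        }
        return isLdaps(ldapUrl) ? 636 : 389;
    }

    public static void main(String[] args) {
        String url = "ldaps://sample.myhost.com";
        System.out.println(url + " uses LDAPS: " + isLdaps(url)
                + ", port: " + effectivePort(url));
    }
}
```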
Pluggable Authentication
Pluggable authentication allows you to provide a custom authentication provider for HiveServer2.
To enable pluggable authentication:
- Set the following properties in /etc/hive/conf/hive-site.xml:
<property>
  <name>hive.server2.authentication</name>
  <value>CUSTOM</value>
  <description>Client authentication types.
    NONE: no authentication check
    LDAP: LDAP/AD based authentication
    KERBEROS: Kerberos/GSSAPI authentication
    CUSTOM: Custom authentication provider
            (Use with property hive.server2.custom.authentication.class)
  </description>
</property>
<property>
  <name>hive.server2.custom.authentication.class</name>
  <value>pluggable-auth-class-name</value>
  <description>
    Custom authentication class. Used when property
    'hive.server2.authentication' is set to 'CUSTOM'. Provided class
    must be a proper implementation of the interface
    org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
    will call its Authenticate(user, passed) method to authenticate requests.
    The implementation may optionally extend the Hadoop's
    org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
  </description>
</property>
- Make the class available in the CLASSPATH of HiveServer2.
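To illustrate what the body of such a provider might look like, here is a minimal sketch of the credential check on its own. It deliberately does not implement org.apache.hive.service.auth.PasswdAuthenticationProvider so that it compiles without the Hive libraries; in a real deployment the class would declare that interface and expose the same logic through the Authenticate method named in the property description. The class name StaticUserCheck and the in-memory user table are hypothetical, and comparing plain-text passwords is for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.AuthenticationException;

/** Sketch: the check a custom HiveServer2 authentication provider could perform. */
final class StaticUserCheck {

    private final Map<String, String> passwordsByUser;

    StaticUserCheck(Map<String, String> passwordsByUser) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.passwordsByUser = new HashMap<>(passwordsByUser);
    }

    // True only when the user exists and the password matches.
    boolean isValid(String user, String password) {
        return user != null && password != null
                && password.equals(passwordsByUser.get(user));
    }

    // Returns normally on success and throws on failure, which is the
    // contract a provider's Authenticate(user, password) method follows.
    void authenticate(String user, String password) throws AuthenticationException {
        if (!isValid(user, password)) {
            throw new AuthenticationException("Invalid credentials for user: " + user);
        }
    }

    public static void main(String[] args) throws AuthenticationException {
        Map<String, String> users = new HashMap<>();
        users.put("alice", "secret"); // hypothetical test account
        StaticUserCheck check = new StaticUserCheck(users);
        check.authenticate("alice", "secret");
        System.out.println("alice authenticated");
    }
}
```

A production provider would normally delegate to an external credential store rather than hold passwords in memory.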
Trusted Delegation with HiveServer2
HiveServer2 determines the identity of the connecting user from the underlying authentication subsystem (Kerberos or LDAP). Any new session started for this connection runs on behalf of this connecting user. If the server is configured to proxy the user at the Hadoop level, then all MapReduce jobs and HDFS accesses will be performed with the identity of the connecting user. If Apache Sentry is configured, then this connecting userid can also be used to verify access rights to underlying tables and views.
In CDH 4.5, a connecting user (for example, hue) with Hadoop-level superuser privileges can request an alternate user for the given session. HiveServer2 checks whether the connecting user has Hadoop-level privileges to proxy the requested userid (for example, bob). If it does, the new session runs on behalf of the alternate user, bob, as requested by the connecting user, hue.
# Log in as superuser hue
kinit -k -t hue.keytab hue@MY-REALM.COM
# Connect using the following JDBC connection string
# jdbc:hive2://myHost.myOrg.com:10000/default;principal=hive/_HOST@MY-REALM.COM;hive.server2.proxy.user=bob
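A client that connects on behalf of many users typically appends the proxy user to a fixed base URL. The following is a minimal sketch of that step; the class name ProxyUserUrl is hypothetical, and rejecting ';' and '=' in the user name is a precaution assumed here so that a user name cannot introduce additional session variables into the connection string.

```java
/** Sketch: appends hive.server2.proxy.user to an existing HiveServer2 JDBC URL. */
final class ProxyUserUrl {

    static String withProxyUser(String baseUrl, String proxyUser) {
        // Refuse names that would alter the structure of the connection string.
        if (proxyUser == null || proxyUser.isEmpty()
                || proxyUser.contains(";") || proxyUser.contains("=")) {
            throw new IllegalArgumentException("Invalid proxy user name");
        }
        return baseUrl + ";hive.server2.proxy.user=" + proxyUser;
    }

    public static void main(String[] args) {
        String base = "jdbc:hive2://myHost.myOrg.com:10000/default;principal=hive/_HOST@MY-REALM.COM";
        System.out.println(withProxyUser(base, "bob"));
    }
}
```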
HiveServer2 Impersonation
Impersonation in HiveServer2 allows users to execute queries and access HDFS files as the connected user rather than the super user who started the HiveServer2 daemon. This enforces an access control policy at the file level using HDFS file permissions or ACLs.
Keeping impersonation enabled means Sentry does not have end-to-end control over the authorization process. While Sentry can enforce access control policies on tables and views in the Hive warehouse, it has no control over permissions on the underlying table files in HDFS. Hence, even if users do not have the Sentry privileges required to access a table in the warehouse, as long as they have permission to access the corresponding table file in HDFS, any jobs or queries submitted will bypass Sentry authorization checks and execute successfully.
To configure Sentry correctly, restrict ownership of the Hive warehouse to hive:hive and disable Hive impersonation as described here.
To enable impersonation in HiveServer2:
- Add the following property to the /etc/hive/conf/hive-site.xml file and set the value to true. (The default value is
<property>
  <name>hive.server2.enable.impersonation</name>
  <description>Enable user impersonation for HiveServer2</description>
  <value>true</value>
</property>
- In HDFS or MapReduce configurations, add the following properties to the core-site.xml file:
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
See also File System Permissions.
Securing the Hive Metastore
To prevent users from accessing the Hive metastore and the Hive metastore database using any method other than through HiveServer2, the following actions are recommended:
- Add a firewall rule on the metastore service host to allow access to the metastore port only from the HiveServer2 host. You can do this using iptables.
- Grant access to the metastore database only from the metastore service host. This is specified for MySQL as:
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'metastorehost';
where metastorehost is the host where the metastore service is running.
- Make sure users who are not admins cannot log on to the host on which HiveServer2 runs.
Disabling the Hive Security Configuration
Hive's security-related metadata is stored in the configuration file hive-site.xml. The following sections describe how to disable security for the Hive service.
Disable Client/Server Authentication
To disable client/server authentication, set hive.server2.authentication to NONE. For example:
<property>
  <name>hive.server2.authentication</name>
  <value>NONE</value>
  <description>
    Client authentication types.
    NONE: no authentication check
    LDAP: LDAP/AD based authentication
    KERBEROS: Kerberos/GSSAPI authentication
    CUSTOM: Custom authentication provider
            (Use with property hive.server2.custom.authentication.class)
  </description>
</property>
Disable Hive Metastore Security
To disable Hive Metastore security, perform the following steps:
- Set the hive.metastore.sasl.enabled property to false in all configurations, both on the metastore service side and for all clients of the metastore. For example, these might include HiveServer2, Impala, Pig, and so on.
- Remove or comment out the following parameters in hive-site.xml for the metastore service. Note that this is a server-only change.