Using User-Defined Functions (UDFs) with HiveServer2

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

Perform one of the following procedures, depending on whether you have enabled Sentry, to use custom user-defined functions (UDFs) with Hive.

Without Sentry Enabled

  1. Copy the UDF JAR files to HDFS. Save the JARs to any directory you choose, and make a note of the path.
  2. You should now be able to run the CREATE FUNCTION command and point to the JAR from Hive. For example:
    CREATE FUNCTION addfunc AS 'com.example.hiveserver2.udf.add' USING JAR 'hdfs:///tmp/add.jar'

With Sentry Enabled

  1. Copy the UDF JAR file to HDFS.
  2. Copy the JAR file to the host on which HiveServer2 is running. Save the JARs to any directory you choose, and make a note of the path.
  3. In the Cloudera Manager Admin Console, go to the Hive service.
  4. Click the Configuration tab.
  5. Expand the Service-Wide > Advanced categories
  6. Configure the Hive Auxiliary JARs Directory property with the HiveServer2 host path from the Step 2.
  7. Click Save Changes. The JARs are added to HIVE_AUX_JARS_PATH environment variable.
  8. Redeploy the Hive client configuration.
    1. In the Cloudera Manager Admin Console, go to the Hive service.
    2. From the Actions menu at the top right of the service page, select Deploy Client Configuration.
    3. Click Deploy Client Configuration.
  9. Restart the Hive service. If the Hive Auxiliary JARs Directory property is configured but the directory does not exist, HiveServer2 will not start.
  10. Grant privileges on the JAR files to the roles that require access. You can use the Hive SQL GRANT statement to do so. For example, to grant privileges on the add.jar file:
    GRANT ALL ON URI 'hdfs:///tmp/add.jar' TO ROLE EXAMPLE_ROLE
  11. Run the CREATE FUNCTION command and point to the JAR from Hive. For example:
    CREATE FUNCTION addfunc AS 'com.example.hiveserver2.udf.add' USING JAR 'hdfs:///tmp/add.jar'