Using Policies to Automate Metadata Tagging

Cloudera Navigator lets you automate the application of metadata to specific entity classes using policies that you define. A policy specifies the actions to be performed by Navigator Metadata Server and the conditions under which to apply them. For data stewards who want to facilitate self-service discovery in their organizations, the metadata policy feature streamlines metadata management tasks. For example, you can define policies that:
  • Add tags (custom metadata) to entities as they are ingested by the cluster
  • Move entities of a specific class to a specific target path, or to the trash
  • Add managed metadata to entities
  • Run commands that take action on specific classes of entity
  • Send messages to a JMS message queue for notifications. This requires configuring the JMS server on the Cloudera Management Service. See Configuring a JMS Server for Policy Messages for details.
Messages sent to JMS queues are formatted as JSON and contain the metadata of the entity to which the policy should apply and the policy's specified message text. For example:
 {
  "entity":entity_property,
  "userMessage":"some message text"
 }
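
A consumer reading from the queue can decode such a message with any JSON parser. The following is a minimal sketch in Python, assuming the message body has already been received as text. The sample payload and the originalName and type properties inside "entity" are hypothetical placeholders, not a documented schema.

```python
import json


def parse_policy_message(body):
    """Split a policy notification into its entity metadata and message text."""
    message = json.loads(body)
    # "entity" and "userMessage" are the two keys shown in the format above.
    return message["entity"], message["userMessage"]


# Hypothetical payload: the properties inside "entity" are illustrative only.
sample_body = (
    '{"entity": {"originalName": "sales.csv", "type": "FILE"}, '
    '"userMessage": "some message text"}'
)

entity, user_message = parse_policy_message(sample_body)
print(user_message)
```

How the message body is retrieved from the queue depends on the JMS client in use and is outside the scope of this sketch.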

Policies are executed in the home directory of their creator and can only take actions (run commands) for which the creator has privileges. At runtime, a policy fails if the user account of the policy creator does not have privileges to perform all the commands specified in the policy.
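
The privilege requirement can be pictured with a short sketch. This is an illustrative model only, not Navigator's implementation: the missing_privileges function, the command names, and the privilege strings are all hypothetical, and the sketch assumes each command needs exactly one privilege.

```python
def missing_privileges(commands, creator_privileges):
    """Return the names of commands the policy creator is not allowed to run.

    Hypothetical model: each command is a (name, required_privilege) pair.
    """
    return [name for name, required in commands if required not in creator_privileges]


# Illustrative data: the creator can write to the archive path only.
commands = [
    ("move", "write:/data/archive"),
    ("move_to_trash", "write:/data/raw"),
]
creator_privileges = {"write:/data/archive"}

blocked = missing_privileges(commands, creator_privileges)
# The policy can succeed only if the creator may perform every command.
policy_can_run = not blocked
print(blocked, policy_can_run)
```

In this model a single blocked command is enough for the policy to fail, which mirrors the rule that the creator must have privileges for all commands in the policy.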

Policy ownership can be changed by cloning the policy:

  1. Log in to the Cloudera Navigator console using the account to which you want to transfer ownership.
  2. Clone the policy.
  3. Log out from this account, and then log in again using the account of the original policy owner.
  4. Delete or disable the original policy.

Certain actions can be specified using Java expressions. See Metadata Policy Expressions for details.

Creating Policies

Required Role: Policy Editor (or Full Administrator)

These steps begin from the Cloudera Navigator console.

  1. Click the Policies tab.
  2. Click the New Policy button.

  3. Click the Enable checkbox.
  4. Enter a meaningful name for the policy.
  5. In the Search Query field, specify the search query that defines the class of entities to which the policy should apply.
  6. Click the Test Query link to see a list of results based on the query and revise the query as needed.
  7. Enter a Policy Description to document the functional aspects of this policy in the context of your organization. This field is optional, but a description is recommended, especially if your organization has many policies defined for use by different teams or departments.
  8. For a policy that includes Java expressions, enter the classes that the expressions will use in the Import Statements field. See Metadata Policy Expressions for details, including examples and a class reference.
  9. Choose the schedule for applying the policy:
    • On Change - When the entities matching the search query change.
    • Immediate - When the policy is created.
    • Once - At the time specified in the Start Time field.
    • Recurring - At recurring times specified by the Start and End Time fields at the interval specified in the Interval field.
    For the Once and Recurring options, specify dates and times using the calendar, date, and time tools available for the setting.
  10. Follow the appropriate procedure for the actions performed by the policy:
    • Metadata Assignments: Specify the custom metadata or managed metadata to be assigned. Optionally, you can define Java expressions for the fields that support them. Select the Expression checkbox to enable this capability. The following fields support expressions:
      • Name
      • Description
      • Managed Metadata
      • Key-Value Pairs
    • Command Actions: Select Add Action > Move to Trash or Add Action > Move. For a move, specify the location to move the entity to in the Target Path field. If you specify multiple actions, they are run in the order in which they are specified.

      Command actions are supported only for HDFS entities. If you configure a command action for unsupported entities, a runtime error is logged when the policy runs.

      See Viewing Command Action Status.

    • JMS Notifications: If not already configured, configure a JMS server and queue. Specify the queue name and message. Optionally, check the Expression checkbox and specify a policy expression for the message.
  11. Click Save.
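
For the Recurring schedule in step 9, the relationship between the Start Time, End Time, and Interval fields can be sketched as follows. This is a rough illustration, not Navigator's scheduler: the recurring_run_times function is hypothetical, and the sketch assumes that the end time is inclusive, which the procedure above does not state.

```python
from datetime import datetime, timedelta


def recurring_run_times(start, end, interval):
    """List run times from start through end, one per interval.

    Assumption: a run that falls exactly on the end time is included.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


# Illustrative values: run every two hours over a six-hour window.
runs = recurring_run_times(
    start=datetime(2024, 1, 1, 0, 0),
    end=datetime(2024, 1, 1, 6, 0),
    interval=timedelta(hours=2),
)
for run in runs:
    print(run.isoformat())
```

By contrast, the Once schedule corresponds to a single run at the Start Time, and Immediate corresponds to a single run when the policy is created.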

Viewing Policies

Required Role: Policy Viewer (or Policy Editor, or Full Administrator)

  1. Log in to the Cloudera Navigator console.
  2. Click the Policies tab.
  3. In a policy row, click a policy name link or select Actions > View. The policy detail page is displayed.

    You can also edit, copy, or delete a policy from the policy details page by clicking the Actions button.



Enabling and Disabling Policies

As a policy administrator, you can manage access to policies by enabling and disabling them.

Required Role: Policy Editor (or Full Administrator)

  1. Log in to the Cloudera Navigator console.
  2. Click the Policies tab.
  3. In a policy row, click a policy name link or select Actions > Enable or Actions > Disable.

Copying and Editing a Policy

If you have an existing policy that you want to use as a template for another, similar policy, you can copy it and then make any required adjustments. You can also edit an existing policy if you need to make changes to it.

Required Role: Policy Editor (or Full Administrator)

  1. Log in to the Cloudera Navigator console.
  2. Click the Policies tab.
  3. In a policy row, select Actions > Copy or Actions > Edit. You can also click the policy row and then on the policy details page, select Actions > Copy or Actions > Edit.
  4. Edit the policy name, search query, or policy actions.
  5. Click Save.

Deleting Policies

Required Role: Policy Editor (or Full Administrator)

  1. Log in to the Cloudera Navigator console.
  2. Click the Policies tab.
  3. In a policy row, select Actions > Delete, and then click OK to confirm.