Defining Properties for Managed Metadata

Required Role: Metadata Administrator (or Full Administrator)

You can use managed metadata to add typed metadata to classes of entities. You can add namespaces and properties.

A namespace is a container for properties. Four namespaces are reserved:
  • nav (Navigator metadata classes, for example, fselement and user-defined custom fields)
  • up (custom metadata)
  • tp (technical properties)
  • xt (partner applications)

The combination of namespace and property name must be unique.

A property can be one of the following types:
  • Boolean
  • date
  • integer
  • long
  • float
  • double
  • text (with optional maximum length and regular expression validation criteria)
  • enum (of string)

A property can be single-valued or multivalued.
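
To make the type rules above concrete, the following is a minimal Python sketch of a property definition with value validation. It is an illustrative model only, not Navigator code: the PropertyDef class, its field names, and the validate method are hypothetical, and it assumes that a regular expression must match the entire text value.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional, Sequence


# Hypothetical model of a managed metadata property; not part of Navigator.
@dataclass
class PropertyDef:
    namespace: str
    name: str
    type: str  # boolean, date, integer, long, float, double, text, or enum
    multi_valued: bool = False
    max_length: Optional[int] = None   # applies to text only
    pattern: Optional[str] = None      # applies to text only
    enum_values: Sequence[str] = ()    # applies to enum only

    def _check_one(self, value: Any) -> bool:
        """Check a single value against the property type and constraints."""
        if self.type == "boolean":
            return isinstance(value, bool)
        if self.type == "date":
            return isinstance(value, datetime)
        if self.type in ("integer", "long"):
            # bool is a subclass of int in Python, so exclude it explicitly.
            return isinstance(value, int) and not isinstance(value, bool)
        if self.type in ("float", "double"):
            return isinstance(value, float)
        if self.type == "text":
            if not isinstance(value, str):
                return False
            if self.max_length is not None and len(value) > self.max_length:
                return False
            if self.pattern is not None and re.fullmatch(self.pattern, value) is None:
                return False
            return True
        if self.type == "enum":
            return value in self.enum_values
        raise ValueError(f"unknown property type: {self.type}")

    def validate(self, value: Any) -> bool:
        """Multivalued properties take a list; single-valued take one value."""
        if self.multi_valued:
            return isinstance(value, (list, tuple)) and all(
                self._check_one(v) for v in value
            )
        return self._check_one(value)


# Example definitions mirroring the emailFrom and emailTo properties.
email_from = PropertyDef("MailAnnotation", "emailFrom", "text", max_length=64)
email_to = PropertyDef("MailAnnotation", "emailTo", "text", multi_valued=True)
```

The sketch does not distinguish the numeric ranges of integer and long, or of float and double; a fuller model would add range checks for each.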

After metadata properties have been assigned to specific entities in the cluster, the property types and values can be used as filters for Cloudera Navigator Search, to find specific entities.

Creating Custom Properties with the Cloudera Navigator Console

To create custom properties:

  1. Log in to the Cloudera Navigator console.
  2. Click the Administration link in the upper right. The Managed Metadata tab displays the list of namespaces and the properties defined in the namespaces.
  3. Click the New Property... button.
  4. In the Classes field, click the filter icon or start typing the name of one of the built-in Cloudera Navigator entity classes.
  5. Select the class of entities to which the property applies. To clear the field, hover over the field and click the delete icon (x) that displays at the right of the field.
  6. Click the Namespace field and select a namespace. If the Namespace drop-down list is empty, click Create Namespace....
    1. Specify a namespace name and optional description.
    2. Click Continue.
  7. Add the name for the property.
  8. Specify an optional description.
  9. Select the Multivalued Enable checkbox if the property can have more than one value. For example, an emailFrom property should accept only one value, but an emailTo property could accept more than one value.
  10. In the Type drop-down list, select the property type and specify constraints on the value.
    • Boolean - A true or false value.
    • Date - Date and time.
    • Enumeration - A set of values. In the Enumeration field, type valid enumeration values and press Enter or Tab.
    • Number - A number. In the Number Type field, select the type of the number: Integer, Long, Float, Double.
    • Text - A string.
      • Maximum Length - The maximum length of the string.
      • Regular Expression - A regular expression that determines whether a string is a valid value. You can test the expression by clicking Show regex tester, entering input that you expect to match the expression, and clicking Execute test. In the following example, the expression tester indicates that test@example.com matches the defined expression.

  11. Click Continue to Review. The Review screen displays.
  12. Click Create to create the property, Cancel to return to the Properties page, or Back to Edit Property to continue editing the property.
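
Before entering a regular expression for a Text property, it can help to try it against sample input outside the console, much as the regex tester in step 10 does. The following Python sketch is one way to do that. The pattern shown is a hypothetical email-like expression (the expression used in the console example is not given here), and Python's re module may differ from the console's regular expression engine for advanced syntax, so treat a passing result as a first check only.

```python
import re


def regex_matches(pattern: str, sample: str) -> bool:
    """Return True if the whole sample string matches the pattern."""
    return re.fullmatch(pattern, sample) is not None


# Hypothetical pattern for a simple email-like value (an assumption,
# not the expression from the console example).
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

print(regex_matches(EMAIL_PATTERN, "test@example.com"))  # prints True
print(regex_matches(EMAIL_PATTERN, "not-an-email"))      # prints False
```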

Example Properties

The following figure shows two properties in the namespace MailAnnotation that apply to entities of the HDFS Entity class (HDFS files and directories). The emailFrom property is of type TEXT and can be assigned a single value. The MailTo property is also of type TEXT but can have multiple values.

Using Cloudera Navigator Console to Manage Properties

You can view summary details for a managed metadata property by clicking the property name in the Properties table, or by clicking the Actions box in the property row and then clicking View in the drop-down menu.

You can also edit some aspects of a property, delete and restore a property, and purge a deleted property.

Editing a Property

After a property is created, you can edit property data in the following ways:
  • Add classes to which the property applies
  • Add and remove enumeration values
  • Change the description
  • Change the maximum length
  • Change the regex pattern
  1. Log in to the Cloudera Navigator console using credentials for one of the following administrator roles:
    • Cloudera Manager Full Administrator
    • Cloudera Manager Navigator Administrator
    • Cloudera Navigator Full Administrator
    • Cloudera Navigator Metadata Administrator
  2. Click the Administration link in the upper right. The Managed Metadata tab displays the list of namespaces and the properties defined in the namespaces.
  3. Open the property Edit page by clicking the Actions box in the property row and then clicking Edit in the drop-down menu.
  4. In the Additional Class field, click the filter icon or type the name of a Cloudera Navigator entity class. For example, start typing "Hive" to see the matching class names.
  5. Select the class of entities to which the property applies. To clear the field, hover over the field and click the delete icon (x) that displays at the right of the field.
  6. In the Description field, add a description or edit an existing description.
  7. If the property is of the Enumeration type, you can add or remove values in the Enumeration field.
  8. For Text properties:
    • In the Maximum Length field, add or change the value for the maximum length.
    • In the Regular Expression field, edit the expression. Click Show regex tester to test input against any changes you make.
  9. Click Continue to Review. The Review screen displays.
  10. Click Update to commit the change, Back to Edit Property to continue editing the property, or Cancel to return to the Properties page.

Deleting, Restoring, and Purging Managed Metadata Properties

After a property is deleted, it cannot be assigned to entities. However, the property still exists and the entities that have already been tagged with the property retain the tag until the purge operation is run.

Deleted properties display status as Deleted in the Cloudera Navigator console (on the Managed Metadata tab of the Administration menu). For example, the EmailFrom property has been deleted:


The Status column is displayed only to users with the Navigator Administrator or Managed & Custom Metadata Editor user roles.

The Cloudera Navigator purge process permanently removes properties and any values from all entities. Policies that assign metadata using a property that has been purged will fail the next time they are run. Because deleted properties are not removed from the system until they have been purged, the name of any deleted property cannot be re-used until after purging the system.
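
The delete, restore, and purge rules described above can be summarized as a small state model. The following Python sketch is illustrative only: the PropertyRegistry class and its methods are hypothetical and do not correspond to any Navigator API, and the Active status name is an assumption (only the Deleted status is named in the console).

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "Active"    # assumed name for a property that is not deleted
    DELETED = "Deleted"


class PropertyRegistry:
    """Hypothetical in-memory model of the delete/restore/purge lifecycle."""

    def __init__(self):
        self.status = {}  # (namespace, name) -> Status
        self.values = {}  # (namespace, name) -> {entity_id: value}

    def create(self, namespace, name):
        key = (namespace, name)
        if key in self.status:
            # Also true for deleted properties that have not been purged.
            raise ValueError(f"{namespace}.{name} already exists")
        self.status[key] = Status.ACTIVE
        self.values[key] = {}

    def assign(self, namespace, name, entity_id, value):
        key = (namespace, name)
        if self.status.get(key) is not Status.ACTIVE:
            raise ValueError(f"{namespace}.{name} cannot be assigned")
        self.values[key][entity_id] = value

    def delete(self, namespace, name):
        key = (namespace, name)
        if key not in self.status:
            raise KeyError(key)
        # Existing values are retained until the purge runs.
        self.status[key] = Status.DELETED

    def restore(self, namespace, name):
        key = (namespace, name)
        if self.status.get(key) is not Status.DELETED:
            raise ValueError(f"{namespace}.{name} is not deleted")
        self.status[key] = Status.ACTIVE

    def purge(self):
        """Remove every deleted property and its values; return the count."""
        deleted = [k for k, s in self.status.items() if s is Status.DELETED]
        for key in deleted:
            del self.status[key]
            del self.values[key]
        return len(deleted)
```

In this model, creating a property whose name matches a deleted property fails until purge has run, which mirrors the name re-use restriction.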

Deleting a Property

  1. In the Properties table, for the property that you are deleting, click the Actions button, and then click Delete in the drop-down menu.
  2. In the Delete Property dialog box, review the property deletion information. If any entities are affected, a View affected entities link appears; click it to see all entities that use the property.
  3. Click Confirm Delete to delete the property, or click Cancel.

Restoring a Property

If you have not yet purged a deleted property, you can restore it.

  • In the Properties table, for the property that you are restoring, click the Actions button, and then click Restore in the drop-down menu.

Purging a Property

You can permanently remove deleted properties by purging them. All values assigned to the deleted properties are lost; however, the affected entities are not deleted. Purging permanently removes all properties marked as Deleted in the Status column.
  1. In the Properties table, click Purge Deleted Properties. The Purge all Deleted Properties dialog box opens, describing the effects of the purge and reporting the number of entities that use the property.
  2. In the Purge all Deleted Properties dialog box, click Confirm Purge to permanently remove all deleted properties, or click Cancel to return to the Properties page.

Navigator Built-in Classes

Class                       Description
HDFS Dataset                Logical dataset backed by a path in HDFS.
HDFS Dataset Field          Field in an HDFS dataset.
HDFS Entity                 HDFS file or directory.
Hive Column                 Column in a Hive table.
Hive Database               Hive database.
Hive Partition              Partition of a Hive table.
Hive Query                  Hive query template.
Hive Query Execution        Instance of a Hive query.
Hive Query Part             Component of a Hive query that maps specific input columns to output columns.
Hive Table                  A Hive table.
Hive View                   View on one or more Hive tables.
Impala Query                Impala query template.
Impala Query Execution      Instance of an Impala query.
Impala Query Part           Component of an Impala query that maps specific input columns to output columns.
Job Instance                Instance of a MapReduce, YARN, or Spark job.
Job Template                Template for a MapReduce, YARN, or Spark job.
Oozie Workflow              Template for an Oozie workflow.
Oozie Workflow Instance     Instance of an Oozie workflow.
Pig Field                   Field for a relation in Pig; similar to a column in a Hive table.
Pig Operation               Template for a Pig transformation.
Pig Operation Execution     Instance of a Pig transformation.
Pig Relation                Pig relation; similar to a Hive table.
S3 Bucket                   A bucket in S3.
S3 Object                   A file or directory in an S3 bucket.
Sqoop Export Sub-operation  Sqoop export component that connects specific columns.
Sqoop Import Query          Sqoop import job with query options.
Sqoop Import Sub-operation  Sqoop import component that connects specific columns.
Sqoop Operation Execution   Instance of a Sqoop job.
Sqoop Table Export          Sqoop table export operation template.
Sqoop Table Import          Sqoop table import operation template.
User Sub-operation          User-specified sub-operation of a MapReduce or YARN job; used for specifying custom column-level lineage.

Defining Metadata with the Navigator API and Navigator SDK

In addition to accessing and defining metadata with the Cloudera Navigator console, you can also use the Cloudera Navigator API and the Navigator SDK.

For information on the Navigator API, see Cloudera Navigator APIs.
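
As a rough illustration of defining a namespace and a property through the API rather than the console, the following Python sketch only builds the request URLs and JSON bodies; it does not send anything. Everything specific in it is an assumption to verify against the API documentation for your Navigator version: the host name, port, and API version in BASE_URL, the endpoint paths, and the JSON field names.

```python
import json

# Assumed values: host, port, and API version vary by deployment.
BASE_URL = "http://navigator-host.example.com:7187/api/v13"


def namespace_request(name, description=""):
    """Build (url, body) for creating a namespace (assumed endpoint)."""
    url = f"{BASE_URL}/models/namespaces"
    body = json.dumps({"name": name, "description": description})
    return url, body


def property_request(namespace, name, prop_type, multi_valued=False):
    """Build (url, body) for creating a property (assumed endpoint and fields)."""
    url = f"{BASE_URL}/models/namespaces/{namespace}/properties"
    body = json.dumps(
        {
            "name": name,
            "namespace": namespace,
            "type": prop_type,
            "multiValued": multi_valued,
        }
    )
    return url, body


url, body = property_request("MailAnnotation", "emailTo", "TEXT", multi_valued=True)
print(url)
print(body)
```

Each pair could then be sent as an authenticated POST request with a Content-Type of application/json, using an account that has the Metadata Administrator or Full Administrator role.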

The Navigator SDK is a client library that can be used to extract metadata directly from Navigator Metadata Server or to enrich metadata with custom metadata models, entities, and relationships. The SDK is available on GitHub in the cloudera/navigator-sdk repository.