Cloudera Navigator Metadata Architecture

Cloudera Navigator metadata features provide data discovery and data lineage functions. The Cloudera Navigator metadata architecture is illustrated below.



The Navigator Metadata Server performs the following functions:
  • Obtains connection information about CDH services from the Cloudera Manager Server
  • Extracts metadata for the entities managed by those services at periodic intervals
  • Manages and applies metadata extraction policies during metadata extraction
  • Indexes and stores entity metadata
  • Manages authorization data for Navigator users
  • Manages audit report metadata
  • Generates metadata and audit analytics
  • Implements the Navigator UI and API
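
The extraction-related functions in the list above can be pictured as one periodic cycle: ask each service for its entities, apply policies to each entity, then index the result. The following is a minimal in-memory sketch of that flow, not the Navigator Metadata Server's actual implementation. The names `Entity`, `Policy`, `Extractor`, and `run_extraction_cycle` are hypothetical, as is the representation of a policy as a (predicate, action) pair.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Entity:
    """Hypothetical stand-in for an extracted entity."""
    service: str
    name: str
    tags: List[str] = field(default_factory=list)


# Assumption: a policy is a (predicate, action) pair. This is an
# illustrative shape, not Navigator's policy format.
Policy = Tuple[Callable[[Entity], bool], Callable[[Entity], None]]
Extractor = Callable[[], Iterable[Entity]]
Index = Dict[Tuple[str, str], Entity]


def run_extraction_cycle(extractors: Dict[str, Extractor],
                         policies: List[Policy],
                         index: Index) -> None:
    """Run one extraction pass: extract, apply policies, then index."""
    for extract in extractors.values():
        for entity in extract():
            for matches, action in policies:
                if matches(entity):
                    action(entity)
            # Store the entity keyed by (service, name).
            index[(entity.service, entity.name)] = entity


# Example: two toy services and one policy that tags HDFS entities.
extractors = {
    "hdfs": lambda: [Entity("hdfs", "/data/raw/events.csv")],
    "hive": lambda: [Entity("hive", "default.events")],
}
policies = [(lambda e: e.service == "hdfs", lambda e: e.tags.append("raw"))]
index: Index = {}
run_extraction_cycle(extractors, policies, index)
```

In a real deployment the cycle would run on a schedule and the index would be persistent. Here both are reduced to a single function call and a dictionary.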

The Navigator Metadata database stores entity metadata, policies, user authorization metadata, audit report metadata, and analytic data.

The Cloudera Navigator Metadata Server manages metadata about the entities in a CDH cluster and relations between the entities. The metadata schema defines the types of metadata that are available for each entity type it supports.
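
To make the entity-and-relation model concrete, here is a small sketch that treats relations as typed edges between entities and follows data flow edges to find what feeds a given entity. It uses the three relation kinds this section names (parent-child, data flow, and instance of). The class and function names are hypothetical, and the traversal is a plain breadth-first search rather than anything taken from Navigator itself.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set


class RelationKind(Enum):
    PARENT_CHILD = "parent-child"
    DATA_FLOW = "data flow"
    INSTANCE_OF = "instance of"


@dataclass(frozen=True)
class Relation:
    """A typed, directed edge between two entities (identified by name)."""
    kind: RelationKind
    source: str
    target: str


def upstream(entity: str, relations: Iterable[Relation]) -> Set[str]:
    """Return every entity that reaches `entity` through data flow edges."""
    incoming = defaultdict(list)
    for rel in relations:
        if rel.kind is RelationKind.DATA_FLOW:
            incoming[rel.target].append(rel.source)

    seen: Set[str] = set()
    queue = deque([entity])
    while queue:
        for src in incoming[queue.popleft()]:
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen
```

Only data flow edges are followed, so a parent-child relation such as database-to-table does not show up as lineage in this sketch.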

The types of metadata defined by the Navigator Metadata component include the name of an entity, the service that manages or uses the entity, the entity type, the path to the entity, the date and time of creation, access, and modification, the size, the owner, the purpose, and the relations between entities (parent-child, data flow, and instance of). For example, the following shows the property sheet of a file entity:



For all entities, as shown in the Details tab, there are two classes of metadata:
  • technical metadata - metadata defined when entities are extracted. You cannot modify technical metadata.
  • custom metadata - metadata added to extracted entities. You can add and modify custom metadata before and after entities are extracted.
In addition, Hive entities have extended attributes.
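
The split between the two classes of metadata can be sketched as a read-only record paired with an editable one. This is only an illustration of the distinction described above, assuming a hypothetical `TechnicalMetadata` / `CustomMetadata` layout. The technical fields are drawn from the metadata types this section lists, while the custom fields (`description`, `tags`, `properties`) are assumed for the example.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TechnicalMetadata:
    """Set at extraction time; frozen, so attributes cannot be reassigned."""
    name: str
    source_service: str
    entity_type: str
    path: str
    created: datetime
    size_bytes: int
    owner: str


@dataclass
class CustomMetadata:
    """User-supplied metadata; the field names here are assumptions."""
    description: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Entity:
    technical: TechnicalMetadata
    custom: CustomMetadata = field(default_factory=CustomMetadata)


# Example: custom metadata can change, technical metadata cannot.
entity = Entity(TechnicalMetadata(
    name="events.csv", source_service="hdfs", entity_type="FILE",
    path="/data/raw/events.csv", created=datetime(2016, 1, 1),
    size_bytes=1024, owner="etl"))
entity.custom.tags.append("raw")

try:
    entity.technical.owner = "someone-else"  # rejected: frozen dataclass
    technical_is_read_only = False
except FrozenInstanceError:
    technical_is_read_only = True
```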