Cloudera Navigator Administration Tasks

The Administration tab of the Cloudera Navigator console is the starting point for several configuration and maintenance tasks.

Maintaining Metadata Store Using Purge

The volume of metadata maintained by Navigator Metadata Server can grow quickly, exceed the capacity of the Solr instance that processes the index, and interfere with search speed and data lineage. For example, with stale metadata (properties no longer used to tag metadata in the system), lineage may take too long to display, or may show relationships that no longer exist.

For faster search and cleaner lineage tracing, Cloudera Navigator's Purge function prunes metadata that has been deleted and that has aged beyond its usefulness. Purging before upgrading Cloudera Navigator to a new release can also speed up the upgrade and guard against memory errors. See Avoiding Out-of-Memory Errors During an Upgrade for details.

The Purge function can be used in a few different ways:

Scheduling the Purge Process

Use the Cloudera Navigator console to configure a schedule for a regular weekly Purge of deleted and stale metadata from your Cloudera Navigator instance, specifically, the Navigator Metadata Server and its associated database.

To configure the Purge schedule:
  1. Log in to the Cloudera Navigator console using an account with privileges as either Cloudera Manager Full Administrator or Navigator Administrator. The URL to access the Cloudera Navigator console directly (rather than from within Cloudera Manager) using the default port on the host running the Navigator Metadata Server role would be as follows:
    http://fqdn-1.example.com:7187/login.html
  2. Enter your administrator user account and password at the login page.
  3. Click the Purge Settings tab. The current Metadata and Lineage purge schedule displays, along with a list of up to five upcoming scheduled purges and a list of up to five most recently completed purges.
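
The list of upcoming purges follows directly from the weekly day and time settings. As a rough illustration only, the sketch below computes the next five weekly run times in plain Python. The function name upcoming_purges and its parameters are hypothetical and are not part of Cloudera Navigator; the console computes its own list.

```python
from datetime import datetime, timedelta

# Day names indexed the way datetime.weekday() numbers them (Monday == 0).
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def upcoming_purges(now, day, hour, count=5):
    """Return the next `count` weekly run times after `now`.

    Hypothetical helper for illustration; not a Cloudera Navigator API.
    """
    days_ahead = (DAYS.index(day) - now.weekday()) % 7
    first = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if first <= now:
        # This week's slot has already passed, so start from next week.
        first += timedelta(weeks=1)
    return [first + timedelta(weeks=i) for i in range(count)]
```

For example, upcoming_purges(datetime.now(), "Saturday", 0) returns the next five Saturdays at midnight, assuming a schedule set to Saturday at 12 Midnight.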

To change the existing schedule:
  1. Click the Edit button.
  2. Set the day, time, maximum purge duration, and retention period for deleted entities (Purge entities deleted more than*) to the values that best suit your environment. See the descriptions and usage notes for these settings in the table below.
    Property | Range of selectable values | Usage note
    How often | Weekly | Not configurable. The Purge runs weekly per your specifications for Day and Time.
    Day | Days of week, Sunday through Saturday | Select a day for the purge that will have minimal impact on your user community.
    Time | Hourly time, from 12 Midnight through 11 PM | Select a time that will have minimal impact.
    Maximum purge duration | 10 minutes, 1 hour through 10 hours, 12 hours, 14 hours, 16 hours, 18 hours, 20 hours, 22 hours, 24 hours, 36 hours, 48 hours, 3 days through 7 days, inclusive | Set the amount of time you want to allow for the Purge process to run. The process will not run beyond your specified duration, whether it has completed the purge or not. Entities already purged remain purged even if the process is cut short by this setting. During this timeframe, no other operations in Cloudera Navigator can occur.
    Purge entities deleted more than* | 1 day through 10 days, 20 days through 100 days, 150 days, 365 days | Enter the number of days that must pass after an entity is deleted before the purge process removes it. For example, a setting of 1 day means that entities deleted before yesterday are purged but entities deleted yesterday are retained.
    Purge SELECT operations* | Enable | Select this option to enable Purge for Hive and Impala SELECT operations using the time period selected in the next setting (Only Purge SELECT operations older than*).
    Only Purge SELECT operations older than* | 10 days through 100 days (10-day increments), 150 days, 365 days | The purge includes only those SELECT operations older than this threshold.
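
The retention rule described for Purge entities deleted more than* can be expressed as a simple date comparison. The following is a minimal sketch, assuming the comparison is made on calendar dates rather than exact timestamps; the function name is_purgeable is hypothetical and does not reflect how Navigator Metadata Server implements the check.

```python
from datetime import date


def is_purgeable(deleted_on, today, retention_days):
    """Return True when the entity was deleted more than `retention_days` days ago.

    Illustrative only; assumes whole-day (calendar date) granularity.
    """
    return (today - deleted_on).days > retention_days
```

With a setting of 1 day, an entity deleted yesterday is one day old and is retained, while an entity deleted the day before yesterday is two days old and is purged, which matches the example in the table.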

If you run Hive and Impala queries on your system, you can have these purged as well. Set appropriate thresholds for your use cases. Here is an example of a revised schedule: