Known Issues and Workarounds in Cloudera Navigator Data Management

Access, Login, or Cloudera Navigator Console Issues

Internet Explorer 11 document mode issue

When using Microsoft Internet Explorer 11 to access the Cloudera Navigator console, the login page does not display properly due to an issue with that browser's Compatibility View and certain document modes (see the Microsoft Technet article for details). As of Cloudera Navigator 2.10.1, a warning prompt displays if Internet Explorer's document mode is set to IE7 or lower, to alert you to the issue. Prior releases display a blank page rather than the login page.

Workaround: Open the Compatibility View Settings (under the Internet Explorer options menu) and remove the setting for the domain or website. Alternatively, use another browser to access the Cloudera Navigator console. Supported browsers include Mozilla Firefox, Google Chrome, and Microsoft Edge.

Fixed in versions: Cloudera Manager 5.11.1 (2.10.1)

Cloudera Bugs: NAV-3743, NAV-4381

Error message ("not authorized") when logging in

The Cloudera Navigator console saves the state of the last URL accessed when you log out, and opens the same page at your next login. If two different users log in to Navigator using the same browser tab, the state of the first user applies to the second. If the second user does not have permissions to that section of the page, that user receives an error message.

Workaround: Close the browser tab and log in on a new tab. The state is cleared, and the access error message does not appear.

Fixed in versions: Cloudera Manager 5.11 (2.10)

Cloudera Manager Configuration Issues

Overriding safety valve settings disables audit and lineage features

Customers or third party applications such as Unravel may require that is configured in a HiveServer2 safety valve. Cloudera Manager will comment out the value that is configured if audit or lineage is enabled for Hive. The safety valve content shows the commented code:

<!--'', originally set to
(non-final), is overridden below by a safety valve-->

This automated change disables Navigator's auditing and lineage features without notification.

To work around this problem, edit the HiveServer2 safety valve to enable the Navigator entries.

Cloudera Bug: NAV-5331

Multi-Cluster Environments

Cloudera Navigator is not supported in installations in multi-cluster environments for Cloudera Manager versions 5.12.0, 5.12.1, and 5.13.0

Navigator is not able to uniquely identify more than one cluster in installations of Cloudera Manager that support multiple clusters. This problem only occurs for Cloudera Manager deployments of version 5.12.0, 5.12.1, or 5.13.0. Avoid multi-cluster installations with these releases. This problem does not affect single cluster deployments or Altus clusters.

Fixed in versions: Cloudera Manager 5.14.0 (2.13.0), Cloudera Manager 5.13.1 (2.12.1), Cloudera Manager 5.12.2 (2.11.2)
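
As a quick pre-upgrade check, the affected and fixed releases listed above can be encoded in a small lookup. The following is a minimal sketch in Python; the helper name, its arguments, and the version-string format are illustrative assumptions and not part of any Cloudera tool. It does not model Altus clusters, which are not affected.

```python
# Cloudera Manager releases in which Navigator cannot distinguish multiple
# clusters, mapped to the first fixed release in the same maintenance line
# (taken from the "Fixed in versions" entry for this issue).
AFFECTED_TO_FIXED = {
    "5.12.0": "5.12.2",
    "5.12.1": "5.12.2",
    "5.13.0": "5.13.1",
}


def multicluster_advice(cm_version: str, cluster_count: int) -> str:
    """Return a short recommendation for a CM version (hypothetical helper)."""
    if cluster_count <= 1:
        return "not affected: single-cluster deployment"
    fixed = AFFECTED_TO_FIXED.get(cm_version)
    if fixed is None:
        return "not affected: release is not in the affected list"
    return f"affected: upgrade to Cloudera Manager {fixed} or later"


print(multicluster_advice("5.12.1", 2))
```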

Cloudera Bug: NAV-5615

Table-to-HDFS links not established when Navigator supports multiple, high-availability clusters

When Navigator extracts metadata for multiple clusters and the clusters are configured for high availability operation, Navigator does not correctly link tables to their HDFS backing data. As a result, lineage between Hive tables and their physical data is not created. In addition, some Hive table metadata that is derived from the backing files is not available.

You may see errors in the log such as the following:

2018-02-19 09:01:32,999 ERROR com.cloudera.nav.persist.impl.CompositeLinker [CDHExecutor-0-CDHUrlClassLoader@01010d7e]: Internal error while expected one element but was: <com.cloudera.nav.core.model.Source@b22e68ba, com.cloudera.nav.core.model.Source@5b1bac2c>

Fixed in versions: Cloudera Manager 5.14.0 (2.13.0)

Cloudera Bug: NAV-5749

AWS and Amazon S3

Update AWS Credentials from the same AWS account

After configuring Cloudera Navigator with a specific set of AWS Credentials for Amazon S3, any future changes to the credentials (for example, if you rotate credentials on a regular basis) must be for the same AWS account (IAM user). Changing the AWS Credentials to those of a different IAM user results in errors from the Amazon Simple Queue Service (used transparently by Cloudera Navigator).

Workaround: If a new key is provided to Cloudera Navigator, the key must belong to the same AWS account as the prior key.

Cloudera Bug: NAV-3990

Unnamed folders on Amazon S3 not extracted

Unnamed folders on Amazon S3 are not extracted by Navigator, but the content of the folders is extracted. For example, if the top-level folder in the bucket has no name (as in /bucket//folder/file), the file is extracted as /bucket/folder/file.

Cloudera Bug: NAV-3981

Implicit folders are not marked as deleted

If an implicit folder is deleted in Amazon S3, it does not appear as deleted in Cloudera Navigator console.

Workaround: To prevent folders deleted from Amazon S3 from appearing in Navigator Search results, include implicit:false in the search query.
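
The workaround clause can be appended to any existing search query. The sketch below is a hedged illustration in Python: the endpoint path is copied from the curl example used elsewhere in this document, while the helper names, the use of AND with parentheses to combine clauses, and the sample base query are assumptions to verify against your Navigator version.

```python
from urllib.parse import urlencode


def exclude_implicit(query: str) -> str:
    """Append the implicit:false clause from the workaround to a search query."""
    return f"({query}) AND implicit:false"


def entities_url(nav_url: str, query: str, limit: int = 100) -> str:
    """Build an entities request URL (hypothetical helper).

    The /api/v9/entities/ path mirrors the curl example in this document.
    """
    params = urlencode({"query": query, "limit": limit, "offset": 0})
    return f"{nav_url}/api/v9/entities/?{params}"


# "type:directory" is only a sample base query.
print(entities_url("<nav-url>", exclude_implicit("type:directory")))
```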

Cloudera Bug: NAV-3802

Extraction delayed while Amazon S3 inconsistent

Inconsistencies that occur in AWS (for example, due to eventual consistency) can delay Navigator extraction of metadata and lineage from Amazon S3. When Cloudera Navigator detects an inconsistency, extraction may stop until the inconsistency is resolved in AWS. Cloudera Navigator will retry at the next scheduled extraction.

Cloudera Bug: NAV-4028

Microsoft Azure

Less Secure Credentials Protection Policy can expose Azure credentials in audit logs

When you use Cloudera Manager to configure the ADLS Connector service using the Less Secure option for the Credentials Protection Policy, it is possible for Hive audit logs to include Microsoft Azure credentials. If you are using Navigator Audit Server, these credentials may appear in audit reports. To mitigate this problem, make sure that access to Hive logs is appropriately controlled and that Navigator users with Auditing Viewer roles are cleared to have access to the Hive credentials.

Cloudera Bugs: NAV-5861, CDH-56241

Data Stewardship Dashboard

Counts displayed in Dashboard and Search may differ

Counts for databases, tables, views, and other entities displayed in the Cloudera Navigator Dashboard can differ from Search values for the same objects. The Dashboard displays data from the Navigator Audit Server, which contains actual counts captured continuously. Search (Solr), by contrast, returns data that is periodically extracted from HMS (Hive Metastore Server). Tables, databases, views, or other objects created or destroyed between extracts are not reflected in values displayed by Search (Solr), but they are contained in the audit data displayed in the Dashboard.

Fixed in versions: Cloudera Manager 5.11 (2.10)

Cloudera Bug: NAV-4192

Tables Populated count reflects INSERT and UPDATE statements

The Tables Populated display widget in the Data Stewardship Dashboard reflects the number of times that a table has been loaded with data, such as through INSERT and UPDATE statements—not the number of unique tables loaded. For example, a single table to which data is added (through 6 INSERT statements) and that has also had 4 UPDATE statements submitted in the same period would report Tables Populated as 10.
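
The difference between the two ways of counting can be made concrete with a few lines of Python. This is only an illustration of the arithmetic in the example above; the event list is made up and does not reflect the actual format of Navigator audit data.

```python
# Hypothetical audit events for one reporting period: (table, statement type).
events = [("sales", "INSERT")] * 6 + [("sales", "UPDATE")] * 4

# What the widget reports: one count per load statement.
tables_populated = sum(1 for _, op in events if op in {"INSERT", "UPDATE"})

# What a reader might expect instead: the number of distinct tables loaded.
unique_tables = len({table for table, _ in events})

print(tables_populated)  # 10
print(unique_tables)     # 1
```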

Cloudera Bug: NAV-3886

Hive, Hue, Impala

Hive service configuration in auditing component

For Hive services, the auditing component does not support the "Shutdown" option for the "Queue Policy" property.

Severity: Low

Workaround: None.

Cloudera Bug: OPSAPS-11537

Hue service audit log and Unknown IP address

The IP address in a Hue service audit log displays as "unknown".

Severity: Low

Workaround: None.

Cloudera Bug: OPSAPS-11986

Changing auditing configuration for Hive or Hue requires restart

Whenever a change is made to a Hive service's audit configuration, Beeswax must be restarted so that the Hue service audit log can reflect the change.

Severity: Low

Workaround: None.

Cloudera Bug: OPSAPS-12274

Viewing Navigator tags in Hue overloads Metadata Server heap

When users view Cloudera Navigator tags through Hue, Navigator uses more memory than usual and does not release the memory after the users log out of Hue. Eventually, the calls between Hue and Navigator occupy the majority of the heap space allocated to Navigator Metadata Server.

To work around this problem, you may need to restart the Navigator Metadata Server periodically to clear the heap usage.

Cloudera Bug: NAV-4326

Navigator Metadata Server

Navigator Embedded Solr can reach its limit on number of documents it can store

Navigator Metadata Server extracts HDFS entities by performing a one-time bulk extraction and then switching to incremental extraction. In Cloudera Manager releases 5.10.0, 5.10.1, and 5.11.0 (Navigator releases 2.9.0, 2.9.1, and 2.10.0), a problem causes HDFS bulk extraction to run more than once, which creates duplicate relations for HDFS. Over time, embedded Solr runs out of document IDs that it can assign to new relations and fails with the following error:

"Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483519"       

When this happens, Navigator stops extracting data because no new documents can be added to Solr.
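
The number in the error message is Lucene's per-index document cap, which is the largest 32-bit signed integer minus 128. As a rough illustration of the remaining headroom, the following Python sketch computes it from a document count. The helper name is hypothetical, and the document count is an input you would have to obtain yourself; this page does not describe how to read it from embedded Solr.

```python
# Lucene's hard cap on documents in one index: Integer.MAX_VALUE - 128.
MAX_DOCS = (2**31 - 1) - 128


def remaining_doc_ids(current_docs: int) -> int:
    """Return how many more documents fit before the index hits the cap."""
    return max(MAX_DOCS - current_docs, 0)


print(MAX_DOCS)  # 2147483519
```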

The workaround is to upgrade to Cloudera Manager releases 5.10.2, 5.11.1, or 5.12.x (Navigator 2.9.2, 2.10.1, 2.11.x) or later where this issue is fixed. In addition, see Repairing metadata in the storage directory after upgrading.

Fixed in versions: Cloudera Manager 5.12.x (2.11.x), Cloudera Manager 5.11.1 (2.10.1), Cloudera Manager 5.10.2 (2.9.2)

Cloudera Bug: NAV-5600

Lineage relations are missing for some views with "Endpoint2 must not be null or empty" error

Lineage relations are missing for some views and the Navigator Metadata Server log includes errors with a stack trace that looks something like this:

2017-10-25 19:00:40,780 ERROR com.cloudera.nav.persist.impl.CompositeLinker [CDHExecutor-0-CDHUrlClassLoader@381e3]: Internal error while linking.
java.lang.IllegalStateException: EndPoint2 must not be null or empty

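Note that the exception is "EndPoint2 must not be null or empty". If the errors you see do not exactly match this exception, see one of the following similar errors: Log includes the error "EndPoint1 must not be null" or Log includes the error "EndPoint2 EntityType must not be null unless unlinked".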

This error occurs when Navigator, while determining lineage relations, finds a relation that was created with the target entity missing, so that Endpoint2 is null. This situation can occur when a Hive view is altered without changing the columns extracted from the source table, such as when the name or other table metadata on the view is changed. For example, the following operation would produce the error:

alter view customers_sales_sw set tblproperties("region"="southwest");

This problem can occur in Navigator Metadata Server versions 2.9.0, 2.9.1, 2.9.2, 2.10.0, 2.10.1, 2.10.2, 2.11.0, 2.11.1, 2.12.0, and 2.12.1. These versions correspond to the following Cloudera Manager versions: 5.10.0, 5.10.1, 5.10.2, 5.11.0, 5.11.1, 5.11.2, 5.12.0, 5.12.1, 5.13.0, and 5.13.1.

Plan to upgrade away from an affected Navigator version. If you cannot upgrade right away, you can ignore this error; the consequence is that your logs are noisy and fill up more quickly. See the Knowledge Base article "EndPoint2 must not be null or empty" for more information.

Cloudera Bug: NAV-5661, KB-713

Log includes the error "EndPoint1 must not be null"

The following error may appear in the Navigator Metadata Server log:

2017-10-17 13:00:23,007 ERROR com.cloudera.nav.hive.extractor.AbstractHiveExtractor [CDHExecutor-0-CDHUrlClassLoader@14784b7b]: Unable to parse hive view query *: EndPoint1 must not be null or empty
java.lang.IllegalStateException: EndPoint1 must not be null or empty

This error occurs because the Hive pull extraction for creating a Hive view produces an incorrect lineage relationship for the Hive view. However, Navigator also receives information for the view creation through the push extractor, which correctly produces the lineage relation. You can safely ignore this error. Note that it is distinct from Lineage relations are missing for some views with "Endpoint2 must not be null or empty" error.

Cloudera Bug: NAV-4224

Log includes the error "EndPoint2 EntityType must not be null unless unlinked"

The following error may appear in the Navigator Metadata Server log:

2017-10-26 15:48:27,440 INFO com.cloudera.nav.hdfs.extractor.HdfsOperationHandler [CDHExecutor-0-CDHUrlClassLoader@3eca0c78]: Unable to process rename for path /hbase/data/default/img/f6572262455416bff42f92fd2b0e75c0/.tmp/1cc1af7b0f784572b821980f6b4c5adc: can't find source information.
2017-10-26 15:48:27,440 ERROR com.cloudera.nav.hdfs.client.InotifyClient [CDHExecutor-0-CDHUrlClassLoader@3eca0c78]: Error handling event (txid: 2020059072): Renamed /hbase/data/default/img/f6572262455416bff42f92fd2b0e75c0/.tmp/1cc1af7b0f784572b821980f6b4c5adc to /hbase/data/default/img/f6572262455416bff42f92fd2b0e75c0/D/1cc1af7b0f784572b821980f6b4c5adc at time 1509049550134
java.lang.IllegalStateException: EndPoint2 EntityType must not be null unless unlinked
   at com.cloudera.nav.core.model.Relation.validate(

This error occurs when HDFS files are renamed and the new name is affected by an exclusion filter. The only consequence of the problem is noise in the log. Note that it is distinct from Lineage relations are missing for some views with "Endpoint2 must not be null or empty" error.

This error is fixed in Navigator version 2.13 and later.

If you are not able to upgrade, you can ignore the error.

Cloudera Bug: NAV-4654
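
Because the three EndPoint errors above look alike but call for different responses, it can help to match log lines against the exact exception text. The following Python sketch shows one way to do that. The function names are hypothetical, the guidance strings paraphrase the sections above, and the location of the Navigator Metadata Server log file is left to you.

```python
import re

# Error signatures quoted in the three sections above, paired with the
# guidance each section gives. The patterns do not overlap.
SIGNATURES = [
    (re.compile(r"EndPoint2 must not be null or empty"),
     "NAV-5661: lineage relations missing for some views; upgrade when possible"),
    (re.compile(r"EndPoint1 must not be null or empty"),
     "NAV-4224: safe to ignore"),
    (re.compile(r"EndPoint2 EntityType must not be null unless unlinked"),
     "NAV-4654: log noise only; fixed in Navigator 2.13"),
]


def classify(line: str):
    """Return the guidance for a log line, or None if no signature matches."""
    for pattern, guidance in SIGNATURES:
        if pattern.search(line):
            return guidance
    return None


def summarize(lines):
    """Count how often each known issue appears in an iterable of log lines."""
    counts = {}
    for line in lines:
        guidance = classify(line)
        if guidance is not None:
            counts[guidance] = counts.get(guidance, 0) + 1
    return counts
```

To scan a whole log, pass an open file object to summarize(), for example inside a with open(...) block.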

Purging Deleted Properties causes error viewing entity details

After upgrading to Navigator 2.11.0 (Cloudera Manager 5.12.0) or a later version, up to Navigator 2.13.0, removing deleted managed metadata properties with the "Purge deleted Properties" command (Administration > Managed Metadata) causes a mismatch between the Solr schema and the data stored for entities.

The result is that the Navigator console shows the error "Bad Request" when displaying details for entities affected by the mismatch.

When this problem occurs, the Navigator Metadata Server log will include a message such as the following:

[CDHExecutor-0-CDHUrlClassLoader@62d310e8]: Internal error while linking. java.lang.IllegalArgumentException: java.lang.ClassCastException@2f145bbc          

To avoid this problem, do not purge deleted properties; allow these properties to stay in the list of properties.

Cloudera Bug: NAV-5982

Metadata Server log file and spurious error messages

Certain combinations of operating system and database (such as Ubuntu Linux and PostgreSQL) may raise spurious error messages regarding non-existent files. Such messages, like the following example, can be safely disregarded:
Error: [main]: PWC6351: In TLD scanning, the
supplied resource file:/usr/share/java/oracle-connector-java.jar does not

Fixed in versions: Cloudera Manager 5.10.0 (2.9.0)

Cloudera Bug: NAV-698


The flag for disabling purge is ignored in the UI

The Administration > Purge Settings page may show scheduled purge operations even if the nav.purge.enabled property is set to false. The displayed purge operations will not run.

Policy specifications and cluster names affect purge

Policies cannot use cluster names in queries. Cluster name is a derived attribute and cannot be used as-is.

Workaround: When setting move actions for Cloudera Navigator, if there is only one cluster known to the Navigator instance, remove the clusterName clause.

If there is more than one cluster known to the Navigator instance, replace clusterName with sourceId. To get the sourceId, issue a query in this format:
curl '<nav-url>/api/v9/entities/?query=type%3Asource&limit=100&offset=0'
Use the identity of the matching HDFS service for this cluster as the sourceId.
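
The lookup described above can also be scripted. The following Python sketch runs the same query as the curl example and picks out the HDFS source for one cluster. It is an illustration under stated assumptions: the field names "identity", "sourceType", and "clusterName", the "HDFS" value, and the use of HTTP Basic authentication are assumptions to check against your own API response, and the function names are hypothetical.

```python
import base64
import json
from urllib.request import Request, urlopen


def fetch_sources(nav_url: str, user: str, password: str):
    """Run the type:source query from the curl example and return parsed JSON."""
    url = f"{nav_url}/api/v9/entities/?query=type%3Asource&limit=100&offset=0"
    # Assumption: the API accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(url, headers={"Authorization": f"Basic {token}"})
    with urlopen(request) as response:
        return json.load(response)


def hdfs_source_id(sources, cluster_name: str):
    """Return the identity of the single HDFS source for one cluster.

    Assumes each entity has "identity", "sourceType", and "clusterName" fields.
    """
    matches = [
        s["identity"]
        for s in sources
        if s.get("sourceType") == "HDFS" and s.get("clusterName") == cluster_name
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected one HDFS source for {cluster_name!r}, found {len(matches)}"
        )
    return matches[0]
```

The returned value is what replaces clusterName in the policy query, as a sourceId clause.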

Cloudera Bug: NAV-3288

Purge appears suspended while extraction is running

When you issue a purge command, it does not start if extractors are running. During this time, the maintenance page indicates that maintenance tasks are not running. Once the extraction is complete and purge starts, it shows the status of the purge operations.

Fixed in versions: Cloudera Manager 5.7 (2.6)

Cloudera Bug: NAV-2793


Spark Lineage Limitations and Requirements

Spark lineage diagrams are supported in the Cloudera Manager 5.11/Cloudera Navigator 2.10 release. Spark lineage is supported for Spark 1.6 only, not Spark 2.0 or 2.1. Lineage is not available for Spark when Cloudera Manager is running in single user mode. In addition to these requirements, Spark lineage has the following limitations:
  • Lineage is produced only for data that is read, written, and processed using the DataFrame and SparkSQL APIs. Lineage is not available for data that is read, written, or processed using Spark's RDD APIs.
  • Lineage information is not produced for calls to aggregation functions such as groupBy().
  • The default lineage directory for Spark on YARN is /var/log/spark/lineage. No process or user should write files to this directory; doing so can cause agent failures. In addition, changing the Spark on YARN lineage directory has no effect: the default remains /var/log/spark/lineage.

Cloudera Bug: OPSAPS-39589

Spark extractor enabled using safety valve deprecated

The Spark extractor that was included prior to CDH 5.11 and enabled by setting the safety valve nav.spark.extraction.enable=true is deprecated and could be removed completely in a future release. If you are upgrading to CDH 5.11 from a deployment that had configured this safety valve, be sure to remove the setting when you upgrade.

Upgrade Issues and Limitations

Issues and limitations matrix by release

Before upgrading Cloudera Navigator, review these version-specific release notes:

Each entry below gives the upgrade path (from version, to version), followed by its limitations, requirements, and preliminary tasks.

From 2.10 (and lower) to 2.11.0, 2.11.1, 2.12.0

When using Navigator in multi-cluster environments, avoid upgrading to Cloudera Manager deployments of version 5.12.0, 5.12.1, or 5.13.0 due to a known problem where Navigator does not recognize more than one cluster. Instead, upgrade to Cloudera Manager release 5.12.2, 5.13.1, or 5.14.0 (Navigator 2.11.2, 2.12.1, or 2.13.0). See Cloudera Navigator is not supported in installations in multi-cluster environments for Cloudera Manager versions 5.12.0, 5.12.1, and 5.13.0 for more details.

This problem does not affect single cluster deployments or Altus clusters.

From 2.8 (and lower) to 2.9.0, 2.9.1, 2.10.0

Avoid upgrading to Cloudera Manager releases 5.10.0, 5.10.1, and 5.11.0 (Navigator releases 2.9.0, 2.9.1, and 2.10.0) due to the known problem causing the storage directory to fill beyond its capacity. The workaround is to upgrade to Cloudera Manager release 5.10.2, 5.11.1, or 5.12.x (Navigator 2.9.2, 2.10.1, 2.11.x) or later, where this issue is fixed. See Navigator Embedded Solr can reach its limit on number of documents it can store if you are already running one of these releases.

From 2.10 to 2.10.1

Upgrading Cloudera Manager 5.11.0 (Cloudera Navigator 2.10) to Cloudera Manager 5.11.1 (Cloudera Navigator 2.10.1) results in failures by Navigator Audit Server to publish to Kafka. See Publishing to Kafka fails after upgrade for details and a workaround.

From 2.8 (and lower) to 2.9 (and higher)

Upgrading to Cloudera Navigator 2.9 and higher (Cloudera Manager 5.10 and higher) can take a significant amount of time, depending on the size of the datadir. Before starting to upgrade to Cloudera Manager 5.10 (which automatically starts the upgrade to Cloudera Navigator 2.9), see Upgrading Cloudera Navigator Can be Extremely Slow. Briefly, in this release Solr indexing has been optimized to improve search speed. When the upgrade process completes and Cloudera Navigator services restart, the Solr indexing upgrade automatically begins. No other actions can be performed until Solr indexing completes (progress messages display during this process).

From 2.6 (and lower) to any release

The Cloudera Navigator Metadata Server requires an upgrade of data in the storage directory. See Upgrading Cloudera Navigator for details.

From 2.4.0, 2.4.1 to 2.4.2

After upgrading, you must manually modify the HDFS extractor state file to change the status of UNDELETE taskTypes from SUCCEEDED to FAILED. This manual process is required only for the specific releases listed. See Issues Fixed in Cloudera Navigator 2.4.2 for details.

From any release to 2.4 – 2.6

Cloudera Navigator 2.4 – 2.6 do not support JDK 1.6, so upgrade any instances of JDK 1.6 to JDK 1.7 or 1.8 before upgrading to these releases. See Java Development Kit Installation for details.

From 2.1 to 2.2

Cloudera Navigator 2.1 used the beta version of the Navigator Metadata Server policy engine. Policies created using that version are not retained during the upgrade.

From 2.0 (and lower) to 2.1 (and higher)

The upgrade wizard for this upgrade path adds the Navigator Metadata Server to the Cloudera Manager cluster. The Navigator Metadata Server is new, and is not the same as the existing Navigator Audit Server database.

From 1.2 to 2.0

Cloudera Navigator 1.2 and Cloudera Navigator 2.0 have reached EOL (end of life) status and are no longer supported. There is no wizard for this upgrade path. Cloudera Navigator 2.0 required a clean install. The Navigator 1.2 Navigator Metadata Server was a beta release included with Cloudera Manager 5.0. The 1.2 version of the Navigator Metadata Server role must be removed before the cluster can be upgraded to Cloudera Navigator 2.0 (the roles are not compatible).