Known Issues and Workarounds in Cloudera Navigator Data Management
The sections below provide information about current known issues in the Cloudera Navigator data management component.
- Access or Login
- Audit Server
- AWS and Amazon S3
- Hive, Hue
- Navigator Metadata Server
- Upgrade Limitation (Navigator 1.2 to Navigator 2)
Access or Login
Internet Explorer 11 Document mode issue
When using Microsoft Internet Explorer 11 to access the Cloudera Navigator console, the login page does not display properly due to an issue with that browser's Compatibility View and certain document modes (see the Microsoft Technet article for details). As of Cloudera Navigator 2.10.1, a warning prompt displays if Internet Explorer's document mode is set to IE7 or lower, to alert you to the issue. Prior releases display a blank page rather than the login page.
Workaround: Open the Compatibility View Settings (under the Internet Explorer options menu) and remove the setting for the domain or website. Alternatively, use another browser to access the Cloudera Navigator console. Supported browsers include Mozilla Firefox, Google Chrome, and Microsoft Edge.
Error message ("not authorized") when logging in
The Cloudera Navigator console saves the state of the last URL accessed when you log out, and opens the same page at your next login. If two different users log in to Navigator using the same browser tab, the state of the first user applies to the second. If the second user does not have permission to access that page, that user receives an error message.
Workaround: Close the browser tab and log in on a new tab. The state is cleared, and the access error message does not appear.
Audit Server
Audit logs and audit process drained when audited process is stopped
If an audited role is deleted or migrated to a different host while audits are still waiting to be transferred to Audit Server, those audits may not be transferred. Audits remain pending when they cannot be transferred because Audit Server is down or is unreachable due to a network issue. During role migration, ensure that Audit Server is in a healthy state so that all audited actions make it to Audit Server.
Audit CSV has extra columns and is missing some data
Audit details exported as CSV files do not contain Sentry data. In addition, some column names appear twice (Operation Text, Database Name, and Object Type, for example), although the actual details appear in only one of the duplicate columns.
Workaround: Export audits to JSON format to obtain Sentry data.
AWS and Amazon S3
Navigator requires stable AWS credentials for accessing S3
Once you set up access to Amazon S3 for Navigator with a particular set of AWS credentials, Navigator requires that any future changes to AWS credentials be from the same AWS account, to avoid raising errors from Amazon SQS (Simple Queue Service), which is used transparently by Navigator. Although you can change AWS account credentials in the Cloudera Manager Admin Console, trying to use credentials from a different account with Navigator raises errors and does not succeed.
Workaround: If a new key is provided to Navigator in Cloudera Manager, the key must belong to the same AWS account as the prior key.
Unnamed folders on Amazon S3 not extracted
Unnamed folders on Amazon S3 are not extracted by Navigator, but the content of the folders is extracted. For example, if the top-level folder in the bucket has no name (as in /bucket//folder/file), the file is extracted as /bucket/folder/file.
Implicit folders are not marked as deleted in Navigator
If an implicit folder is deleted in S3, it does not appear as deleted in Navigator.
Workaround: To prevent folders deleted in S3 from appearing in Navigator Search results, include implicit:false in the search query.
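As a rough illustration of this workaround, the sketch below appends the implicit:false clause to an existing search query and URL-encodes the result for use in a request. The helper names and the example clause sourceType:s3 are hypothetical, chosen only for illustration, and the sketch assumes the Solr-style AND operator is accepted by Navigator Search.

```python
from urllib.parse import quote


def exclude_implicit(query: str) -> str:
    """Return the query with an added clause that filters out implicit folders."""
    query = query.strip()
    if not query:
        return "implicit:false"
    # Parenthesize the original query so the added clause applies to all of it.
    return f"({query}) AND implicit:false"


def encode_query(query: str) -> str:
    """Percent-encode a query string so it can be placed in a URL parameter."""
    return quote(query, safe="")


# Hypothetical starting query; substitute the search you actually use.
search = exclude_implicit("sourceType:s3")
print(search)
print(encode_query(search))
```

Parenthesizing the original query keeps any OR clauses it contains from bypassing the implicit:false filter.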
Inconsistencies in AWS can cause Navigator extraction to stop
Inconsistencies that occur in AWS (for example, due to eventual consistency) can delay Navigator extraction of S3 data. When Navigator detects an inconsistency, extraction may stop until the inconsistency is resolved in AWS. Navigator will retry at the next scheduled extraction.
Counts displayed in Dashboard and Search may differ
Counts for databases, tables, views, and other entities displayed in Cloudera Navigator Dashboard can differ from Search values for the same objects. The Dashboard displays data from the Audit Server, which contains actual counts captured continuously. Search (Solr), on the other hand, returns data that has been periodically extracted from HMS (Hive Metastore Server). Tables, databases, views, or other objects created or destroyed between extracts are not reflected in values displayed by Search (Solr), but they are contained in the audit data displayed in the Dashboard.
Tables Populated count reflects INSERT and UPDATE statements
The Tables Populated display widget in the Data Stewardship Dashboard reflects the number of times that a table has been loaded with data, such as through INSERT and UPDATE statements—not the number of unique tables loaded. For example, a single table to which data is added (through 6 INSERT statements) and that has also had 4 UPDATE statements submitted in the same period would report Tables Populated as 10.
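The difference between the two counts can be sketched in a few lines. The statement list below is hypothetical sample data that mirrors the example above; it is not output from Navigator, and the function names are illustrative only.

```python
# Hypothetical statement log for one reporting period: (operation, table) pairs.
statements = [("INSERT", "sales")] * 6 + [("UPDATE", "sales")] * 4

LOAD_OPERATIONS = {"INSERT", "UPDATE"}


def tables_populated(stmts):
    """Count every load statement, which is how the widget is described as counting."""
    return sum(1 for op, _ in stmts if op in LOAD_OPERATIONS)


def unique_tables_loaded(stmts):
    """Count distinct tables that received at least one load statement."""
    return len({table for op, table in stmts if op in LOAD_OPERATIONS})


print(tables_populated(statements))      # 10
print(unique_tables_loaded(statements))  # 1
```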
Hive extractor limitations
The Hive extractor does not support all Hive statements. Specifically, it does not support the following:
- Table-generating functions
- Lateral views
- Transform clauses
- Regular expressions (regex) in SELECT clause
Hive queries that include any of the above will prevent lineage diagrams from completing successfully.
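To spot queries that are likely to be affected, a simple text scan can flag the constructs listed above before lineage is expected. The following is a heuristic sketch only: it is not how the Hive extractor parses queries, the patterns are assumptions that can produce false positives or miss cases, and the list of table-generating functions covers only a few common Hive built-ins.

```python
import re

# Heuristic patterns (assumptions, not the extractor's own rules).
CHECKS = [
    ("table-generating function",
     re.compile(r"\b(explode|posexplode|inline|stack|json_tuple|parse_url_tuple)\s*\(", re.I)),
    ("lateral view", re.compile(r"\bLATERAL\s+VIEW\b", re.I)),
    ("transform clause", re.compile(r"\bTRANSFORM\s*\(", re.I)),
    # A backtick-quoted column specification that contains regex metacharacters.
    ("regex column specification", re.compile(r"`[^`]*[*+?|()\[\]\\^$][^`]*`")),
]


def unsupported_constructs(query: str) -> list:
    """Return the names of unsupported constructs that appear to be in the query."""
    return [name for name, pattern in CHECKS if pattern.search(query)]


print(unsupported_constructs("SELECT a FROM t LATERAL VIEW explode(arr) x AS a"))
print(unsupported_constructs("SELECT `(ds|hr)?+.+` FROM sales"))
print(unsupported_constructs("SELECT id FROM t"))
```

An empty result does not guarantee that lineage will complete; it only means that none of these text patterns matched.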
Hive service configuration in auditing component
For Hive services, the auditing component does not support the "Shutdown" option for the "Queue Policy" property.
Hue service audit log and Unknown IP address
The IP address in a Hue service audit log displays as "unknown".
Hive and Hue services audit re-configurations require restart
Whenever a change is made to a Hive service's audit configuration, Beeswax must be restarted so that the Hue service audit log can reflect the change.
Navigator Metadata Server
Metadata Server log file and spurious error messages
The Metadata Server log file may contain spurious error messages such as the following:
Error: [main]: PWC6351: In TLD scanning, the supplied resource file:/usr/share/java/oracle-connector-java.jar does not exist.
Policy specifications and cluster names affect purge
Policies cannot use cluster names in queries. Cluster name is a derived attribute and cannot be used as-is.
Workaround: When setting move actions for Cloudera Navigator, if there is only one cluster known to the Navigator instance, remove the clusterName clause.
To look up a sourceId, list the source entities:
curl '<nav-url>/api/v9/entities/?query=type%3Asource&limit=100&offset=0'
Use the identity of the matching HDFS service for this cluster as the sourceId.
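Picking the right identity out of the source listing returned by the query above can be sketched as follows. The sketch assumes the response is a JSON array of entities that carry identity, sourceType, and clusterName fields; only identity is named in this document, so treat the other two field names and all sample values as assumptions to verify against your own API output.

```python
import json


def find_hdfs_source_id(sources, cluster_name):
    """Return the identity of the HDFS source for the given cluster, or None."""
    for entity in sources:
        # "sourceType" and "clusterName" are assumed field names.
        if entity.get("sourceType") == "HDFS" and entity.get("clusterName") == cluster_name:
            return entity.get("identity")
    return None


# Hypothetical sample response, for illustration only.
sample_response = json.loads("""
[
  {"identity": "5", "sourceType": "HIVE", "clusterName": "Cluster 1", "type": "SOURCE"},
  {"identity": "7", "sourceType": "HDFS", "clusterName": "Cluster 1", "type": "SOURCE"},
  {"identity": "9", "sourceType": "HDFS", "clusterName": "Cluster 2", "type": "SOURCE"}
]
""")

print(find_hdfs_source_id(sample_response, "Cluster 1"))  # 7
```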
Purge appears suspended while extraction is running
When you issue a purge command, it does not start if extractors are running. During this time, the maintenance page indicates that maintenance tasks are not running. Once the extraction is complete and purge starts, it shows the status of the purge operations.
Upgrade Limitation (Navigator 1.2 to Navigator 2)
- Delete the Navigator Metadata Server role.
- Remove the content of the Navigator Metadata Server storage directory.
- Add the Navigator Metadata Server role as detailed in Adding the Navigator Metadata Server.
- Clear the cache for any browsers that connected to the 1.2 release of Navigator Metadata Server, to avoid errors caused by stale data.
Spark Lineage Limitations and Requirements
- Lineage is produced only for data that is read, written, and processed using the DataFrame and Spark SQL APIs. Lineage is not available for data that is read, written, or processed using Spark's RDD APIs.
- Lineage information is not produced for calls to aggregation functions such as groupBy().
- The default lineage directory for Spark on YARN is /var/log/spark/lineage. No process or user should write files to this directory, because doing so can cause agent failures. In addition, changing the Spark on YARN lineage directory has no effect: the default remains /var/log/spark/lineage.
Spark extractor enabled using safety valve deprecated
The Spark extractor that was included prior to CDH 5.11 and enabled by setting the safety valve nav.spark.extraction.enable=true is deprecated and could be removed completely in a future release. If you are upgrading to CDH 5.11 from a deployment that had configured this safety valve, be sure to remove the setting when you upgrade.