Known Issues in Cloudera Navigator 6.0.0

The following sections describe known issues in the data management components of Cloudera Navigator 6.0.0:

Authentication and Authorization

Cloudera Manager Configuration

Overriding safety valve settings disables audit and lineage features

Customers or third party applications such as Unravel may require that hive.exec.post.hooks is configured in a HiveServer2 safety valve. Cloudera Manager will comment out the hive.exec.post.hooks value that is configured if audit or lineage is enabled for Hive. The safety valve content shows the commented code:

<!--'hive.exec.post.hooks', originally set to
'com.cloudera.navigator.audit.hive.HiveExecHookContext,org.apache.hadoop.hive.ql.hooks.LineageLogger'
(non-final), is overridden below by a safety valve-->

This automated change disables Navigator's auditing and lineage features without notification.

At this time, there is no workaround.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-5331

Cloudera Manager audit events case-sensitive when using PostgreSQL

When Cloudera Manager and Navigator Audit Server are installed using PostgreSQL databases, the behavior of queries run from the Navigator console is different between the two databases. The result is that Cloudera Manager events are returned only if the query values match the case of the event values as they are stored in the Cloudera Manager database.

For example, the Hive operation "HiveReplicationCommand" is audited by Cloudera Manager; the audit log shows the command as HIVEREPLICATIONCOMMAND but querying with upper case fails to return the corresponding audit events. However, querying as operation = HiveReplicationCommand does return results.

Audit events other than those for Cloudera Manager are not affected.

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1

Fixed Versions: 6.1.0

Cloudera Issue: NAV-6141, NAV-6795

Hive, Hue, Impala

ClassCastException in Navigator Metadata Server log

When a Hive view is created, then dropped, and then subsequently recreated as a table with the same name as that of the original view, the Hive extraction process shows this exception in Navigator Metadata Server logs:

java.lang.ClassCastException: com.cloudera.nav.hive.model.HView cannot be cast to com.cloudera.nav.hive.model.HTable

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1

Fixed Versions: Cloudera Navigator 6.1.0

Cloudera Issue: NAV-5939

ConsoleAppender error in Hive Server 2 log

After running a query in Hive, the HiveServer2 stderr log includes the following error:

log4j:ERROR A "org.apache.log4j.ConsoleAppender" object is not assignable to a "com.cloudera.navigator.shaded.log4j.Appender" variable.
log4j:ERROR The class "com.cloudera.navigator.shaded.log4j.Appender" was loaded by
log4j:ERROR [sun.misc.Launcher$AppClassLoader@47d384ee] whereas object of type
log4j:ERROR "org.apache.logj4.ConsoleAppender" was loaded by [sun.misc.Launcher$AppClassLoader@47d384ee].
log4j:ERROR Could not instantiate appender named "out".
      

This problem occurs because the Navigator Audit Server plugin for Hive Server 2 expects a different log4j integration than what HiveServer 2 is using. The result is that no Navigator Audit messages appear in the Hive Server 2 logs. Hive Server 2 auditing is not affected by this problem.

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1

Fixed Versions: Cloudera Navigator 6.1.0

Cloudera Issue: NAV-6523

Overriding safety valve settings disables audit and lineage features

Customers or third party applications such as Unravel may require that hive.exec.post.hooks is configured in a HiveServer2 safety valve. Cloudera Manager will comment out the hive.exec.post.hooks value that is configured if audit or lineage is enabled for Hive. The safety valve content shows the commented code:

<!--'hive.exec.post.hooks', originally set to
'com.cloudera.navigator.audit.hive.HiveExecHookContext,org.apache.hadoop.hive.ql.hooks.LineageLogger'
(non-final), is overridden below by a safety valve-->

This automated change disables Navigator's auditing and lineage features without notification.

Affected Versions: Cloudera Navigator 6.0.0 and later

Workaround: To fix this problem, manually merge the original HiveServer2 safety valve content for hive.exec.post.hooks with the new value. For example, in the case of Unravel, the new safety valve would look like the following:

<property>
  <name>hive.exec.post.hooks</name>
  <value>com.unraveldata.dataflow.hive.hook.HivePostHook,com.cloudera.navigator.audit.hive.HiveExecHookContext,org.apache.hadoop.hive.ql.hooks.LineageLogger</value>
  <description>for Unravel, from unraveldata.com</description>
</property>

Cloudera Issue: NAV-5331

Impala and Hive audit events fail to be captured when one audit event includes 4-byte characters, such as an emoji

This problem applies when the Navigator Audit Server database is a MySQL database configured to use the "UTF8" character set.

When a query includes an emoji or other Unicode supplementary-plane character that is encoded as four bytes in UTF-8, Navigator Audit Server fails to process the event and any following events from the same service.

Workaround: You can resolve this problem by configuring MySQL v5.5 or later to use the "UTF8MB4" character set. The error is described in this Stack Overflow article:

https://stackoverflow.com/questions/13653712/java-sql-sqlexception-incorrect-string-value-xf0-x9f-x91-xbd-xf0-x9f

The solution is described in the MySQL documentation topic The UTF8MB4 Character Set (4-Byte UTF-8 Unicode Encoding). Changing the character set requires restarting the MySQL server. It doesn't affect the Navigator Audit Server data.

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1

Fixed Versions: 6.1.0

Cloudera Issue: NAV-4845

Viewing Navigator tags in Hue overloads Metadata Server heap

When viewing Cloudera Navigator tags through Hue, Navigator uses more memory than usual and does not release the memory after logging out of Hue. Eventually, the calls between Hue and Navigator will occupy the majority of the heap space allocated to Navigator Metadata Server.

Workaround: Restart the Navigator Metadata Server periodically to clear the heap usage.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-4326

Lineage not generated for Pig operations on Hive tables using HCatalog loader

When accessing a Hive table using Pig, lineage is generated in Navigator when using physical file loads, such as:

A = LOAD '/user/hive/warehouse/navigator_demo.db/salesdata';
B = LIMIT A 16;
STORE B INTO '/user/hive/warehouse/navigator_demo.db/salesdata_sample_file' using PigStorage (';');

However, when accessing the Hive table using the HCatalog load, lineage for the Pig operation is not generated when browsing the source table lineage. Such as:

A = LOAD 'navigator_demo.salesdata' using org.apache.hive.hcatalog.pig.HCatLoader();
B = LIMIT A 16;
STORE B INTO 'navigator_demo.salesdata_sample_hcatalog' using org.apache.hive.hcatalog.pig.HCatStorer();

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-3411

HiveServer1 and Hive CLI support removed

Cloudera Navigator requires HiveServer2 for complete governance Hive queries. Cloudera Navigator does not capture audit events for queries that are run on HiveServer1/Hive CLI, and lineage is not captured for certain types of operations that are run on HiveServer1.

If you use Cloudera Navigator to capture auditing, lineage, and metadata for Hive operations, upgrade to HiveServer2 if you have not done so already.

Affected Versions: Cloudera Navigator 6.x

Fixed Versions: N/A

Cloudera Issue: TSB-185

Navigator Metadata Server

Navigator does not mark HDFS entities as deleted when in bulk extraction takes too long to complete

In large HDFS deployments, the fsimage takes a long time to index. When an HDFS checkpoint occurs it creates a new fsimage. However if the previous fsimage is still in the process of being indexed, Navigator cannot use the incremental changes found in the inotify stream because it refers to the newly created fsimage.

When this happens, Navigator attempts to start indexing the newer fsimage, creating a loop where Navigator can never take advantage of the more efficient change processing through inotify. The immediate fallout of this delay is that no HDFS entities deleted in the cluster will be marked as deleted in Navigator.

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1

Fixed Versions: Cloudera Navigator 6.1.0

Cloudera Issue: NAV-6456

International characters not supported in tags or property names

Navigator tags and the key portion of user-defined and managed properties do not support UNICODE characters beyond ASCII. Only ASCII text can be used in the text of a tag or the name of a property. Property values can include international characters.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: Tags are fixed in Cloudera Navigator 6.2.0

Cloudera Issue: NAV-7011, NAV-7044

Navigator Embedded Solr can reach its limit on number of documents it can store

Navigator Metadata Server extracts HDFS entities by performing a one-time bulk extraction and then switching to incremental extraction. In Cloudera Manager releases 5.10.0, 5.10.1 and 5.11.0 (Navigator releases 2.9.0, 2.9.1, and 2.10.0), a problem causes HDFS bulk extraction to be run more than one time, resulting in duplicate relations created for HDFS. Over time, embedded Solr runs out of document IDs that it can assign to new relations and fails with following error:

"Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483519"       

When this happens, Navigator stops any more extraction of data as no new documents can be added to Solr.

After upgrading to this release, there is an additional recover step as described in "Repairing metadata in the storage directory after upgrading" in Troubleshooting Navigator Data Management.

Affected Versions: Versions prior to Cloudera Manager 5.10 upgraded to Cloudera Manager 5.10 or higher

Fixed Versions: N/A

Cloudera Issue: NAV-5600

Log includes the error "EndPoint1 must not be null"

The following error may appear in the Navigator Metadata Server log in systems upgraded from Cloudera Manager version 5.x:

2017-10-17 13:00:23,007 ERROR com.cloudera.nav.hive.extractor.AbstractHiveExtractor [CDHExecutor-0-CDHUrlClassLoader@14784b7b]: Unable to parse hive view query *: EndPoint1 must not be null or empty
java.lang.IllegalStateException: EndPoint1 must not be null or empty

This error occurs because the Hive pull extraction for creating a Hive view produces an incorrect lineage relationship for the Hive view. However, Navigator also receives information for the view creation through the push extractor, which correctly produces the lineage relation. You can safely ignore this error.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-4224

Purge

Navigator Metadata Server purge jobs may not run if there are policies configured

Navigator Metadata Server purge can produce messages such as "Checking if maintenance is running" and then fail to run during the available window. This problem occurs when a scheduled purge job waits for extraction tasks to finish but while waiting, a policy job starts, preventing the purge job from running.

Workaround: Specifically, to stop policies from running when you are trying to ensure that purge jobs will run, you can delete policies or temporarily change them to run infrequently to give the purge job time to run. When purge jobs have caught up to their backlog of work, you can change the policies back to running more frequently. Note that simply disabling the policies is not sufficient.

More generally, if you find that scheduled purge jobs are not running because there are other Navigator tasks in progress, consider stopping Navigator extractors and setting policies to run much less frequently. Then manually run the metadata purge using an API call to match your Navigator version:

curl -X POST -u user:password "https://navigator_host:7187/api/vXX/maintenance/purge?deleteTimeThresholdMinutes=duration"

API versions correspond to Navigator versions as described Mapping API Versions to Product Versions.

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1, 6.1.0, 6.1.1

Fixed Versions: Cloudera Navigator 6.2.0

Cloudera Issue: NAV-7037

Policy specifications and cluster names affect purge

Policies cannot use cluster names in queries. Cluster name is a derived attribute and cannot be used as-is.

Workaround: When setting move actions for Cloudera Navigator, if there is only one cluster known to the Navigator instance, remove the clusterName clause.

If there is more than one cluster known to the Navigator instance, replace clusterName with sourceId. To get the sourceId, issue a query in this format:
curl '<nav-url>/api/v9/entities/?query=type%3Asource&limit=100&offset=0'
Use the identity of the matching HDFS service for this cluster as the sourceId.

Affected Versions: Cloudera Navigator 6.0.0 and later

Fixed Versions: N/A

Cloudera Issue: NAV-3537

Spark

Spark Lineage Limitations and Requirements

Spark lineage diagrams are supported in the Cloudera Navigator 6.0 release. Spark lineage is supported for Spark 1.6 and Spark 2.3. Lineage is not available for Spark when Cloudera Manager is running in single user mode. In addition to these requirements, Spark lineage has the following limitations:
  • Lineage is produced only for data that is read/written and processed using the Dataframe and SparkSQL APIs. Lineage is not available for data that is read/written or processed using Spark's RDD APIs.
  • Lineage information is not produced for calls to aggregation functions such as groupBy().
  • The default lineage directory for Spark on Yarn is /var/log/spark/lineage. No process or user should write files to this directory—doing so can cause agent failures. In addition, changing the Spark on Yarn lineage directory has no effect: the default remains /var/log/spark/lineage.

Navigator doesn't recognize local files in Spark jobs

Spark jobs can use files on the local filesystem as job inputs or outputs. Navigator, however, only supports HFDS, Hive, and S3 assets as job inputs or outputs. When Navigator extracts metadata from Spark and encounters a local source type, the metadata is discarded and the following error appears in the Navigator Metadata Server log:

2018-10-11 12:14:26,192 WARN com.cloudera.nav.api.ApiExceptionMapper [qtp1574898980-23815]: Unexpected exception.
          java.lang.RuntimeException: Source LOCAL isn't supported for Spark Lineage

Affected Versions: Cloudera Navigator 6.0.0, 6.0.1, 6.1.0, 6.1.1

Fixed Versions: Cloudera Navigator 6.2.0

Cloudera Issue: NAV-6811

Spark extractor enabled using safety valve deprecated

The Spark extractor included prior to CDH 5.11 and enabled by setting the safety valve, nav.spark.extraction.enable=true is being deprecated, and could be removed completely in a future release. If you are upgrading from CDH 5.10 or earlier and were using the extractor configured with this safety valve, be sure to remove the setting when you upgrade.

Upgrade Issues and Limitations