How is the metadata behind lineage diagrams captured and displayed in the Cloudera Navigator application? I notice the API documentation lists MapReduce, YARN, Sqoop, Oozie, and Spark operations. If data in a Hadoop cluster (Hive/HDFS datasets) is manipulated or transformed using 3rd-party tools, how can those transformation operations be captured? Is there a way to include operations from 3rd-party tools? For example, if Talend or some other ETL tool takes Hive/HDFS files and generates Hive/HDFS tables, can the Talend operations be included as part of the lineage?
Hi there, Navigator cannot automatically infer whether a Hadoop operation was launched by a 3rd-party application, but a number of our partners use the Navigator SDK to explicitly publish custom metadata into Navigator and connect it with Hadoop datasets and operations. For applications that do this, you'll be able to see the 3rd-party transformations in Navigator. You can also do this for your own custom operations (see the examples in the SDK for how to do this).
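To make the idea concrete, here is a rough conceptual sketch of what "publishing a custom operation and connecting it to datasets" amounts to. It is not the Navigator SDK API: the class names (`Dataset`, `CustomOperation`), the identifier strings, and the `lineage_edges` helper are all made up for illustration, and the real SDK has its own classes and examples. It only shows the shape of the metadata: an operation entity with links to its input and output datasets.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Dataset:
    """A Hive table or HDFS path already tracked in the cluster (hypothetical model)."""
    identity: str  # hypothetical identifier format, e.g. "hive:sales.clean_orders"


@dataclass
class CustomOperation:
    """A transformation run by a 3rd-party tool (hypothetical model, not an SDK class)."""
    name: str
    tool: str
    inputs: List[Dataset] = field(default_factory=list)
    outputs: List[Dataset] = field(default_factory=list)

    def lineage_edges(self) -> List[Tuple[str, str]]:
        """Return (source, target) pairs: each input -> operation, operation -> each output."""
        op_id = f"{self.tool}:{self.name}"
        edges = [(d.identity, op_id) for d in self.inputs]
        edges += [(op_id, d.identity) for d in self.outputs]
        return edges


def build_example() -> CustomOperation:
    """Describe an assumed Talend job that reads an HDFS path and writes a Hive table."""
    raw = Dataset("hdfs:/data/raw/orders")
    clean = Dataset("hive:sales.clean_orders")
    return CustomOperation("clean_orders_job", "talend", inputs=[raw], outputs=[clean])


if __name__ == "__main__":
    for source, target in build_example().lineage_edges():
        print(f"{source} -> {target}")
```

In this model the 3rd-party tool (or a wrapper around it) is responsible for declaring its own inputs and outputs, which matches the point above: the relationship has to be published explicitly, since it is not inferred automatically.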