Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Data Lineage Graph with hive views

The data lineage graph that Apache Atlas generates when Hive views are involved raises a question. The Hive process that creates a view is linked not only to the views named in the query, but also to the views those views were built from, and so on recursively until the underlying tables are reached.

I would like to know whether this behavior is part of Apache Atlas's philosophy for presenting lineage graphs that contain Hive views, or whether it is an untested case that should be fixed.

1 ACCEPTED SOLUTION

@Radhouene EL HADJ EL ARBI

This is by design. Atlas, and most governance tools in general, will trace lineage as far back as possible. With Atlas, not only will it go back to the root table(s), it can even go as far back as the Storm or Sqoop job that ingested the data to the original tables.

The purpose of having lineage this far back is to let a user trace the origins of the data, whether to validate data quality, for compliance, or just to understand how the data has mutated or evolved into its current state.
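
To illustrate the idea of tracing lineage "as far back as possible", here is a minimal Python sketch of a recursive upstream walk over a toy graph. It is not Atlas code and does not use the Atlas API; the entity names (`view_c`, `table_t`, `sqoop_import_job`) and the `trace_upstream` helper are hypothetical, and real Atlas lineage also places process entities between the data entities, which this sketch leaves out for brevity.

```python
from collections import deque

# Hypothetical toy lineage: each entity maps to the entities it was derived from.
# These names are illustrative only and do not come from Atlas.
UPSTREAM = {
    "view_c": ["view_b"],           # view_c is defined on top of view_b
    "view_b": ["view_a"],           # view_b is defined on top of view_a
    "view_a": ["table_t"],          # view_a reads the root table
    "table_t": ["sqoop_import_job"],  # the table was loaded by an ingest job
    "sqoop_import_job": [],         # nothing known upstream of the ingest job
}


def trace_upstream(entity, upstream):
    """Return every ancestor of `entity`, nearest first (breadth-first walk)."""
    seen = []
    queue = deque(upstream.get(entity, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # skip entities already reached through another path
        seen.append(node)
        queue.extend(upstream.get(node, []))
    return seen


print(trace_upstream("view_c", UPSTREAM))
# prints ['view_b', 'view_a', 'table_t', 'sqoop_import_job']
```

Starting from the newest view, the walk reaches the intermediate views, then the root table, then the ingest job, which matches the shape of the graph described in the question.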

