Data lineage for HDFS files / Hook Atlas for HDFS

Contributor

Hello,

I use HDP 2.6 and would like to create data lineage for HDFS files with Apache Atlas. Ideally, Atlas would capture this lineage itself, probably via an HDFS hook, but the current version does not seem to have one.

So I would like to ask for some advice:

- Do I have to create HDFS entities to manage this lineage?

- Do I have to create a Hive external table to get this lineage?

Thank you for your help.

2 REPLIES
Re: Data lineage for HDFS files / Hook Atlas for HDFS

Expert Contributor

You are right about this. There isn't an HDFS hook.

Right now, the option available is to create entities from within the Atlas Web UI.

Steps:

  1. Select 'create new entity'.
  2. In the 'Create entity' pop-up, choose 'hdfs_path'.
  3. Fill in the details. In the path field, enter the HDFS path that this entity should represent.

This entity will be available via search.
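If you have many paths, the same entity can probably be created through the Atlas REST API instead of the UI. Below is a minimal sketch, assuming the v2 entity endpoint (`/api/atlas/v2/entity`) is available on your Atlas version and that basic authentication is enabled. The host, credentials, NameNode URI, and the `<full path>@<cluster>` naming of `qualifiedName` are assumptions for illustration, not values from your cluster.

```python
import base64
import json
import urllib.request


def build_hdfs_path_entity(path, cluster_name, name_node_uri):
    """Build the JSON body for one hdfs_path entity (Atlas v2 entity format)."""
    full_path = name_node_uri.rstrip("/") + path
    return {
        "entity": {
            "typeName": "hdfs_path",
            "attributes": {
                # Assumed naming convention: "<full path>@<cluster>".
                "qualifiedName": f"{full_path}@{cluster_name}",
                "name": path,
                "path": full_path,
                "clusterName": cluster_name,
            },
        }
    }


def post_entity(atlas_url, user, password, payload):
    """POST the payload to Atlas and return the parsed JSON response."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        atlas_url.rstrip("/") + "/api/atlas/v2/entity",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, build a payload with something like `build_hdfs_path_entity("/data/raw/sales", "mycluster", "hdfs://namenode:8020")` and pass it to `post_entity` together with your own Atlas URL and credentials.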

Lineage is tracked for Hive entities.
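For plain HDFS files, one possible workaround is to register the lineage yourself. Atlas derives lineage from entities of the base `Process` type, which carry `inputs` and `outputs` references to data sets. The sketch below only builds such a payload for the v2 entity format; the job name, cluster name, and qualified names are hypothetical, and it assumes the referenced hdfs_path entities already exist under those qualified names.

```python
import json


def hdfs_ref(qualified_name):
    """Reference an existing hdfs_path entity by its unique qualifiedName."""
    return {
        "typeName": "hdfs_path",
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_process_entity(name, cluster_name, input_qns, output_qns):
    """Build a Process entity linking input hdfs_path entities to outputs."""
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                # Assumed naming convention: "<job name>@<cluster>".
                "qualifiedName": f"{name}@{cluster_name}",
                "name": name,
                "inputs": [hdfs_ref(qn) for qn in input_qns],
                "outputs": [hdfs_ref(qn) for qn in output_qns],
            },
        }
    }


# Hypothetical copy job from a raw directory to a cleaned directory.
payload = build_process_entity(
    "clean_sales_job",
    "mycluster",
    ["hdfs://namenode:8020/data/raw/sales@mycluster"],
    ["hdfs://namenode:8020/data/clean/sales@mycluster"],
)
print(json.dumps(payload, indent=2))
```

Once a payload like this is posted to the Atlas entity endpoint, the lineage view of either hdfs_path entity should show the process between them, provided your Atlas version accepts references by `uniqueAttributes`.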

Hope this helps.

Re: Data lineage for HDFS files / Hook Atlas for HDFS

Contributor

@Ashutosh Mestry

Thank you for your reply. Perhaps it would be worthwhile to develop a dedicated bridge for HDFS?
