Data lineage for HDFS files / Hook Atlas for HDFS

Contributor

Hello,

I use HDP 2.6 and would like to build data lineage for HDFS files with Apache Atlas. Ideally, Atlas would capture this lineage through an HDFS hook, but the current version does not seem to provide one.

So I am asking for some advice:

- Do I have to create HDFS entities myself to manage this lineage?

- Do I have to create a Hive external table to get this lineage?

Thank you for your help.

2 REPLIES

Expert Contributor

You are right about this. There isn't an HDFS hook.

Right now, the option available is to create entities from within the Atlas Web UI.

Steps:

  1. Select 'create new entity'.
  2. In the 'Create entity' pop-up, choose 'hdfs_path'.
  3. Fill in the details. In the path field, enter the path that this entity should represent.

This entity will be available via search.
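If you have many paths, the same entity can probably be created through the Atlas REST API instead of the UI. The following is only a minimal sketch, not a tested recipe for your cluster: it assumes the v2 entity endpoint (`/api/atlas/v2/entity`) is reachable, that the built-in `hdfs_path` type is used, and that `<path>@<cluster>` is an acceptable `qualifiedName` convention. The helper names, the cluster name and the authorization header are hypothetical.

```python
import json
import urllib.request


def build_hdfs_path_entity(path, cluster_name):
    """Build the JSON payload for one hdfs_path entity.

    The '<path>@<cluster>' qualifiedName is an assumed convention;
    any value that is unique within Atlas should work.
    """
    short_name = path.rstrip("/").split("/")[-1] or "/"
    return {
        "entity": {
            "typeName": "hdfs_path",
            "attributes": {
                "qualifiedName": f"{path}@{cluster_name}",
                "name": short_name,
                "path": path,
                "clusterName": cluster_name,
            },
        }
    }


def post_entity(atlas_url, payload, auth_header):
    """POST the payload to Atlas. atlas_url and auth_header are placeholders."""
    request = urllib.request.Request(
        f"{atlas_url}/api/atlas/v2/entity",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the payload; call post_entity() yourself once the
    # URL and credentials for your Atlas server are known.
    payload = build_hdfs_path_entity("/data/raw/events", "mycluster")
    print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it makes it easy to inspect what would be created before anything is written to Atlas.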

Lineage is tracked for Hive entities.
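For files that never pass through Hive, one possible workaround is to register the lineage yourself: Atlas derives lineage from entities of type `Process` (or a subtype) whose `inputs` and `outputs` point at data set entities. The sketch below only builds such a payload for the v2 entity endpoint. It assumes the referenced `hdfs_path` entities already exist, and the process name, the qualified names and the helper functions are made up for illustration.

```python
import json


def object_id(type_name, qualified_name):
    """Reference an existing entity by its unique qualifiedName."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_process_entity(name, qualified_name, input_qns, output_qns):
    """Build a generic Process entity linking hdfs_path inputs to outputs."""
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "qualifiedName": qualified_name,
                "name": name,
                "inputs": [object_id("hdfs_path", q) for q in input_qns],
                "outputs": [object_id("hdfs_path", q) for q in output_qns],
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical job that copies raw files into a cleaned directory.
    payload = build_process_entity(
        name="clean_events",
        qualified_name="clean_events@mycluster",
        input_qns=["/data/raw/events@mycluster"],
        output_qns=["/data/clean/events@mycluster"],
    )
    print(json.dumps(payload, indent=2))
```

Once such a process entity is posted, the lineage view of either path should show the other one, but you would have to create and maintain these entities from your own jobs.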

Hope this helps.

Contributor

@Ashutosh Mestry

Thank you for your reply. Would it perhaps be worthwhile to develop a dedicated bridge for HDFS?