How to create a policy for a file in HDFS


I know that Hortonworks provides a data governance solution that integrates Apache Atlas and Apache Ranger to create policies.

I would like to use Atlas and Ranger to apply data governance to my data lake. I found this article useful:

To classify the data, I should create a tag and then add it to an entity. But I think this only works for Hive, because Atlas currently provides metadata services for the following components:




Storm/Kafka (limited support)

Falcon (limited support)

And HDFS is not officially part of the current Atlas roadmap.

Is this information true for HDP 2.6.3? If not, can someone give me a suggestion or point me to an article on how to create HDFS policies via Atlas and Ranger?





As of now, Atlas doesn't support an HDFS hook (HDFS file lineage); it may be implemented in a future release.


Exactly, but I found the article below, which says that currently there is no Atlas hook for HBase, HDFS, or Kafka. For these components, you must manually create entities in Atlas. You can then associate tags with these entities and control access using Ranger tag-based policies. So can we create an entity and then add a tag to it?

@SMACH H You are right: we create an entity and then add a tag to it, as mentioned in the doc.
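
For anyone who wants to script this, here is a minimal sketch of the two steps (create the entity manually, then attach the tag) against the Atlas v2 REST API. Treat it as an illustration under assumptions, not a tested recipe for your cluster: the host name, credentials, cluster name, file path, and the `PII` tag are all hypothetical, the `path@cluster` form of `qualifiedName` is only the convention I have seen used, and the tag must already exist as a classification type in Atlas. Please verify the `hdfs_path` attributes and the endpoints against the Atlas version shipped with your HDP release.

```python
import base64
import json
import urllib.request

# Hypothetical Atlas address; 21000 is the usual default HTTP port.
ATLAS_URL = "http://atlas-host:21000"


def hdfs_path_entity(path, cluster):
    """Build the request body for a manually created hdfs_path entity."""
    return {
        "entity": {
            "typeName": "hdfs_path",
            "attributes": {
                # Assumed convention: <path>@<cluster name>
                "qualifiedName": f"{path}@{cluster}",
                "name": path.rstrip("/").split("/")[-1],
                "path": path,
                "clusterName": cluster,
            },
        }
    }


def classification_body(tag):
    """Build the request body that attaches one existing tag to an entity."""
    return [{"typeName": tag}]


def entity_url():
    return f"{ATLAS_URL}/api/atlas/v2/entity"


def classification_url(guid):
    return f"{ATLAS_URL}/api/atlas/v2/entity/guid/{guid}/classifications"


def first_guid(response):
    """Return the first GUID listed under mutatedEntities, or None.

    Assumption: the create call answers with a mutatedEntities map of
    operation name to a list of entity headers.
    """
    for headers in response.get("mutatedEntities", {}).values():
        for header in headers:
            if "guid" in header:
                return header["guid"]
    return None


def post_json(url, body, user, password):
    """POST a JSON body with HTTP basic auth; return parsed JSON or None."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        return json.loads(raw) if raw else None


def tag_hdfs_file(path, cluster, tag, user, password):
    """Create the entity, then attach the tag to the GUID Atlas reports."""
    created = post_json(entity_url(), hdfs_path_entity(path, cluster), user, password)
    guid = first_guid(created or {})
    if guid is None:
        raise RuntimeError("Atlas did not report a GUID for the entity")
    post_json(classification_url(guid), classification_body(tag), user, password)
    return guid
```

A call would look like `tag_hdfs_file("/data/raw/customers.csv", "mycluster", "PII", "admin", "admin")`, where every argument is a placeholder. Once the tag is on the entity and Ranger tag sync has picked it up, a tag-based policy on that tag in Ranger should then govern access to the file.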