Hi all, I wanted to get your input on which tools (preferably open source) can tag data in HDFS. From what I understand, Atlas is not able to do that. I have heard about Waterline; are there any others that you know of?
Waterline has a feature called the HDFS crawler, which uses an algorithm to tag data. Attivio is another tool that can tag data, based on a data mart concept. Both tools are best in class, in my opinion.
Here is the link to a short tutorial on Waterline using an HDP Sandbox, created jointly by Hortonworks and Waterline:
Manage your Data Lake more efficiently with Waterline Data Inventory and HDP