Hi all. I am working on a Hadoop cluster that needs some form of data governance (for my purposes, just data lineage/provenance to track the history of files in HDFS). The cluster uses MapR's Hadoop implementation and as such is incompatible (https://community.mapr.com/thread/22983-using-apache-atlas-on-mapr) with the open-source data governance tool Apache Atlas (https://atlas.apache.org/). Are there any viable open-source alternatives for MapR, or for other clusters that cannot use Atlas for whatever reason? Are there any conventional/best-practice system designs that can be self-implemented to track file lineage? Thanks.