Data governance (mostly lineage) system designs or open-source alternatives for Hadoop without Apache Atlas?

Hi all. I am working on a Hadoop cluster that needs some form of data governance (for my purposes, just data lineage/provenance for tracking the history of files in HDFS). The cluster runs MapR's Hadoop implementation and as such is incompatible (https://community.mapr.com/thread/22983-using-apache-atlas-on-mapr) with the open-source data governance tool Apache Atlas (https://atlas.apache.org/). Are there any viable open-source alternatives for MapR, or for other clusters that cannot use Atlas for whatever reason? Are there any conventional or best-practice system designs that can be self-implemented to track file lineage? Thanks.
