Created on 06-23-2016 02:22 PM - edited 08-18-2019 05:34 AM
Question 1:
What are the practical applications of Atlas audit?
According to the official guide (http://atlas.apache.org/), the effect of audit is described as follows:
But this description seems abstract to me. I am wondering what the specific use cases of audit are.
Question 2:
How do I configure and use the audit feature? I cannot find any configuration information in the official guide.
Question 3:
I remember that the Atlas Web UI in older versions had an Audit tab that could be clicked in the browser, but I cannot find that tab in the Web UI of Atlas 0.7. Why?
Created 06-24-2016 08:19 PM
1. Audit in this case means the ability to see how a data set was first created and how it has been altered since landing on the cluster. Since Atlas 0.6, this includes the ability to track data coming through and from several Hadoop components.
Other components can use the Atlas REST and Java APIs to register lineage. Check out this repo for an example of an Apache NiFi reporting task that registers provenance with Atlas.
https://community.hortonworks.com/content/repo/39432/nifi-atlas-lineage-reporter.html
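To make the "register lineage via the REST API" idea concrete, here is a minimal sketch of the kind of entity payload such an integration would build: a Process entity linking an input data set to an output data set. The type and attribute names ("Process", "DataSet", "inputs", "outputs", "qualifiedName") follow common Atlas conventions, but the exact shape and endpoint differ across Atlas versions (the 0.7-era v1 API and the later v2 API are not the same), so verify them against your version's type definitions before using this.

```python
import json

# Hedged sketch: build an Atlas lineage (Process) entity payload linking an
# input data set to an output data set. Type/attribute names are assumptions
# based on common Atlas conventions; check them against your Atlas version.
def lineage_payload(process_name, input_qname, output_qname):
    return {
        "entity": {
            "typeName": "Process",
            "attributes": {
                "name": process_name,
                "qualifiedName": process_name + "@cluster",
                "inputs": [{"typeName": "DataSet",
                            "uniqueAttributes": {"qualifiedName": input_qname}}],
                "outputs": [{"typeName": "DataSet",
                             "uniqueAttributes": {"qualifiedName": output_qname}}],
            },
        }
    }

payload = lineage_payload("etl_step_1",
                          "default.src_table@cluster",
                          "default.dst_table@cluster")
print(json.dumps(payload, indent=2))
```

A NiFi reporting task like the one linked above does essentially this: it serializes provenance events into entity JSON and POSTs them to the Atlas entity endpoint.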
Once the components of a modern data application are integrated with Atlas, concepts like data governance and, when combined with Apache Ranger, tag-based security policies become possible.
2. Check out these labs to get an understanding of how to enable and navigate cross-component, data-set-level lineage.
http://hortonworks.com/hadoop-tutorial/cross-component-lineage-apache-atlas/
http://hortonworks.com/hadoop-tutorial/tag-based-policies-atlas-ranger/
3. You need to search for and find entities before you can assign tags to them. Try the labs linked above; along the way you should see the links to add tags, also referred to as traits, to the resulting entities.
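For reference, attaching a trait programmatically amounts to posting a small JSON body for the trait instance. The sketch below shows roughly what that body looked like in the v1-era API; the "jsonClass" value and field names come from the old typesystem serialization and are assumptions to verify against your Atlas version, not a definitive schema.

```python
import json

# Hedged sketch: rough v1-era payload for attaching a trait (tag) to an
# entity. Field names and the jsonClass value are assumptions; verify them
# against your Atlas version before use.
def trait_payload(trait_name, values=None):
    return {
        "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
        "typeName": trait_name,
        "values": values or {},
    }

body = json.dumps(trait_payload("PII"))
# This body would be POSTed to the entity's traits resource
# (e.g. /api/atlas/entities/{guid}/traits in the v1 API).
print(body)
```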
Created 06-27-2016 03:33 AM
I am trying to import metadata from Hive into Atlas. I created a table in the Hive CLI, ran {atlas_home}/bin/import-hive.sh, and successfully imported the metadata, but the Atlas Web UI showed that no lineage data was found.
In my opinion it should show the lineage between Hive and Atlas, but it shows nothing.
How can I get it to show lineage when I run {atlas_home}/bin/import-hive.sh?
Thank you very much.
Created 06-27-2016 12:11 PM
The entities don't necessarily have any lineage when they are first created. Hive tables that were not the source of some other table or registered data structure will show no lineage. Try creating a new table from the existing, already-registered table using a `CREATE TABLE --- AS SELECT --- FROM ---` statement. That creates another table from the existing Hive table, and the Hive hook should register that lineage. You should then be able to see lineage from the parent and child tables.
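Once the Hive hook has registered lineage for the new table, you can also inspect it over REST. The sketch below builds the request for the 0.7-era lineage resource; the path (/api/atlas/lineage/hive/table/{name}/inputs/graph), host, credentials, and table name are assumptions to check against your cluster.

```python
import base64
from urllib.parse import quote

# Hedged sketch: construct the REST call for inspecting a Hive table's
# lineage. The endpoint path follows the Atlas 0.7-era lineage resource and
# may differ in other versions; host, user, password, and table name are
# placeholders.
def lineage_url(host, table_name, direction="inputs"):
    assert direction in ("inputs", "outputs")
    return "http://{}/api/atlas/lineage/hive/table/{}/{}/graph".format(
        host, quote(table_name, safe=""), direction)

def basic_auth_header(user, password):
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    return {"Authorization": "Basic " + token}

url = lineage_url("atlas-host:21000", "default.employees_copy")
headers = basic_auth_header("admin", "admin")
# An HTTP GET on `url` with `headers` would return the lineage graph JSON.
print(url)
```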
Created 06-30-2016 03:19 AM
Thank you very much. It really helps me understand the meaning of lineage.
And I have two simple questions:
Q1: If I use the Sqoop bridge, does that mean I can transmit metadata from a DBMS (e.g., MySQL) directly to Atlas?
That is, it would not be necessary to use Sqoop to deliver the data from MySQL into Hive and then have the metadata flow from Hive into Atlas.
In a word: with the Sqoop bridge, can I deliver metadata from MySQL to Atlas without Hive?
Q2: When I create another table from the existing Hive table in the Hive CLI, it searches for the JAR files on an HDFS path, but those JAR files are located on the local file system, not in HDFS.
How can I change the path of the needed JAR files?
The details of this question are here:
I hope you can help me. Thank you very much.