By definition, Atlas provides scalable, core foundational services for data
governance, enabling enterprises to efficiently and effectively meet their
compliance requirements within their Hadoop ecosystem. However, it is a complex
application, built by integrating various components of the Hadoop ecosystem.
Below are the components involved in the architecture:
Atlas
Ranger
Hive
HBase
Ambari
Infra
Kafka
The intention of this article is to provide troubleshooting tips for when an
Atlas install done through Ambari is not functioning correctly.
Install validation and troubleshooting tips:
Make sure Atlas Metadata Server and Atlas
Metadata clients are installed from Ambari.
Install the Ranger Tagsync component if you wish to
control authorization using Atlas tags in Ranger.
Make sure the following topics have been created in
Kafka:
ATLAS_HOOK
ATLAS_ENTITIES
You can check this by listing the topics on any of the Kafka brokers. (In a
Kerberized cluster, you need the Kafka keytab to do this.)
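As a rough illustration of the check described above, the sketch below lists the topics and confirms that both Atlas topics are present. The kafka-topics.sh path, the ZooKeeper address zk1.example.com:2181, and the keytab location are assumptions based on a typical HDP layout, and check_atlas_topics is a hypothetical helper name. Adjust all of them for your cluster.

```shell
# Hypothetical helper: reads topic names on stdin (one per line) and
# reports whether the two Atlas topics are present.
check_atlas_topics() {
  topics=$(cat)
  missing=0
  for t in ATLAS_HOOK ATLAS_ENTITIES; do
    # -x matches the whole line, so ATLAS_HOOK_OLD would not count.
    if printf '%s\n' "$topics" | grep -qx "$t"; then
      echo "OK: $t"
    else
      echo "MISSING: $t"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage on a Kafka broker (paths and hosts are assumptions):
#   # Kerberized clusters only: authenticate with the Kafka keytab first
#   kinit -kt /etc/security/keytabs/kafka.service.keytab "kafka/$(hostname -f)"
#   /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list \
#     --zookeeper zk1.example.com:2181 | check_atlas_topics
```

The helper returns a non-zero exit status when either topic is missing, so it can also be used in a validation script.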
If the Kafka topics were not created, you can create them manually using the
attached atlas_kafka_acl.sh script. Just update the ZooKeeper quorum in the
script and run it on any Kafka broker.
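The attached script is not reproduced here. To give a rough idea of what manual topic creation involves, the sketch below prints one kafka-topics.sh create command per Atlas topic. It is not the contents of atlas_kafka_acl.sh and it does not set any ACLs. The ZooKeeper quorum, the install path, and the partition and replication counts are all assumptions, and print_atlas_topic_cmds is a hypothetical helper name.

```shell
# Assumed values; replace with your own ZooKeeper quorum and Kafka install path.
ZK_QUORUM="${ZK_QUORUM:-zk1.example.com:2181}"
KAFKA_BIN="${KAFKA_BIN:-/usr/hdp/current/kafka-broker/bin}"

# Hypothetical helper: prints (does not run) one create command per Atlas
# topic so the commands can be reviewed before being executed on a broker.
# Partition and replication counts are placeholders, not recommendations.
print_atlas_topic_cmds() {
  for t in ATLAS_HOOK ATLAS_ENTITIES; do
    echo "$KAFKA_BIN/kafka-topics.sh --create --zookeeper $ZK_QUORUM --topic $t --partitions 1 --replication-factor 1"
  done
}

print_atlas_topic_cmds
```

Once the printed commands look right for your environment, they can be run on a broker, for example by piping the output to sh.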
Depending on whether you have Ranger in your environment and whether Kafka
topic authorization is controlled by Ranger, you should see the necessary
policies in the Ranger Kafka repository. If you don't find those policies,
create policies in the Ranger Kafka repository that grant the necessary access
to the ATLAS_ENTITIES and ATLAS_HOOK topics. The policies should be set up
under Ranger -> Resource Based Policies -> <Cluster Name>_kafka.
Create some entities through Beeline. Each event should trigger the Atlas hook
to post a message to Kafka, which is eventually consumed by the atlas user and
loaded into Atlas.
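One way to watch this flow, sketched under several assumptions: create a throwaway table through Beeline, then read the ATLAS_HOOK topic and look for the table name. The HiveServer2 URL, ZooKeeper address, install path, and table name atlas_smoke_test are placeholders, mentions_entity is a hypothetical helper, and the sketch assumes the hook message contains the table name as plain text. Kerberized clusters typically need additional consumer security options that are not shown here.

```shell
# Hypothetical helper: reads Kafka messages on stdin and reports whether
# the given entity (table) name appears in any of them.
mentions_entity() {
  # $1: name to look for; -F treats it as a fixed string, not a regex.
  if grep -qF -- "$1"; then
    echo "found: $1"
  else
    echo "not found: $1"
    return 1
  fi
}

# Example usage (hosts, paths, and table name are assumptions):
#   beeline -u "jdbc:hive2://hs2.example.com:10000/default" \
#     -e "CREATE TABLE atlas_smoke_test (id INT);"
#   /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
#     --zookeeper zk1.example.com:2181 --topic ATLAS_HOOK \
#     --from-beginning --timeout-ms 10000 | mentions_entity atlas_smoke_test
```

If the name never shows up in the topic, the problem is likely on the hook or Kafka authorization side rather than in Atlas itself.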
The Ranger Audit tab and the Atlas log should help in debugging any
access-denial issues that occur while the entity is being transferred to Atlas
through Kafka.
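For the log side of that debugging, a minimal sketch is to search the Atlas log for authorization-related messages. The log path /var/log/atlas/application.log and the search patterns are assumptions, and find_access_errors is a hypothetical helper name. The exact wording of the errors depends on your versions.

```shell
# Hypothetical helper: prints log lines that look like access-denial errors.
find_access_errors() {
  # $1: path to a log file.
  # -i ignores case, -n prefixes each match with its line number,
  # -E enables the alternation between the assumed patterns.
  grep -inE 'denied|not authorized|authorizationexception' "$1"
}

# Example usage (log path is an assumption):
#   find_access_errors /var/log/atlas/application.log | tail -n 20
```

Matching the timestamps of these lines against the Ranger Audit tab can help identify which user and topic were denied.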
If Hive CLI access is not working as expected in Kerberized clusters, you can
work with Hortonworks support to resolve the issue.
Thanks man, that's indeed a complex application. My problem is that Kafka is working, the topics are created, and Ranger Tagsync does not show any errors, yet the tag sync between Ranger and Atlas is still not working, meaning I don't see any tags. Do you know how I can troubleshoot Ranger Tagsync precisely? Do you know which stations (components, ports, etc.) a tag passes through from Ranger to Atlas and vice versa? I have 2.5.3 with Atlas 0.7.0.