
By definition, Atlas provides scalable, core foundational services for data governance, enabling enterprises to meet their compliance requirements efficiently and effectively within their Hadoop ecosystem. However, it is a complex application built by integrating various components of the Hadoop ecosystem. The components involved in the architecture are:

  1. Atlas
  2. Ranger
  3. Hive
  4. HBase
  5. Ambari Infra
  6. Kafka

This article provides troubleshooting tips for cases where an Atlas install done through Ambari is not functioning correctly.

Install validation and troubleshooting tips:

  • Make sure Atlas Metadata Server and Atlas Metadata clients are installed from Ambari.
  • Install the Ranger Tagsync component if you want to control authorization in Ranger using Atlas tags.
  • Make sure the following Kafka topics exist in Kafka:
    • ATLAS_HOOK
    • ATLAS_ENTITIES
  • You can check this by running the following command on any of the Kafka brokers (in a Kerberized cluster, you need the Kafka keytab to run it):
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper <ZK1:2181,ZK2:2181,ZK3:2181>

The output of this command should include the following topics:

ATLAS_HOOK

ATLAS_ENTITIES
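
The check above can also be scripted. The following is a minimal sketch, not part of the original article: `missing_atlas_topics` is a hypothetical helper name, and it assumes the listing command prints one topic name per line. It reads the listing on standard input and prints whichever of the two required topics is absent.

```shell
# Hypothetical helper: report which required Atlas topics are absent.
# Reads a topic listing (one name per line) on stdin and prints each
# required topic that does not appear in it. Prints nothing if both exist.
missing_atlas_topics() {
  listing="$(cat)"
  for topic in ATLAS_HOOK ATLAS_ENTITIES; do
    # -x matches whole lines only; -F treats the topic name literally
    if ! printf '%s\n' "$listing" | grep -qxF "$topic"; then
      printf '%s\n' "$topic"
    fi
  done
}

# Example usage on a broker (ZK_QUORUM is a placeholder you set yourself,
# e.g. ZK_QUORUM="zk1:2181,zk2:2181,zk3:2181"):
#   /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list \
#     --zookeeper "$ZK_QUORUM" | missing_atlas_topics
```

An empty result means both topics are present; any printed name is a topic that still needs to be created.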

  • If the Kafka topics were not created, you can create them manually using the attached atlas_kafka_acl.sh script. Update the ZooKeeper quorum in the script and run it on any Kafka broker.
  • If you have Ranger in your environment and Kafka topic authorization is controlled by Ranger, you should see the necessary policies in the Ranger Kafka repository. If you don’t find them, create policies in the Ranger Kafka repository that grant the necessary access to the ATLAS_ENTITIES and ATLAS_HOOK topics. The policies should be set up as follows, under Ranger -> Resource Based Policies -> <Cluster Name>_kafka:
    • Policy Name: atlas_hook
    • Topic: ATLAS_HOOK
    • Allow Conditions:
        • Select Group(public), Permissions(Publish, Create)
        • Select User(atlas), Permissions(Consume, Create)
    • Policy Name: atlas_entities
    • Topic: ATLAS_ENTITIES
    • Allow Conditions:
        • Select Group(atlas), Permissions(Publish, Create)
        • Select User(rangertagsync), Permissions(Consume, Create)
  • Make sure the policies get synced to Kafka.
  • Create some entities through Beeline. Each event should trigger the Atlas hook to post a message to Kafka, which is eventually consumed by the atlas user and loaded into Atlas.
  • The Ranger Audit tab and the Atlas log should help in debugging access-denial issues that occur while an entity is being transferred to Atlas through Kafka.
  • If Hive CLI access is not working as expected in Kerberized clusters, you can work with Hortonworks support to resolve the issue.
  • Please up vote if this article is helpful.
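
If you prefer to script the two Ranger policies described in the list above instead of creating them in the UI, the sketch below shows one possible shape for the policy document. It is not from the original article and rests on assumptions: `ranger_kafka_policy_json` is a hypothetical helper, the JSON follows the layout of Ranger's public v2 policy API as I understand it, and the access type strings (`publish`, `consume`, `create`) are assumed to match what your Ranger Kafka service definition exposes. Verify all of these against your Ranger version before use.

```shell
# Hypothetical helper: print a Ranger policy JSON document for one Kafka topic.
# Arguments:
#   $1 Ranger Kafka service name (e.g. <Cluster Name>_kafka)
#   $2 policy name
#   $3 topic name
#   $4 group that gets Publish + Create
#   $5 user that gets Consume + Create
# The field names and access type strings are assumptions; check them
# against the policy API of your Ranger release.
ranger_kafka_policy_json() {
  cat <<EOF
{
  "service": "$1",
  "name": "$2",
  "isEnabled": true,
  "resources": { "topic": { "values": ["$3"] } },
  "policyItems": [
    {
      "groups": ["$4"],
      "accesses": [
        { "type": "publish", "isAllowed": true },
        { "type": "create", "isAllowed": true }
      ]
    },
    {
      "users": ["$5"],
      "accesses": [
        { "type": "consume", "isAllowed": true },
        { "type": "create", "isAllowed": true }
      ]
    }
  ]
}
EOF
}

# Hypothetical usage (host, port, admin user, and service name are placeholders):
#   ranger_kafka_policy_json mycluster_kafka atlas_hook ATLAS_HOOK public atlas |
#     curl -u admin -H 'Content-Type: application/json' -X POST -d @- \
#       'http://ranger-host:6080/service/public/v2/api/policy'
#   ranger_kafka_policy_json mycluster_kafka atlas_entities ATLAS_ENTITIES atlas rangertagsync |
#     curl -u admin -H 'Content-Type: application/json' -X POST -d @- \
#       'http://ranger-host:6080/service/public/v2/api/policy'
```

Printing the JSON first, without the curl step, lets you compare it with a policy exported from the Ranger UI before posting anything.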
    Comments

    Thanks man, that's indeed a complex application. My problem is that Kafka is working, the topics are created, and rangertagsync does not show any errors, yet the tag sync between Ranger and Atlas is still not working, meaning I don't see any tags. Do you know how I can troubleshoot Ranger Tagsync precisely? Do you know which stations (components, ports, etc.) a tag passes through from Ranger to Atlas and vice versa? I have 2.5.3 with Atlas 0.7.0.

    Anything would help!

    Thank you,

    Regards, Normen