Atlas export lineage via Kafka?

Master Guru

Is it possible to export lineage from Atlas via Kafka? I don't see how that would be possible using the topics Atlas creates, but it is worth asking on HCC.

1 ACCEPTED SOLUTION

Guru

@Sunile Manjee

Atlas creates two Kafka topics, ATLAS_ENTITIES and ATLAS_HOOK. When a hook fires, it converts all of the metadata passed to it into Atlas entities and sends them to Atlas via the ATLAS_HOOK topic. When Atlas successfully creates the new entities it received from a hook, it publishes the resulting entities to the ATLAS_ENTITIES topic. You can watch either topic to learn that an entity or set of entities has been created, or that a request to create them has been sent. You can also go back and read the entire topic from the first available offset to see which entities were created over that period.

You could then calculate lineage yourself using the same graph processing techniques used by Titan (the graph database used by Atlas). However, there is no lineage information on the topic itself, just the JSON that describes the entities being created and their references to other entities. This is because Kafka is nothing more than a message bus: it buffers messages for asynchronous reads. It cannot do graph calculations, and even if it could, it only retains data for a limited period of time. Thus Atlas uses Titan to calculate lineage based on data stored in HBase.
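As a rough sketch of what calculating lineage from the entity JSON could look like, the snippet below builds a small graph from messages and walks it downstream. The message shape is an assumption made for illustration: each message is taken to be a flat JSON object, and process-like entities are assumed to carry `inputs` and `outputs` lists of entity identifiers. The real notification payload is more deeply nested, so the parsing would need adapting. The function names are hypothetical too.

```python
import json
from collections import defaultdict, deque


def build_lineage_edges(messages):
    """Build a downstream adjacency map from entity JSON messages.

    Assumed (hypothetical) shape: a process-like entity lists the
    identifiers it reads in "inputs" and the ones it writes in "outputs".
    Entities without those fields contribute no edges.
    """
    downstream = defaultdict(set)
    for raw in messages:
        entity = json.loads(raw)
        for src in entity.get("inputs", []):
            for dst in entity.get("outputs", []):
                downstream[src].add(dst)
    return downstream


def downstream_of(edges, start):
    """Return every entity reachable from `start` (breadth-first walk)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In practice the `messages` iterable would come from a Kafka consumer reading the entities topic from the earliest available offset. Because of topic retention, anything created before that offset would be missing from the graph.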


3 REPLIES

@Sunile Manjee

Currently there is no support for exporting lineage from Atlas. It is on the Atlas roadmap, however, and should be available in the near future.

Contributor

Has this support been added yet?