How does Atlas know that a table has been created in Hive?

Contributor

I know that Atlas does this automatically once the necessary configuration (import-hive.sh and related files) is in place. But I want to understand the internal workings of Atlas: what happens behind the scenes, and which class or module is invoked?

Accepted solution

Expert Contributor

@Muhammad Imran Tariq,

The Atlas Hive hook is configured in Hive by setting the following property in hive-site.xml:

    <property>
        <name>hive.exec.post.hooks</name>
        <value>org.apache.atlas.hive.hook.HiveHook</value>
    </property>

Whenever a table is created in Hive, Hive triggers a post-execution event that invokes the Atlas Hive hook. The hook sends a notification message to Kafka; Atlas consumes that notification from Kafka and creates the corresponding Hive entity.

Reference:

http://atlas.incubator.apache.org/Bridge-Hive.html
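
The flow described above (hook invoked after the statement, message published, Atlas consuming it) can be sketched in plain Java. This is only a toy model under stated assumptions: `Topic`, `PostExecHook`, `MetadataServer` and `HookMessage` are hypothetical stand-ins invented for illustration, the in-memory queue stands in for Kafka, and none of it is Atlas or Hive code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-ins only: none of these types are Atlas or Hive classes.
class HookFlowSketch {

    // What the hook publishes: the operation and the table it touched.
    static final class HookMessage {
        final String operation;
        final String qualifiedTableName;
        HookMessage(String operation, String qualifiedTableName) {
            this.operation = operation;
            this.qualifiedTableName = qualifiedTableName;
        }
    }

    // In-memory stand-in for the Kafka topic between Hive and Atlas.
    static final class Topic {
        private final BlockingQueue<HookMessage> queue = new LinkedBlockingQueue<>();
        void publish(HookMessage m) { queue.add(m); }
        HookMessage poll() { return queue.poll(); } // null when empty
    }

    // Stand-in for the hook: called by the engine after a statement has run.
    static final class PostExecHook {
        private final Topic topic;
        PostExecHook(Topic topic) { this.topic = topic; }
        void run(String operation, String qualifiedTableName) {
            topic.publish(new HookMessage(operation, qualifiedTableName));
        }
    }

    // Stand-in for the Atlas side: drains the topic and records entities.
    static final class MetadataServer {
        final List<String> entities = new ArrayList<>();
        void consume(Topic topic) {
            HookMessage m;
            while ((m = topic.poll()) != null) {
                if (m.operation.equals("CREATETABLE")) {
                    entities.add("hive_table:" + m.qualifiedTableName);
                }
            }
        }
    }

    static List<String> demo() {
        Topic topic = new Topic();
        PostExecHook hook = new PostExecHook(topic);
        MetadataServer server = new MetadataServer();
        hook.run("CREATETABLE", "default.customers");
        hook.run("QUERY", "default.customers"); // ignored by this toy consumer
        server.consume(topic);
        return server.entities;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [hive_table:default.customers]
    }
}
```

The point of the sketch is the decoupling: the hook only publishes and returns, so the Hive side never calls the metadata server directly, and the server creates the entity when it gets around to consuming the message.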


Contributor

Cool. So what type of information does the hook get from Hive, and how? That is basically what I want to know. Does the hook query Hive to get this information?

Expert Contributor

@Muhammad Imran Tariq

When the Atlas Hive hook is configured, Hive passes the following information to the hook whenever a DDL or DML statement is executed.

Refer to this code from the hook on GitHub (excerpt; comments added here):

// Copy what Hive hands over in hookContext into Atlas's own event object.
final HiveEventContext event = new HiveEventContext();
// Entities read and written by the statement.
event.setInputs(hookContext.getInputs());
event.setOutputs(hookContext.getOutputs());
event.setHookType(hookContext.getHookType());
// Use the UGI supplied by Hive if present, otherwise fall back to Utils.getUGI().
final UserGroupInformation ugi = hookContext.getUgi() == null ? Utils.getUGI() : hookContext.getUgi();
event.setUgi(ugi);
event.setUser(getUser(hookContext.getUserName(), hookContext.getUgi()));
// Map Hive's operation name onto the operation type the hook works with.
event.setOperation(OPERATION_MAP.get(hookContext.getOperationName()));
// Query details come from the query plan attached to the context.
event.setQueryId(hookContext.getQueryPlan().getQueryId());
event.setQueryStr(hookContext.getQueryPlan().getQueryStr());
event.setQueryStartTime(hookContext.getQueryPlan().getQueryStartTime());
event.setQueryType(hookContext.getQueryPlan().getQueryPlan().getQueryType());
event.setLineageInfo(hookContext.getLinfo());
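
To make the pattern in that excerpt concrete: the hook does not query Hive, it is handed a context object and copies fields out of it. Below is a minimal, self-contained sketch of that copy step. `FakeHookContext`, `FakeEvent` and `toEvent` are hypothetical names made up for this illustration; they are not the Hive `HookContext` or the Atlas `HiveEventContext`, and the field set is deliberately reduced.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-ins, not the real Hive HookContext or Atlas HiveEventContext.
class ContextCopySketch {

    // What the engine hands to the hook after the statement has run.
    static final class FakeHookContext {
        final Set<String> inputs = new LinkedHashSet<>();
        final Set<String> outputs = new LinkedHashSet<>();
        String operationName;
        String queryStr;
        String userName; // may be null
    }

    // The hook's own snapshot of the statement.
    static final class FakeEvent {
        Set<String> inputs;
        Set<String> outputs;
        String operation;
        String queryStr;
        String user;
    }

    // Copy everything the hook needs out of the context; no call back into the engine.
    static FakeEvent toEvent(FakeHookContext ctx, String fallbackUser) {
        FakeEvent e = new FakeEvent();
        e.inputs = new LinkedHashSet<>(ctx.inputs);   // defensive copies
        e.outputs = new LinkedHashSet<>(ctx.outputs);
        e.operation = ctx.operationName;
        e.queryStr = ctx.queryStr;
        // Same shape as the UGI fallback in the excerpt: use the supplied value if present.
        e.user = ctx.userName == null ? fallbackUser : ctx.userName;
        return e;
    }

    public static void main(String[] args) {
        FakeHookContext ctx = new FakeHookContext();
        ctx.inputs.add("default.src");
        ctx.outputs.add("default.dst");
        ctx.operationName = "CREATETABLE_AS_SELECT";
        ctx.queryStr = "create table dst as select * from src";
        FakeEvent e = toEvent(ctx, "hive");
        System.out.println(e.operation + " by " + e.user + ": " + e.inputs + " -> " + e.outputs);
    }
}
```

Taking copies of the input and output sets means the event stays stable even if the context object is reused or modified after the hook returns.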

Contributor

This means Hive notifies Atlas automatically. What if I have MySQL instead of Hive? I am sure MySQL will not notify Atlas the way Hive does, right? If so, what can be done to get notifications from MySQL?

Expert Contributor

@Muhammad Imran Tariq, you are right, MySQL will not notify Atlas the way Hive does. Check whether MySQL has any mechanism for invoking a class or method when its metadata changes.

Do share your findings.

Nixon

Expert Contributor

https://dev.mysql.com/doc/refman/5.7/en/writing-plugins.html

MySQL does offer the capability to write plugins, which might provide functionality similar to the Atlas Hive hook.