Is order really important for hook notification?

Explorer

Hi all,

I've been reading the Atlas code recently to understand the Atlas architecture. The choice of Kafka as the messaging mechanism caught my interest.

We know that Kafka is designed for ordered message processing, but is ordering really important when hooks report changes?

Thanks,

Eva

1 ACCEPTED SOLUTION

Super Collaborator

Yes, it is important.

Consider two events from Hive:

1. Rename a Hive table (example: employee to employee_personal).

2. Add a column to the renamed Hive table (add an address column to employee_personal).

When the Atlas Hive hook is configured, a message is sent for each of the two events above.

Say message #2 is received by Atlas first. At that point employee_personal is not yet known to Atlas, so Atlas creates an employee_personal hive_table entity with the address column plus the other columns.

Then, when message #1 is received, Atlas renames the existing employee hive_table entity to employee_personal.

Now there are two employee_personal entities in Atlas, whereas in Hive there is only one employee_personal table.

Hence, order is *very* important for Atlas as a governance and metadata management framework!
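
The scenario above can be replayed with a toy model. This is only a sketch and not Atlas code: the in-memory store, the helper names (new_store, apply_rename, apply_add_column, replay) and the message shapes are all hypothetical. It assumes, as described above, that an add-column message for an unknown table creates a new entity.

```python
import itertools

# Toy in-memory entity store; NOT the Atlas implementation.
# Entities are keyed by an internal id and carry a table name and columns.


def new_store():
    return {"entities": {}, "ids": itertools.count(1)}


def create(store, name, columns):
    eid = next(store["ids"])
    store["entities"][eid] = {"name": name, "columns": list(columns)}
    return eid


def find(store, name):
    """Return the ids of all entities with the given table name."""
    return [eid for eid, e in store["entities"].items() if e["name"] == name]


def apply_rename(store, msg):
    # Rename the existing entity in place; its id stays the same.
    for eid in find(store, msg["old"]):
        store["entities"][eid]["name"] = msg["new"]


def apply_add_column(store, msg):
    ids = find(store, msg["table"])
    if not ids:
        # Assumption: an unknown table is created from the message contents.
        create(store, msg["table"], msg["columns"])
    for eid in ids:
        store["entities"][eid]["columns"] = list(msg["columns"])


HANDLERS = {"rename": apply_rename, "add_column": apply_add_column}

RENAME = {"type": "rename", "old": "employee", "new": "employee_personal"}
ADD_COLUMN = {
    "type": "add_column",
    "table": "employee_personal",
    "columns": ["id", "name", "address"],
}


def replay(messages):
    """Apply messages in the given order and return the resulting table names."""
    store = new_store()
    create(store, "employee", ["id", "name"])  # state before either event
    for msg in messages:
        HANDLERS[msg["type"]](store, msg)
    return sorted(e["name"] for e in store["entities"].values())


in_order = replay([RENAME, ADD_COLUMN])
out_of_order = replay([ADD_COLUMN, RENAME])
print(in_order)       # one employee_personal entity
print(out_of_order)   # two employee_personal entities
```

In this model, replaying the messages in order leaves a single employee_personal entity, while swapping them leaves two entities with the same name.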


3 REPLIES

Explorer

Thank you very much, Sharmadha!

But I still have one question about the order. Taking the example you presented:

1. Say message #1 has been sent but not yet processed.

2. When action #2 is taken, the hook will consider employee_personal not yet known to Atlas.

3. In this case, message #2 will contain a new entity for this table. Will that also result in two employee_personal entities in Atlas?

Thanks,

Eva

Super Collaborator

Eva Xiao, messages are processed in the order they are received by Atlas.

Only if message #1 fails with an error is message #2 processed without it, and only then can the situation you describe in step 3 happen.
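
A rough sketch of the two cases in this reply, using a hypothetical in-memory table map rather than Atlas code. The process and messages helpers and the fail_rename flag are made up for illustration, and the sketch assumes a failed message is simply skipped; how Atlas actually handles a failed message is not covered in this thread.

```python
from collections import deque


def messages():
    """Build the two hook messages in the order Hive produced them."""
    return deque([
        {"type": "rename", "old": "employee", "new": "employee_personal"},
        {"type": "add_column", "table": "employee_personal",
         "columns": ["id", "name", "address"]},
    ])


def process(queue, fail_rename=False):
    """Consume messages strictly first-in, first-out."""
    tables = {"employee": ["id", "name"]}  # table name -> columns
    while queue:
        msg = queue.popleft()
        if msg["type"] == "rename":
            if fail_rename:
                continue  # assumption: a failed message is skipped
            tables[msg["new"]] = tables.pop(msg["old"])
        elif msg["type"] == "add_column":
            # Updates the table if it is known, creates it otherwise.
            tables[msg["table"]] = list(msg["columns"])
    return tables


ok = process(messages())
failed = process(messages(), fail_rename=True)
print(ok)      # only employee_personal, with the address column
print(failed)  # stale employee plus a newly created employee_personal
```

With in-order processing the rename is applied before the column change, so the second message updates the existing entry. Only when the rename is lost does the second message create a new entry.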