Unable to upload related entities separately

Hi Team,

I am working with Apache Atlas through PyApacheAtlas against Azure Purview. I have already created custom types (table and column) and their relationship definition. My system has about 30k entities to upload, so when I try to push all of them in one batch I get a timeout.

 

I tried to apply the upload logic from the Atlas Jira issue https://issues.apache.org/jira/browse/ATLAS-4389

First upload all parents (tables in my case), then the columns (related to the tables). The tables batch uploaded successfully, but as soon as the columns batch upload started I received an error:

"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity -1001 is not found"

-1001 is the guid of the table, which has already been uploaded. I noticed that when the table and its columns are uploaded in one batch, everything works fine.
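
For context, negative guids in Atlas are temporary placeholders that are only resolved inside the request that carries them; once an entity is created it gets a real guid. A reference to an already uploaded entity can therefore use either that real guid or the entity's type name plus unique attributes. Below is a minimal sketch of the second form using plain dictionaries. The type names `custom_table` and `custom_column`, the relationship attribute name `table`, and the qualified names are hypothetical placeholders, and this has not been verified against Purview.

```python
def reference_by_qualified_name(type_name, qualified_name):
    """Build an object reference that identifies an existing entity by its
    type and qualifiedName instead of a batch-local negative guid."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


# Hypothetical column payload: the type names, the "table" relationship
# attribute, and the qualified names are assumptions for illustration only.
column = {
    "typeName": "custom_column",
    "guid": "-2001",  # placeholder guid for the new column itself
    "attributes": {
        "name": "id",
        "qualifiedName": "custom://db/orders#id",
    },
    "relationshipAttributes": {
        "table": reference_by_qualified_name("custom_table", "custom://db/orders"),
    },
}
```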

 

It looks like Atlas checks whether the referenced entity exists within the uploaded batch, not among the entities that were already uploaded. Is there any way to upload related entities in separate batches, or must they be uploaded in one batch? Is there another strategy to avoid timeouts during bulk upload?
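
Since a table and its columns do upload fine together, one possible strategy is to split the 30k entities into smaller batches while keeping every table in the same batch as its own columns. The sketch below only illustrates the grouping step in plain Python. The `table_guid` key on each column and the `max_entities` limit are hypothetical names chosen for this example, not part of Atlas or PyApacheAtlas, and the actual upload call is left out.

```python
from collections import defaultdict


def batches_by_table(tables, columns, max_entities=500):
    """Yield lists of entities in which every table travels together with
    all of its columns, so placeholder guids resolve inside one batch."""
    columns_by_table = defaultdict(list)
    for column in columns:
        # "table_guid" is a hypothetical key holding the parent table's guid.
        columns_by_table[column["table_guid"]].append(column)

    batch = []
    for table in tables:
        group = [table] + columns_by_table.get(table["guid"], [])
        # Start a new batch if this group would push the current one past
        # the limit. A single oversized group still goes out as one batch.
        if batch and len(batch) + len(group) > max_entities:
            yield batch
            batch = []
        batch.extend(group)
    if batch:
        yield batch
```

Each yielded batch could then be sent as a separate upload request. A limit of a few hundred entities is only a starting guess and would need tuning against the observed timeouts.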
