Created 03-07-2017 06:40 PM
I am running a Python script that associates a tag with an attribute (one at a time, because of the GUID constraint), and I have to tag around 3k attributes on a daily basis. My script works as follows:
Run a GET request against a table and fetch the GUIDs of all columns in that table (I am using DSL search here):
get_tables = requests.get(hostname + '/api/atlas/discovery/search/dsl?query=hive_table+where+name=%27' + tab + '%27+and+__state=%27ACTIVE%27', headers={"Content-Type": "application/json", "Accept": "application/json"}, auth=(username, password))
Check whether each column is tagged. If a column is not tagged, fetch its GUID and send a POST request to associate it with a tag:
post_tag=requests.post((hostname+'/api/atlas/entities/'+guid+'/traits'), auth=(username ,password),json={ "typeName": "tag_name", "values": {}, "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct" },headers = {"Content-Type": "application/json","Accept": "application/json"})
Once the tag is associated, I close the HTTP connections:
get_tables.close()
post_tag.close()
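For reference, the steps above can be sketched as a few small helpers. This is a minimal sketch, not the script from this post: the function names (`dsl_search_url`, `trait_payload`, `tag_columns`, `make_session`) are hypothetical, and it assumes a single `requests.Session` is reused for every call so that connections can be kept alive instead of being reopened per request. Whether that alone addresses the delay described here is not confirmed.

```python
from urllib.parse import quote

JSON_HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}


def dsl_search_url(hostname, table):
    """Build the DSL search URL for the active hive_table with this name."""
    query = "hive_table where name='{}' and __state='ACTIVE'".format(table)
    # quote() percent-encodes spaces, quotes and '=' in the query text.
    return hostname + "/api/atlas/discovery/search/dsl?query=" + quote(query)


def trait_payload(tag_name):
    """JSON body used in the post above to attach a tag (trait) to an entity."""
    return {
        "typeName": tag_name,
        "values": {},
        "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
    }


def make_session(username, password):
    """Create one session that carries the credentials for every request."""
    import requests  # imported here so the helpers above work without it

    session = requests.Session()
    session.auth = (username, password)
    return session


def tag_columns(session, hostname, guids, tag_name):
    """POST the tag to each column GUID over the shared session.

    Returns a dict mapping each GUID to the HTTP status code received.
    """
    statuses = {}
    for guid in guids:
        response = session.post(
            hostname + "/api/atlas/entities/" + guid + "/traits",
            json=trait_payload(tag_name),
            headers=JSON_HEADERS,
        )
        statuses[guid] = response.status_code
    return statuses
```

With this layout the session is created once, passed to `tag_columns` for the whole batch of untagged GUIDs, and closed once at the end with `session.close()`.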
Now each API POST call takes more than 30 seconds to execute, which is far too slow. :( Can someone please suggest a more efficient way to tag attributes?
Thank you in advance,
Subash
Created 03-15-2017 07:12 AM
@subash sharma Do you see any debug logging in the application log while these POST commands are executed? Could you please share the debug logs?
Also, are you executing the POST commands from the same machine that Atlas is running on? The slowness might also be caused by network delay. I would recommend running such bulk operations from the same machine as Atlas to avoid network delays.
Created 03-15-2017 07:33 AM
Sure @Ayub Khan. We were able to resolve the issue by restarting Apache Atlas (to close all web connections) and by using multiprocessing to execute the REST API commands.
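A rough sketch of issuing the tagging calls concurrently is below. It is not the code used for the fix above: it swaps in a thread pool from `concurrent.futures` rather than separate processes, on the assumption that the work is dominated by waiting on HTTP responses. The names `tag_in_parallel` and `tag_one`, and the default of 8 workers, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def tag_in_parallel(tag_one, guids, max_workers=8):
    """Call tag_one(guid) for every GUID using a pool of worker threads.

    Returns a dict mapping each GUID to either the value tag_one returned
    or the exception it raised, so one failed call does not stop the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every call up front so the workers can overlap the waits.
        futures = {guid: pool.submit(tag_one, guid) for guid in guids}
        for guid, future in futures.items():
            try:
                results[guid] = future.result()
            except Exception as exc:  # keep the error for later inspection
                results[guid] = exc
    return results
```

Here `tag_one` would be a function that sends the single POST request shown in the question for one GUID and returns its status code. The worker count is worth tuning against what the Atlas server can handle.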
Created 03-08-2017 06:03 PM
A2A @Ayub Khan and @sshivaprasad. Thanks guys.
Created 03-15-2017 07:35 AM
Glad that the issue is resolved. Please close the loop by accepting the answer.