Support Questions

Atlas: Is there a way to tag a mass amount of tables at once in Atlas?

New Contributor

I'm trying to tag thousands of tables in Atlas based on the sensitivity level of the data. Does anyone know of a way to do this without manually tagging each table? I've read about the REST API but can't find a solution in the docs.

1 ACCEPTED SOLUTION

Super Collaborator

@Aaron Mayo

This can be accomplished using V2 APIs.

POST request body example:

{
  "classification": {
    "typeName": "PII",
    "attributes": {
      "attrib1": "value1",
      "attrib2": "value2"
    }
  },
  "entityGuids": [
    "05c97069-dc36-4f26-b017-13582c42428a",
    "4b3fb1fa-0755-4329-8ecb-7e53e18ed128",
    "6b3e5b42-e09b-428a-89ad-9ae38690044a",
    "7593f4ed-9cd2-46ba-b0a7-b92229301476",
    "f21ee4aa-461c-4eeb-8945-31ac8ec648d6"
  ]
}
API:

In the JSON:

i) "PII" is the tag name.

ii) "attributes" is a map of attribute names to values for the tag.

iii) "entityGuids" is a JSON array listing the GUIDs of the entities to be tagged.
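
As a rough illustration, the request above could be sent from a script along these lines. This is only a sketch: the post leaves the API path blank, so the path /api/atlas/v2/entity/bulk/classification used here is an assumption, as are basic authentication and the helper names build_tag_request and tag_entities.

```python
import base64
import json
import urllib.request

# Assumption: the post does not show the API path; this one is a guess to verify
# against your Atlas version before use.
BULK_CLASSIFICATION_PATH = "/api/atlas/v2/entity/bulk/classification"


def build_tag_request(tag_name, attributes, entity_guids):
    """Build a request body in the shape shown in the post."""
    return {
        "classification": {"typeName": tag_name, "attributes": dict(attributes)},
        "entityGuids": list(entity_guids),
    }


def tag_entities(base_url, user, password, body):
    """POST the body to Atlas and return the HTTP status code.

    Basic auth is an assumption about how the cluster is secured.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        base_url.rstrip("/") + BULK_CLASSIFICATION_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call might look like tag_entities("http://atlas_host:21000", "admin", "secret", build_tag_request("PII", {"attrib1": "value1"}, guids)), where the credentials are placeholders and guids is the list of entity GUIDs.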

3 REPLIES

New Contributor

@sharmadha Would you know how to get the GUIDs of tables in Hive?

Super Collaborator

@Aaron Mayo

GUIDs are generated by Atlas. Each Atlas entity has a unique GUID. You can get the GUID of a table from the UI, or by running a search query and writing a script that parses the GUIDs out of the resulting JSON. For example, to fetch the GUIDs of all tables in the database default:

DSL query = hive_table where db.name="default"

Encoded:

http://atlas_host:21000/api/atlas/v2/search/dsl?offset=0&query=db.name%3D%22default%22&typeName=hive...

The JSON response was attached as an image (dsl-query-result.png).

From this response, the GUIDs can be extracted as json["entities"][0]["guid"], json["entities"][1]["guid"], ..., json["entities"][n]["guid"].
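
A minimal sketch of that extraction step is shown below. The URL in the post is truncated, so typeName=hive_table is an assumption here, and the helper names build_dsl_search_url and extract_guids are hypothetical. The only thing taken from the post about the response shape is that each item under "entities" carries a "guid".

```python
import urllib.parse


def build_dsl_search_url(base_url, type_name, query, offset=0):
    """Build a DSL search URL with the query percent-encoded as in the post."""
    params = urllib.parse.urlencode(
        {"offset": offset, "query": query, "typeName": type_name}
    )
    return base_url.rstrip("/") + "/api/atlas/v2/search/dsl?" + params


def extract_guids(search_result):
    """Return the GUID of every entity in a parsed search response.

    A response without an "entities" key is treated as an empty result.
    """
    return [entity["guid"] for entity in search_result.get("entities") or []]


# Assumption: hive_table is the type being searched; the post's URL is cut off.
url = build_dsl_search_url("http://atlas_host:21000", "hive_table", 'db.name="default"')

# Hypothetical response fragment, reduced to the fields the post relies on.
sample_response = {
    "entities": [
        {"typeName": "hive_table", "guid": "c44d0207-3567-4573-baaf-577ecbb8e195"},
        {"typeName": "hive_table", "guid": "05c97069-dc36-4f26-b017-13582c42428a"},
    ]
}
guids = extract_guids(sample_response)
```

The resulting list can be used directly as the "entityGuids" array of the tagging request.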

If you want the GUID of only one table, you can get it directly from the UI instead of writing a script. When you click on a hive_table entity in Atlas, it takes you to http://atlas_host:21000/#!/detailPage/<GUID of the entity>

Example: http://atlas_host:21000/#!/detailPage/c44d0207-3567-4573-baaf-577ecbb8e195

Here, c44d0207-3567-4573-baaf-577ecbb8e195 is the GUID of the hive_table entity.

Since going to the UI and looking up the GUID of every table is a tedious, manual process, it is preferable to run a query that fetches the required tables and extract the GUIDs with a script.
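
For thousands of tables, a script would probably need to page through the search results and may want to send the GUIDs in batches. The sketch below shows one way to do that. Here fetch_page is a hypothetical callable that returns the parsed JSON for one page of results, given an offset and a page size. How it maps the page size onto the search request, and whether batching is needed at all, are assumptions to check against your Atlas setup.

```python
def collect_guids(fetch_page, page_size=100):
    """Walk the search results page by page and gather every entity GUID.

    fetch_page(offset, page_size) is a hypothetical callable that returns one
    parsed search response. A short or empty page marks the end of the results.
    """
    guids = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        entities = page.get("entities") or []
        guids.extend(entity["guid"] for entity in entities)
        if len(entities) < page_size:
            return guids
        offset += page_size


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch from chunked(guids, 500) could then become the "entityGuids" array of one tagging request. The batch size of 500 is an arbitrary illustration.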