Created 03-02-2018 07:27 AM
In Atlas, I want to add a new column (e.g., Data Custodian) to hive_table.
Is it possible?
If it is possible, please explain.
Thanks in advance!
Created 03-02-2018 08:28 AM
I hope you have enabled the Atlas Hive hook. If so, all updates to Hive tables are captured by Atlas. When a column is added in Hive, you can find the newly created hive_column entity in Atlas.
Created 03-02-2018 09:49 AM
Thanks for your quick response @Sharmadha Sainath. I want a new key in the Atlas UI for type=hive_table, not a user-created table column.
For example, hive_table already has fields like aliases, columns, comment, createTime, db, description, lastAccessTime, etc. I want one new field like those.
Created 03-02-2018 12:04 PM
hive_table is a type, and the fields you mentioned (aliases, columns, comment, createTime, db, etc.) are attributes of hive_table.
A type can be updated using PUT (http://atlas.apache.org/api/v2/resource_TypesREST.html#resource_TypesREST_updateAtlasTypeDefs_PUT).
This requires fetching the type definition and updating it with the new attribute.
For example ,
Following GET REST call is used to fetch the hive_table type definition :
http://atlashost:21000/api/atlas/v2/types/entitydef/name/hive_table
After fetching the type definition, a new attribute definition can be added to the attributeDefs array as:
{ "name": "new_attribute", "typeName": "string", "isOptional": true, "cardinality": "SINGLE", "valuesMinCount": 0, "valuesMaxCount": 1, "isUnique": false, "isIndexable": false }
name: the name of the new attribute
typeName: the data type of the attribute
isOptional: whether the entity can be created without providing a value for the attribute. (Note: updating a type with a new mandatory attribute is not allowed; when updating, set isOptional to true.)
and the updated JSON can be PUT to
http://atlashost:21000/api/atlas/v2/types/typedefs
For example, in the attached text file I have added a new attribute definition. The GUID of hive_table has to be modified based on your Atlas instance.
Please let me know if you are stuck somewhere in this procedure.
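The GET-modify-PUT flow above can be sketched in Python. This is a hedged sketch: the attribute fields mirror the JSON snippet above, the "entityDefs" envelope for the PUT body is an assumption to verify against your Atlas version, and data_custodian is just an illustrative attribute name.

```python
import json

def add_optional_attribute(entity_def: dict, attr_name: str, attr_type: str) -> dict:
    """Append an optional attribute to a fetched entity type definition.

    entity_def is the JSON returned by
    GET /api/atlas/v2/types/entitydef/name/hive_table. The result is wrapped
    in the AtlasTypesDef envelope expected (as an assumption) by
    PUT /api/atlas/v2/types/typedefs.
    """
    entity_def.setdefault("attributeDefs", []).append({
        "name": attr_name,
        "typeName": attr_type,
        "isOptional": True,   # a new mandatory attribute is rejected on update
        "cardinality": "SINGLE",
        "valuesMinCount": 0,
        "valuesMaxCount": 1,
        "isUnique": False,
        "isIndexable": False,
    })
    return {"entityDefs": [entity_def]}

# Minimal stand-in for a fetched hive_table definition:
fetched = {"name": "hive_table", "attributeDefs": [{"name": "db", "typeName": "hive_db"}]}
payload = add_optional_attribute(fetched, "data_custodian", "string")
print(json.dumps(payload, indent=2))
```

The printed JSON is what you would PUT back to the typedefs endpoint.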
One question:
hive_table is a predefined type in Atlas, and it has all the attributes required for maintaining Hive metadata. May I know why you want to update it? What is the new attribute you want to add? Could you please explain the use case behind it?
Created 03-06-2018 06:10 AM
@Sharmadha Sainath, thanks for your quick response.
I need to maintain additional metadata for these hive_table entities. The information is provided by the business; this business metadata includes things like Data Custodian, Data Owner and PI information. These attributes are not available on the hive_table type in Atlas, so I want to create new attributes on hive_table and move this information into them.
Created 03-06-2018 06:55 AM
For this requirement, please look at classifications. You can create a classification/tag with attributes. For example, create a tag named PI with the required attributes (e.g., expiry date) and associate it with the hive_table entity.
Attributes like columns, aliases, comment, createTime, db, etc. are specific to the Hive model. Information like Data Custodian, Data Owner and PI is not available in Hive, so it is not advisable to add such attributes to the Hive model in Atlas. But you may very well classify data using tags, which is the recommended way.
Once the table is associated with a tag, you can query for the tag using the search APIs, and it will list all the entities associated with the tag.
For example:
1. Create a tag named PI with an attribute expiry_date of type date.
2. Associate the tag PI with the hive_table entity, providing a date value for expiry_date.
3. Now you can query for the tag PI with a particular expiry date.
Please let me know if you need some more information on this.
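As a sketch of step 3, a basic-search body along these lines can be POSTed to /api/atlas/v2/search/basic. The tagFilters shape is an assumption based on Atlas's SearchParameters/FilterCriteria model, and the date value is a placeholder; verify the exact filter syntax against your Atlas version.

```python
import json

# Basic-search body: find hive_table entities tagged PI whose expiry_date
# tag attribute equals a given value. "classification" selects the tag;
# "tagFilters" (assumed FilterCriteria shape) filters on its attributes.
search_body = {
    "typeName": "hive_table",
    "classification": "PI",
    "tagFilters": {
        "attributeName": "expiry_date",
        "operator": "=",
        "attributeValue": "2018-12-31",  # placeholder date
    },
    "limit": 25,
    "offset": 0,
}
print(json.dumps(search_body, indent=2))
```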
Created 03-06-2018 08:02 AM
@Sharmadha Sainath
Tag creation has to be done manually in the UI. I get updated metadata every day, so I cannot modify tags manually every day. Where attributes already exist, I will set up a job with a curl command to override them daily with the updated data.
Can we create tags from the back end?
Created 03-06-2018 12:50 PM
Yes. POST the JSON body attached in the file to
http://localhost:21000/api/atlas/v2/types/typedefs?type=classification
In the tag definition, name is the name of the tag, and attributeDefs is a JSON array of attribute definitions. I have added an expiry_date attribute of type date in the example.
Once the tag is created, it can be associated with the hive_table entity by POSTing the attached tag-association.txt to
http://localhost:21000/api/atlas/v2/entity/bulk/classification
In tag-association.txt, "name" is the name of the tag, attribute values can be provided in "attributes", and entityGuids is the list of GUIDs of the entities the tag should be associated with. In this array you can provide the hive_table GUID.
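Since the attached files are not reproduced here, below is a hedged sketch of what the two bodies would look like. The field names follow the Atlas v2 REST model as I understand it (classificationDefs for the tag definition, and a classification/entityGuids pair for the bulk association, with typeName naming the tag); the GUID and date are placeholders.

```python
import json

# Sketch of the body for POST /api/atlas/v2/types/typedefs?type=classification
tag_def = {
    "classificationDefs": [{
        "name": "PI",
        "description": "Personally identifiable information",
        "superTypes": [],
        "attributeDefs": [{
            "name": "expiry_date",
            "typeName": "date",
            "isOptional": True,
            "cardinality": "SINGLE",
            "isUnique": False,
            "isIndexable": False,
        }],
    }]
}

# Sketch of the body for POST /api/atlas/v2/entity/bulk/classification
tag_association = {
    "classification": {
        "typeName": "PI",                          # the tag to attach
        "attributes": {"expiry_date": "2019-01-01"},  # placeholder value
    },
    "entityGuids": ["<hive_table-guid>"],          # placeholder GUID(s)
}

print(json.dumps(tag_def, indent=2))
print(json.dumps(tag_association, indent=2))
```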
Created 03-07-2018 08:02 AM
Can you please tell me an easy way to execute a JSON script from the edge node?
I tried the curl command below:
curl -vX POST -u admin:admin http://Localhost:2100/api/atlas/v2/types/entitydef/name/hive_table.json -d user/testplace.json --header "Content-Type: application/json"
It shows the error below:
< HTTP/1.1 100 Continue < HTTP/1.1 500 Internal Server Error < Set-Cookie: ATLASSESSIONID=1m6u83lz9dwylwfy2b702hfj9;Path=/;HttpOnly < Expires: Thu, 01 Jan 1970 00:00:00 GMT < X-Frame-Options: DENY < Content-Type: text/plain < Transfer-Encoding: chunked < Server: Jetty(8.1.19.v20160209) * HTTP error before end of send, stop sending
Please help me execute a JSON script from the edge node.
Thanks in advance!
Created 03-07-2018 08:35 AM
1. Please use the correct port. By default, Atlas in a non-SSL environment is configured to use port 21000 (your command uses 2100).
2. curl requires "@" for providing files. Example: -d @user/testplace.json
3. To update a type, PUT (not POST: POST creates types, PUT updates them) the JSON to
http://atlashost:21000/api/atlas/v2/types/typedefs
4. As already mentioned, a classification/tag best suits your requirement. It is highly recommended to use tags instead of updating types.
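Putting the corrections together, the equivalent request can be prepared with Python's standard library. This is a sketch: atlashost, the admin:admin credentials, and the stub body are placeholders; in practice read the body from your typedef JSON file, and only actually send the request against a live Atlas instance.

```python
import base64
import json
import urllib.request

# Corrected request: PUT (not POST) to port 21000, JSON body, basic auth.
body = json.dumps({"entityDefs": []}).encode("utf-8")  # stub body

req = urllib.request.Request(
    "http://atlashost:21000/api/atlas/v2/types/typedefs",  # placeholder host
    data=body,
    method="PUT",
    headers={"Content-Type": "application/json"},
)
# Basic auth header, equivalent to curl's -u admin:admin (placeholder creds):
req.add_header("Authorization",
               "Basic " + base64.b64encode(b"admin:admin").decode("ascii"))

# urllib.request.urlopen(req)  # sends the request; needs a running Atlas
print(req.get_method(), req.full_url)
```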