01-03-2017
10:53 AM
Hi Ayub,

We have two tables, table4 and table5, each with the columns Id, Name, and Age. We inserted the entity metadata and the lineage metadata of both tables into Atlas, and we could see the schema and the lineage graph there. After that, I deleted the entity metadata of table5 and reinserted it. Next, I inserted the same lineage metadata (the earlier lineage JSON) for both tables, but the lineage graph of the two tables no longer appears. The Atlas server returns the following response:

{"requestId":"qtp662559856-30620 - 26dbb640-9629-4c29-b209-32331e52962e","entities":{}}

Please find the lineage JSON attached (lineagejson.txt) and let me know what mistake I have made.

So, after deleting the hive table entity and reinserting the same metadata (i.e. the hive table entity JSON), we are unable to see the lineage in Atlas. What could be the reason for this? What mistake am I making in the lineage JSON the second time that prevents the lineage from appearing? Do I need to change the value of the process id or process name in the JSON?
01-03-2017
07:01 AM
Hi Ayub, the above issue is solved. There was a mistake in the JSON. When a table has multiple columns, each column must be given a different random GUID (a long number, which can still be negative) rather than the same one for all columns. Only then can Apache Atlas tell the columns apart; otherwise the Apache Atlas UI shows the same name for every column. To make this work, I provided a different random ID for each column. Please find attached the correct JSON file:
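A minimal Python sketch of the fix described above (not the attached file itself). It walks the `columns` list of a table entity and gives each column its own negative placeholder id. The function name `assign_unique_column_ids`, the `reserved_ids` parameter, and the trimmed-down sample entity are illustrative assumptions; only the field layout (`values` / `columns` / `id` / `id`) is taken from the JSON in this thread.

```python
def assign_unique_column_ids(table_entity, reserved_ids=()):
    """Give every column of a hive_table entity its own negative placeholder id.

    `reserved_ids` lists ids already used elsewhere in the payload
    (for example by the table, database, or storage descriptor).
    """
    used = {str(i) for i in reserved_ids}
    candidate = -1
    for column in table_entity["values"]["columns"]:
        # Skip any negative number that is already taken.
        while str(candidate) in used:
            candidate -= 1
        column["id"]["id"] = str(candidate)
        used.add(str(candidate))
    return table_entity


# Trimmed-down table entity (hypothetical sample): three columns that all
# share one id, which is the situation that produced the repeated column.
table = {
    "typeName": "hive_table",
    "values": {
        "columns": [
            {
                "typeName": "hive_column",
                "values": {"name": column_name},
                "id": {"typeName": "hive_column", "id": "-11893021824425522"},
            }
            for column_name in ("name", "id", "age")
        ]
    },
}

assign_unique_column_ids(table, reserved_ids=["-11893021824425524"])
column_ids = [c["id"]["id"] for c in table["values"]["columns"]]
print(column_ids)
```

Any scheme that yields distinct ids would do; a counter is used here only because it makes collisions impossible within one payload.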
01-03-2017
05:58 AM
Thank you Ayub,
it's working fine for me.
I have one query here. I have a table, table5, with the columns id, name, and age. After inserting the table5 metadata into Atlas, I see the same column (name) repeated, and I do not get metadata for the id and age columns. Please find the table5 JSON below and let me know if there is any mistake in it. The attached Atlas image (atlas-snapshot.png) shows the output I am getting.
[
{
"traits":{
},
"traitNames":[
],
"values":{
"ownerType":2,
"owner":"root",
"qualifiedName":"default@Sandbox",
"clusterName":"Sandbox",
"name":"default",
"description":"emr hive database",
"location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/\/warehouse",
"parameters":{
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_db",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_db",
"id":"-11893021824425525",
"state":"ACTIVE",
"version":0
}
},
{
"traits":{
},
"traitNames":[
],
"values":{
"owner":"root",
"temporary":false,
"lastAccessTime":"2017-01-03T11:02:53.000Z",
"qualifiedName":"default.table5@Sandbox",
"columns":[
{
"traits":{
},
"traitNames":[
],
"values":{
"owner":"root",
"qualifiedName":"default.table5.name@Sandbox",
"name":"name",
"type":"string",
"table":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_table",
"id":"-11893021824425524",
"state":"ACTIVE",
"version":0
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_column",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_column",
"id":"-11893021824425522",
"state":"ACTIVE",
"version":0
}
},
{
"traits":{
},
"traitNames":[
],
"values":{
"owner":"root",
"qualifiedName":"default.table5.id@Sandbox",
"name":"id",
"type":"int",
"table":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_table",
"id":"-11893021824425524",
"state":"ACTIVE",
"version":0
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_column",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_column",
"id":"-11893021824425522",
"state":"ACTIVE",
"version":0
}
},
{
"traits":{
},
"traitNames":[
],
"values":{
"owner":"root",
"qualifiedName":"default.table5.age@Sandbox",
"name":"age",
"type":"int",
"table":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_table",
"id":"-11893021824425524",
"state":"ACTIVE",
"version":0
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_column",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_column",
"id":"-11893021824425522",
"state":"ACTIVE",
"version":0
}
}
],
"tableType":"MANAGED_TABLE",
"sd":{
"traits":{
},
"traitNames":[
],
"values":{
"qualifiedName":"default.table5@Sandbox_storage",
"storedAsSubDirectories":false,
"location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/warehouse\/table5",
"compressed":false,
"inputFormat":"org.apache.hadoop.mapred.TextInputFormat",
"outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"parameters":{
},
"serdeInfo":{
"values":{
"serializationLib":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
"parameters":{
"serialization.format":"1"
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
"typeName":"hive_serde"
},
"table":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_table",
"id":"-11893021824425524",
"state":"ACTIVE",
"version":0
},
"numBuckets":-1
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_storagedesc",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_storagedesc",
"id":"-11893021824425523",
"state":"ACTIVE",
"version":0
}
},
"createTime":"2017-01-03T11:02:53.000Z",
"name":"table5",
"partitionKeys":[
],
"parameters":{
"totalSize":"0",
"rawDataSize":"0",
"numRows":"0",
"COLUMN_STATS_ACCURATE":"{\"BASIC_STATS\":\"true\"}",
"numFiles":"0",
"transient_lastDdlTime":"1482917693"
},
"db":{
"traits":{
},
"traitNames":[
],
"values":{
"ownerType":2,
"owner":"root",
"qualifiedName":"default@Sandbox",
"clusterName":"Sandbox",
"name":"default",
"description":"emr hive database",
"location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/\/warehouse",
"parameters":{
}
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_db",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_db",
"id":"-11893021824425525",
"state":"ACTIVE",
"version":0
}
},
"retention":0
},
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"typeName":"hive_table",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"typeName":"hive_table",
"id":"-11893021824425524",
"state":"ACTIVE",
"version":0
}
}
]
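A payload like the one above can be checked for clashing placeholder ids before it is posted. The sketch below is only an illustration: the helper names `collect_ids`, `conflicting_ids`, and `column` are hypothetical, and the rule it applies (two `_Reference` blocks with different `qualifiedName` values should not share an id) is an assumption about what Atlas expects, not something taken from its documentation.

```python
from collections import defaultdict

REFERENCE_CLASS = "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference"


def collect_ids(node, seen=None):
    """Map each id to the qualifiedNames of the _Reference blocks that use it."""
    if seen is None:
        seen = defaultdict(set)
    if isinstance(node, dict):
        if node.get("jsonClass") == REFERENCE_CLASS:
            name = node.get("values", {}).get("qualifiedName", "")
            seen[node["id"]["id"]].add(name)
        # Keep walking: entities can be nested inside "values".
        for child in node.values():
            collect_ids(child, seen)
    elif isinstance(node, list):
        for child in node:
            collect_ids(child, seen)
    return seen


def conflicting_ids(payload):
    """Return only the ids shared by entities with different qualifiedNames."""
    return {
        pid: sorted(names)
        for pid, names in collect_ids(payload).items()
        if len(names) > 1
    }


def column(qualified_name, placeholder_id):
    """Build a minimal hive_column block (hypothetical, heavily trimmed)."""
    return {
        "jsonClass": REFERENCE_CLASS,
        "typeName": "hive_column",
        "values": {"qualifiedName": qualified_name},
        "id": {"typeName": "hive_column", "id": placeholder_id},
    }


sample = [
    column("default.table5.name@Sandbox", "-11893021824425522"),
    column("default.table5.id@Sandbox", "-11893021824425522"),
]
conflicts = conflicting_ids(sample)
print(conflicts)
```

To run it against a real file, load the file with `json.load` and pass the result to `conflicting_ids`. On the JSON above it should flag `-11893021824425522`, which is used by the name, id, and age columns alike.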
12-29-2016
12:27 PM
Good to see you again, Ayub. Could you please post the changes you made in the above JSON? Did you change a GUID somewhere to link to the dataset, or something else?
12-29-2016
11:38 AM
Hi guys, I am able to create lineage (i.e. a hive_process) between two datasets in Apache Atlas. I referred to the link below to complete this task.

Link: https://community.hortonworks.com/questions/74875/how-to-create-hive-table-entity-in-apache-atlas-us.html#comment-75132

I can set lineage between table1 and table2 successfully, but my requirement now is as follows. Suppose I have already created a hive table using a hive query, and its metadata is also present in Atlas. I want to create lineage between this existing table and one that I am going to create using the REST API. What changes do I need to make in the JSON file that we use to create the hive_process? Which property in the JSON file is the one that links table1 and table2?
Labels: Apache Atlas
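One possible approach to the question above, stated here as an assumption rather than something confirmed in this thread: in this JSON format, an entity that already exists in Atlas is referenced by its real GUID inside an `_Id` block, while negative ids are placeholders for entities created in the same request. Under that assumption, the `inputs` and `outputs` lists of the hive_process are what link the tables. The helper names, the GUID strings, and the process name below are hypothetical, and the attribute set is trimmed; a real hive_process will likely need further attributes.

```python
ID_CLASS = "org.apache.atlas.typesystem.json.InstanceSerialization$_Id"
REFERENCE_CLASS = "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference"


def entity_id(type_name, guid):
    """Build an _Id block in the same shape used by the JSON in this thread."""
    return {
        "jsonClass": ID_CLASS,
        "typeName": type_name,
        "id": guid,
        "state": "ACTIVE",
        "version": 0,
    }


def build_process(name, qualified_name, input_guids, output_guids):
    """Sketch of a hive_process entity whose inputs/outputs point at tables."""
    return {
        "jsonClass": REFERENCE_CLASS,
        "typeName": "hive_process",
        # Negative id: the process itself is new in this request.
        "id": entity_id("hive_process", "-1"),
        "traitNames": [],
        "traits": {},
        "values": {
            "name": name,
            "qualifiedName": qualified_name,
            # Assumption: existing tables are referenced by their real GUIDs.
            "inputs": [entity_id("hive_table", g) for g in input_guids],
            "outputs": [entity_id("hive_table", g) for g in output_guids],
        },
    }


# Hypothetical GUIDs: replace with the values Atlas reports for each table.
process = build_process(
    name="existing_to_new",
    qualified_name="existing_to_new@Sandbox",
    input_guids=["EXISTING-TABLE-GUID"],
    output_guids=["NEW-TABLE-GUID"],
)
print(process["values"]["inputs"][0]["id"])
```

The GUID of the existing table would have to be looked up in Atlas first (for example from the entity's page in the UI); that lookup is not shown here.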
12-29-2016
11:36 AM
Hi Ayub, I am able to set lineage between table1 and table2 successfully, but my requirement now is as follows. Suppose I have already created a hive table using a hive query, and its metadata is also present in Atlas. I want to create lineage between this existing table and one that I am going to create using the REST API. What changes do I need to make in the JSON file that we use to create the hive_process? Which property in the JSON file is the one that links table1 and table2?
12-29-2016
11:16 AM
Hi Ayub, we have created two dataset entities and set the lineage between them. Now suppose I have already created a hive table (i.e. patient_raw_info) whose metadata is also present in Atlas, and I want to create lineage between this existing dataset and one that I am going to create using your REST API (i.e. patient_validated_dataset). My question is: how can I create a hive_process between the existing dataset and the new one? What changes do I need to make in the JSON file that we use to create the hive_process (i.e. the lineage)? I can create the third table (i.e. the hive table entity) using the same JSON file, and that is fine, but what about the JSON data for the lineage? How can I link them as patient_raw_info ---> patient_validated_dataset?
12-29-2016
10:35 AM
Hi Ayub, we have created two dataset entities and set the lineage between them. My requirement now is as follows. Suppose I have already created a hive table using a hive query (i.e. patient_info_raw), and its metadata is also present in the Atlas repository. I want to create lineage between this existing dataset and one that I will create using the POST API (i.e. patient_validated_info). What changes do I need to make in the lineage JSON file (i.e. in the 3rd step) so that I can see the lineage? I can create the third table (i.e. the hive table entity) using the same JSON file, and that is fine, but what about the JSON data for the lineage? How can I link them as patient_info_raw ---> patient_validated_info?
12-29-2016
05:54 AM
Hi Abdelkrim, I am now able to create hive table entities, and I have successfully linked those entities using the Atlas REST API. Please follow the steps from the link below: https://community.hortonworks.com/questions/74875/how-to-create-hive-table-entity-in-apache-atlas-us.html#comment-74964
12-29-2016
05:50 AM
Thank you Ayub, I checked your JSON on HDP 2.5 and it's working fine there.