Issue about Atlas lineage?
Labels: Apache Atlas
Created 12-29-2016 11:38 AM
Hi guys,
I am able to create lineage (i.e. a hive_process) between two datasets in Apache Atlas. I referred to the link below to complete this task.
Link:
I can set lineage between table1 and table2 successfully, but now my requirement is as follows.
Consider that I have already created a Hive table using a Hive query, and its metadata is also present in Atlas. I want to create lineage between this existing table and one that I am going to create using the REST API. To do this:
What changes do I need to make in the JSON file that we use to create the hive_process?
Which property in the JSON file is the one that links table1 and table2?
Created 12-29-2016 11:46 AM
As an extension to what was answered here, just create another table named table3 and submit the JSON below using the /api/atlas/entities REST API.
[{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425513", "version":0, "typeName":"hive_process", "state":"ACTIVE" }, "typeName":"hive_process", "values":{ "queryId":"hive_20161228094619_81b13647-4f7f-4f1b-9c08-0f64eb8dbb34", "name":"create table table3 as select * from table2", "startTime":"2016-12-28T09:46:19.003Z", "queryPlan":{ }, "operationType":"CREATETABLE_AS_SELECT", "outputs":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425516", "version":0, "typeName":"hive_table", "state":"ACTIVE" }, "typeName":"hive_table", "values":{ "tableType":"MANAGED_TABLE", "name":"table3", "createTime":"2016-12-28T09:46:30.000Z", "temporary":false, "db":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425517", "version":0, "typeName":"hive_db", "state":"ACTIVE" }, "typeName":"hive_db", "values":{ "name":"default", "location":"hdfs://mycluster/apps/hive/warehouse", "description":"Default Hive database", "ownerType":2, "qualifiedName":"default@cl1", "owner":"public", "clusterName":"cl1", "parameters":{ } }, "traitNames":[ ], "traits":{ } }, "retention":0, "qualifiedName":"default.table3@cl1", "columns":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425514", "version":0, "typeName":"hive_column", "state":"ACTIVE" }, "typeName":"hive_column", "values":{ "name":"abc", "qualifiedName":"default.table3.abc@cl1", "owner":"hive", "type":"string", "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425516", 
"version":0, "typeName":"hive_table", "state":"ACTIVE" } }, "traitNames":[ ], "traits":{ } } ], "lastAccessTime":"2016-12-28T09:46:30.000Z", "owner":"hive", "sd":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425515", "version":0, "typeName":"hive_storagedesc", "state":"ACTIVE" }, "typeName":"hive_storagedesc", "values":{ "location":"hdfs://mycluster/apps/hive/warehouse/table3", "serdeInfo":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct", "typeName":"hive_serde", "values":{ "serializationLib":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "parameters":{ "serialization.format":"1" } } }, "qualifiedName":"default.table3@cl1_storage", "outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "compressed":false, "numBuckets":-1, "inputFormat":"org.apache.hadoop.mapred.TextInputFormat", "parameters":{ }, "storedAsSubDirectories":false, "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425516", "version":0, "typeName":"hive_table", "state":"ACTIVE" } }, "traitNames":[ ], "traits":{ } }, "parameters":{ "rawDataSize":"0", "numFiles":"0", "transient_lastDdlTime":"1482918390", "totalSize":"0", "COLUMN_STATS_ACCURATE":"{\"BASIC_STATS\":\"true\"}", "numRows":"0" }, "partitionKeys":[ ] }, "traitNames":[ ], "traits":{ } } ], "endTime":"2016-12-28T09:46:31.211Z", "recentQueries":[ "create table table3 as select * from table2" ], "inputs":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425520", "version":0, "typeName":"hive_table", "state":"ACTIVE" }, "typeName":"hive_table", "values":{ "tableType":"MANAGED_TABLE", "name":"table2", "createTime":"2016-12-28T09:34:53.000Z", "temporary":false, "db":{ 
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425521", "version":0, "typeName":"hive_db", "state":"ACTIVE" }, "typeName":"hive_db", "values":{ "name":"default", "location":"hdfs://mycluster/apps/hive/warehouse", "description":"Default Hive database", "ownerType":2, "qualifiedName":"default@cl1", "owner":"public", "clusterName":"cl1", "parameters":{ } }, "traitNames":[ ], "traits":{ } }, "retention":0, "qualifiedName":"default.table2@cl1", "columns":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425518", "version":0, "typeName":"hive_column", "state":"ACTIVE" }, "typeName":"hive_column", "values":{ "name":"abc", "qualifiedName":"default.table2.abc@cl1", "owner":"hive", "type":"string", "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425520", "version":0, "typeName":"hive_table", "state":"ACTIVE" } }, "traitNames":[ ], "traits":{ } } ], "lastAccessTime":"2016-12-28T09:34:53.000Z", "owner":"hive", "sd":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425519", "version":0, "typeName":"hive_storagedesc", "state":"ACTIVE" }, "typeName":"hive_storagedesc", "values":{ "location":"hdfs://mycluster/apps/hive/warehouse/table2", "serdeInfo":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct", "typeName":"hive_serde", "values":{ "serializationLib":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "parameters":{ "serialization.format":"1" } } }, "qualifiedName":"default.table2@cl1_storage", "outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "compressed":false, 
"numBuckets":-1, "inputFormat":"org.apache.hadoop.mapred.TextInputFormat", "parameters":{ }, "storedAsSubDirectories":false, "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-11893021824425520", "version":0, "typeName":"hive_table", "state":"ACTIVE" } }, "traitNames":[ ], "traits":{ } }, "parameters":{ "rawDataSize":"0", "numFiles":"0", "transient_lastDdlTime":"1482917693", "totalSize":"0", "COLUMN_STATS_ACCURATE":"{\"BASIC_STATS\":\"true\"}", "numRows":"0" }, "partitionKeys":[ ] }, "traitNames":[ ], "traits":{ } } ], "qualifiedName":"default.table3@cl1:1482918390000", "queryText":"create table table3 as select * from table2", "clusterName":"cl1", "userName":"hive" }, "traitNames":[ ], "traits":{ } }]
You have to change multiple properties. Basically, there is an input JSON block ("inputs") that describes the input entity (a Hive table, say table2) and an output JSON block ("outputs") that describes the output entity (a Hive table, say table3); these act as the input and output of the process, respectively. Hope this helps.
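To see which tables a given hive_process payload connects, the "inputs" and "outputs" blocks can be read back out of the JSON. Here is a minimal sketch in Python, assuming the payload has the same shape as the JSON in this answer (a list of entities whose "values" hold "inputs" and "outputs"). The function name `summarize_lineage` and the trimmed sample payload are illustrative only and are not part of Atlas.

```python
import json


def summarize_lineage(payload):
    """Map each hive_process qualifiedName to its input/output table names."""
    summary = {}
    for entity in payload:
        # Only process entities carry the inputs/outputs lineage blocks.
        if entity.get("typeName") != "hive_process":
            continue
        values = entity["values"]
        summary[values["qualifiedName"]] = {
            "inputs": [t["values"]["qualifiedName"]
                       for t in values.get("inputs", [])],
            "outputs": [t["values"]["qualifiedName"]
                        for t in values.get("outputs", [])],
        }
    return summary


# Trimmed stand-in for the full payload (assumption: most attributes omitted).
sample = json.loads("""
[{"typeName": "hive_process",
  "values": {
    "qualifiedName": "default.table3@cl1:1482918390000",
    "inputs":  [{"typeName": "hive_table",
                 "values": {"qualifiedName": "default.table2@cl1"}}],
    "outputs": [{"typeName": "hive_table",
                 "values": {"qualifiedName": "default.table3@cl1"}}]
  }}]
""")

print(summarize_lineage(sample))
```

Changing which tables the process links then comes down to changing the entries in these two lists.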
Created 12-29-2016 12:27 PM
Thanks, good to see you again Ayub,
Could you please post what changes you made in the above JSON?
Did you change a GUID somewhere to link to the dataset, or something else?
Created 12-29-2016 12:33 PM
Nope, the GUIDs here are just large negative numbers. Entities (Hive tables, processes) are identified by their qualified name, and when the JSON is saved to the backend datastore, it is stored with the actual GUIDs of the entities (Hive tables and Hive process). I am attaching diff.txt, a diff of the two process JSONs, which should give you the list of changes. Let me know if you have any queries.
Created 01-03-2017 05:58 AM
Thank you Ayub,
It's working fine for me.
I have one query here.
I have a table, table5, with columns id, name and age. After inserting the table5 metadata into Atlas, I see only the name column, repeated; the metadata for the id and age columns is missing.
Please find the table5 JSON below and let me know if there is any mistake in it.
Please also find attached the Atlas image showing the output I am getting.
[ { "traits":{ }, "traitNames":[ ], "values":{ "ownerType":2, "owner":"root", "qualifiedName":"default@Sandbox", "clusterName":"Sandbox", "name":"default", "description":"emr hive database", "location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/\/warehouse", "parameters":{ } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_db", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_db", "id":"-11893021824425525", "state":"ACTIVE", "version":0 } }, { "traits":{ }, "traitNames":[ ], "values":{ "owner":"root", "temporary":false, "lastAccessTime":"2017-01-03T11:02:53.000Z", "qualifiedName":"default.table5@Sandbox", "columns":[ { "traits":{ }, "traitNames":[ ], "values":{ "owner":"root", "qualifiedName":"default.table5.name@Sandbox", "name":"name", "type":"string", "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_table", "id":"-11893021824425524", "state":"ACTIVE", "version":0 } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_column", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_column", "id":"-11893021824425522", "state":"ACTIVE", "version":0 } }, { "traits":{ }, "traitNames":[ ], "values":{ "owner":"root", "qualifiedName":"default.table5.id@Sandbox", "name":"id", "type":"int", "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_table", "id":"-11893021824425524", "state":"ACTIVE", "version":0 } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_column", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_column", "id":"-11893021824425522", "state":"ACTIVE", "version":0 } }, { "traits":{ }, "traitNames":[ ], "values":{ "owner":"root", 
"qualifiedName":"default.table5.age@Sandbox", "name":"age", "type":"int", "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_table", "id":"-11893021824425524", "state":"ACTIVE", "version":0 } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_column", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_column", "id":"-11893021824425522", "state":"ACTIVE", "version":0 } } ], "tableType":"MANAGED_TABLE", "sd":{ "traits":{ }, "traitNames":[ ], "values":{ "qualifiedName":"default.table5@Sandbox_storage", "storedAsSubDirectories":false, "location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/warehouse\/table5", "compressed":false, "inputFormat":"org.apache.hadoop.mapred.TextInputFormat", "outputFormat":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "parameters":{ }, "serdeInfo":{ "values":{ "serializationLib":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "parameters":{ "serialization.format":"1" } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct", "typeName":"hive_serde" }, "table":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_table", "id":"-11893021824425524", "state":"ACTIVE", "version":0 }, "numBuckets":-1 }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_storagedesc", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_storagedesc", "id":"-11893021824425523", "state":"ACTIVE", "version":0 } }, "createTime":"2017-01-03T11:02:53.000Z", "name":"table5", "partitionKeys":[ ], "parameters":{ "totalSize":"0", "rawDataSize":"0", "numRows":"0", "COLUMN_STATS_ACCURATE":"{\"BASIC_STATS\":\"true\"}", "numFiles":"0", "transient_lastDdlTime":"1482917693" }, "db":{ "traits":{ }, "traitNames":[ ], "values":{ "ownerType":2, "owner":"root", 
"qualifiedName":"default@Sandbox", "clusterName":"Sandbox", "name":"default", "description":"emr hive database", "location":"hdfs:\/\/sandbox.hortonworks.com:8020\/apps\/hive\/\/warehouse", "parameters":{ } }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_db", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_db", "id":"-11893021824425525", "state":"ACTIVE", "version":0 } }, "retention":0 }, "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "typeName":"hive_table", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "typeName":"hive_table", "id":"-11893021824425524", "state":"ACTIVE", "version":0 } } ]
Created 01-03-2017 07:01 AM
Hi Ayub,
The above issue is solved. There was a mistake in the JSON.
Whenever a table has multiple columns, each column must be given a different random GUID number (not the same one for all columns), even though it is only a negative number. Only then is Apache Atlas able to differentiate the columns; otherwise the same name is shown for all columns in the Apache Atlas UI.
To make this work, I provided a different random ID for each column.
Please find attached the correct JSON file:
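A quick way to catch this mistake before submitting is to scan the payload for a placeholder id that is shared by entities with different qualified names. The following is a minimal sketch in Python, assuming the `_Reference` / `id` / `values` layout used by the JSON in this thread. The helper names `find_shared_ids` and `conflicts`, the trimmed sample columns, and the replacement id are all illustrative assumptions.

```python
from collections import defaultdict

REFERENCE = "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference"


def find_shared_ids(node, seen=None):
    """Collect, per placeholder id, the qualifiedNames of entities using it."""
    if seen is None:
        seen = defaultdict(set)
    if isinstance(node, dict):
        ref_id = node.get("id")
        if node.get("jsonClass") == REFERENCE and isinstance(ref_id, dict):
            name = node.get("values", {}).get("qualifiedName", "")
            seen[ref_id.get("id")].add(name)
        # Recurse into nested entities (columns, sd, db, ...).
        for child in node.values():
            find_shared_ids(child, seen)
    elif isinstance(node, list):
        for child in node:
            find_shared_ids(child, seen)
    return seen


def conflicts(payload):
    """Return only the ids that are reused by more than one qualifiedName."""
    return {pid: sorted(names)
            for pid, names in find_shared_ids(payload).items()
            if len(names) > 1}


# Trimmed stand-in for two columns that reuse one id (most attributes omitted).
columns = [
    {"jsonClass": REFERENCE, "typeName": "hive_column",
     "id": {"id": "-11893021824425522"},
     "values": {"qualifiedName": "default.table5.name@Sandbox"}},
    {"jsonClass": REFERENCE, "typeName": "hive_column",
     "id": {"id": "-11893021824425522"},
     "values": {"qualifiedName": "default.table5.id@Sandbox"}},
]
print(conflicts(columns))   # the shared id is reported

# Hypothetical fix: give the second column its own placeholder id.
columns[1]["id"]["id"] = "-11893021824425526"
print(conflicts(columns))   # no conflicts remain
```

An entity that legitimately appears twice with the same id and the same qualified name (such as a hive_db repeated at the top level and inside a table) is not reported, because only differing names count as a conflict.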
Created 01-03-2017 07:30 AM
Great! I was about to share the same info. Thanks for sharing the details.
Created 01-03-2017 10:53 AM
Hi Ayub,
We have two tables, table4 and table5, each with columns Id, Name and Age.
We inserted the entity metadata and lineage metadata of both tables into Atlas and were able to see the schema and lineage graph in Atlas.
After that, I deleted the entity metadata of table5 and reinserted it.
Next, I inserted the same lineage metadata (the earlier lineage JSON) for both tables into Atlas; however, I am not able to see the lineage graph of the two tables. I get the response below from the Atlas server.
{"requestId":"qtp662559856-30620 - 26dbb640-9629-4c29-b209-32331e52962e","entities":{}}
Please find the lineage JSON metadata below and let me know what mistake I have made.
After deleting the Hive table entity and reinserting the same metadata (i.e. the Hive table entity JSON), we are unable to see the lineage in Atlas. What could be the reason for this?
What mistake am I making in the lineage JSON the second time that prevents the lineage from appearing?
Do I need to change the value of the process id or process name in the JSON?
Created 01-03-2017 11:59 AM
Could you please post this as a separate question? It might help other community members as well.
Created 01-03-2017 12:05 PM
Hi Ayub,
Here is the link for the same question as above: