Apache Atlas Tracking Lineage Not working as Expected

In Apache Atlas, I am trying to model the data flow of different processes. The issue I am having is that some of these processes share common DataSets but I don't necessarily want the different processes I am modeling to appear to be connected to each other.

For example, in this lineage model, I want to show an XML data source file as the input to a process whose output is transferred to another computer.

{
  "entity": {
    "typeName": "datasystem_datatransfer",
    "attributes": {
      "id": "b75af137-9279-4c73-be9f-0e37b686dde5",
      "qualifiedName": "b75af137-9279-4c73-be9f-0e37b686dde5@datasystem_datatransfer",
      "displayName": "Data Transfer Use Case 1",
      "inputs": [
        {
          "uniqueAttributes": {"qualifiedName": "25b60fe5-891c-4c94-87ab-b075d838ec30@datasystem_datasource"},
          "typeName": "datasystem_datasource"
        }
      ],
      "outputs": [
        {
          "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
          "typeName": "datasystem_computer"
        }
      ],
      "description": "Data transfer from Data Source to Computer.",
      "name": "dataEgressUseCase1"
    }
  }
}
This will create a model like this:

 

datasystem_datasource --> datasystem_datatransfer --> datasystem_computer
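
For reference, a payload like the one above is typically submitted through the Atlas v2 REST API. The sketch below only builds the request and does not send it. The base URL `http://localhost:21000` is an assumption about the deployment, and `build_entity_request` is a hypothetical helper name, not part of any Atlas client library.

```python
import json
import urllib.request

# Assumed base URL of the Atlas server; adjust for your deployment.
ATLAS_BASE_URL = "http://localhost:21000"


def build_entity_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for an Atlas entity payload.

    A payload with a top-level "entities" key goes to /api/atlas/v2/entity/bulk;
    a payload with a top-level "entity" key goes to /api/atlas/v2/entity.
    """
    if "entities" in payload:
        path = "/api/atlas/v2/entity/bulk"
    elif "entity" in payload:
        path = "/api/atlas/v2/entity"
    else:
        raise ValueError("payload needs a top-level 'entity' or 'entities' key")
    return urllib.request.Request(
        ATLAS_BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical, stripped-down payload used only to show the endpoint choice.
request = build_entity_request(
    {"entity": {"typeName": "datasystem_datatransfer", "attributes": {}}}
)
print(request.get_method(), request.full_url)
```

Actually sending the request (for example with `urllib.request.urlopen`) also needs authentication for the Atlas server, which is left out here.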

 

I now have another process I want to model where I am using the same "datasystem_computer" but the process is a bit more complicated:

{
  "entities": [
    {
      "typeName": "datasystem_datatransfer",
      "attributes": {
        "id": "1305f6c4-f0da-4929-be21-dd0798dc2086",
        "qualifiedName": "1305f6c4-f0da-4929-be21-dd0798dc2086@datasystem_datatransfer",
        "displayName": "Data Transfer Use Case 2",
        "inputs": [
          {
            "uniqueAttributes": {"qualifiedName": "c72375fb-34a5-4a22-895c-0d55435fdf26@datasystem_datasource"},
            "typeName": "datasystem_datasource"
          }
        ],
        "outputs": [
          {
            "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
            "typeName": "datasystem_computer"
          }
        ],
        "description": "Data Transfer from Data Source to PC.",
        "name": "dataEgressUseCase2"
      }
    },
    {
      "typeName": "datasystem_datatransfer",
      "attributes": {
        "id": "307e6f84-41af-482e-8641-39fa258e709d",
        "qualifiedName": "307e6f84-41af-482e-8641-39fa258e709d@datasystem_datatransfer",
        "displayName": "Data Transfer Use Case 2.5",
        "inputs": [
          {
            "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
            "typeName": "datasystem_computer"
          }
        ],
        "outputs": [
          {
            "uniqueAttributes": {"qualifiedName": "5acddaca-6eb8-48f9-be75-fc757e442985@datasystem_datasource"},
            "typeName": "datasystem_datasource"
          }
        ],
        "description": "Data Transfer from Data Source to PC to Another PC.",
        "name": "dataEgressUseCase2.5"
      }
    }
  ]
}
This should create a lineage diagram like:

 

datasystem_datasource --> datasystem_datatransfer --> datasystem_computer --> datasystem_datatransfer --> datasystem_datasource
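
To compare what Atlas actually stores against the expected diagram, the lineage can be fetched over REST before and after the second submission. The sketch below only builds the two URLs involved: one to look up an entity by its `qualifiedName` (the response carries the Atlas-assigned GUID, which I assume is separate from the custom `id` attribute in the payloads), and one to request lineage for that GUID. The base URL and the helper names `entity_lookup_url` and `lineage_url` are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the Atlas server; adjust for your deployment.
ATLAS_BASE_URL = "http://localhost:21000"


def entity_lookup_url(type_name: str, qualified_name: str) -> str:
    """URL that fetches an entity by type name and qualifiedName."""
    query = urlencode({"attr:qualifiedName": qualified_name})
    return (
        f"{ATLAS_BASE_URL}/api/atlas/v2/entity/uniqueAttribute/type/"
        f"{quote(type_name)}?{query}"
    )


def lineage_url(guid: str, direction: str = "BOTH", depth: int = 3) -> str:
    """URL that fetches lineage for an entity GUID.

    direction is one of INPUT, OUTPUT, or BOTH.
    """
    query = urlencode({"direction": direction, "depth": depth})
    return f"{ATLAS_BASE_URL}/api/atlas/v2/lineage/{quote(guid)}?{query}"


print(entity_lookup_url(
    "datasystem_computer",
    "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer",
))
```

Fetching the lineage for the shared `datasystem_computer` entity, and separately for each data source, should show which processes Atlas considers connected to each starting point.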

 

The problem is that when I create this second lineage, it changes the first lineage I created. The processes have different IDs, so I am not sure why creating the second lineage would affect the first. I realize that both lineages share the same datasystem_computer as one node, but they are different processes. What am I doing wrong?
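
One possible explanation, sketched here as a toy model rather than Atlas's actual implementation: if lineage is computed by walking process inputs and outputs upstream and downstream from a starting entity, and the shared `datasystem_computer` is a single node identified by its `qualifiedName`, then every process attached to that node becomes reachable from the first lineage. The function names below are hypothetical, and the processes are represented only by their `name` values from the payloads.

```python
from collections import defaultdict, deque

COMPUTER = "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"
SOURCE_1 = "25b60fe5-891c-4c94-87ab-b075d838ec30@datasystem_datasource"
SOURCE_2 = "c72375fb-34a5-4a22-895c-0d55435fdf26@datasystem_datasource"
TARGET = "5acddaca-6eb8-48f9-be75-fc757e442985@datasystem_datasource"

# (process name, input qualifiedNames, output qualifiedNames)
USE_CASE_1 = ("dataEgressUseCase1", [SOURCE_1], [COMPUTER])
USE_CASE_2 = ("dataEgressUseCase2", [SOURCE_2], [COMPUTER])
USE_CASE_2_5 = ("dataEgressUseCase2.5", [COMPUTER], [TARGET])


def build_edges(processes):
    """Directed edges: input -> process -> output."""
    downstream = defaultdict(set)
    for name, inputs, outputs in processes:
        for entity in inputs:
            downstream[entity].add(name)
        for entity in outputs:
            downstream[name].add(entity)
    return downstream


def reachable(edges, start):
    """Breadth-first search over directed edges, excluding the start node."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage(processes, start):
    """Everything upstream or downstream of `start` (toy model)."""
    downstream = build_edges(processes)
    upstream = defaultdict(set)
    for src, targets in downstream.items():
        for dst in targets:
            upstream[dst].add(src)
    return reachable(upstream, start) | reachable(downstream, start)


before = lineage([USE_CASE_1], SOURCE_1)
after = lineage([USE_CASE_1, USE_CASE_2, USE_CASE_2_5], SOURCE_1)
print(sorted(after - before))  # nodes that joined the first lineage
```

In this model, the first data source's lineage grows after the second submission because the walk continues through the shared computer node into `dataEgressUseCase2.5`. Under that assumption, keeping the two flows visually separate would need distinct computer entities (different `qualifiedName` values) per flow; whether that fits the data model is a design decision.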
