Apache Atlas Tracking Lineage Not working as Expected

In Apache Atlas, I am trying to model the data flow of different processes. Some of these processes share common DataSets, but I don't want the processes I am modeling to appear connected to each other in the lineage.

For example, in this lineage model, I want to show an XML data source file going into a process whose output is transferred to another computer.

{
  "entity": {
    "typeName": "datasystem_datatransfer",
    "attributes": {
      "id": "b75af137-9279-4c73-be9f-0e37b686dde5",
      "qualifiedName": "b75af137-9279-4c73-be9f-0e37b686dde5@datasystem_datatransfer",
      "displayName": "Data Transfer Use Case 1",
      "inputs": [
        {
          "uniqueAttributes": {"qualifiedName": "25b60fe5-891c-4c94-87ab-b075d838ec30@datasystem_datasource"},
          "typeName": "datasystem_datasource"
        }
      ],
      "outputs": [
        {
          "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
          "typeName": "datasystem_computer"
        }
      ],
      "description": "Data transfer from Data Source to Computer.",
      "name": "dataEgressUseCase1"
    }
  }
}
This will create a model like this:

datasystem_datasource --> datasystem_datatransfer --> datasystem_computer

I now have another process I want to model. It uses the same "datasystem_computer" entity, but the process is a bit more complicated:

{
  "entities": [
    {
      "typeName": "datasystem_datatransfer",
      "attributes": {
        "id": "1305f6c4-f0da-4929-be21-dd0798dc2086",
        "qualifiedName": "1305f6c4-f0da-4929-be21-dd0798dc2086@datasystem_datatransfer",
        "displayName": "Data Transfer Use Case 2",
        "inputs": [
          {
            "uniqueAttributes": {"qualifiedName": "c72375fb-34a5-4a22-895c-0d55435fdf26@datasystem_datasource"},
            "typeName": "datasystem_datasource"
          }
        ],
        "outputs": [
          {
            "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
            "typeName": "datasystem_computer"
          }
        ],
        "description": "Data Transfer from Data Source to PC.",
        "name": "dataEgressUseCase2"
      }
    },
    {
      "typeName": "datasystem_datatransfer",
      "attributes": {
        "id": "307e6f84-41af-482e-8641-39fa258e709d",
        "qualifiedName": "307e6f84-41af-482e-8641-39fa258e709d@datasystem_datatransfer",
        "displayName": "Data Transfer Use Case 2.5",
        "inputs": [
          {
            "uniqueAttributes": {"qualifiedName": "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"},
            "typeName": "datasystem_computer"
          }
        ],
        "outputs": [
          {
            "uniqueAttributes": {"qualifiedName": "5acddaca-6eb8-48f9-be75-fc757e442985@datasystem_datasource"},
            "typeName": "datasystem_datasource"
          }
        ],
        "description": "Data Transfer from Data Source to PC to Another PC.",
        "name": "dataEgressUseCase2.5"
      }
    }
  ]
}
This should create a lineage diagram like:

datasystem_datasource --> datasystem_datatransfer --> datasystem_computer --> datasystem_datatransfer --> datasystem_datasource

The problem is that when I create this second lineage, it changes the first lineage I created. The processes have different IDs, so I am not sure why creating the second lineage would affect the first. I realize that they share the same datasystem_computer node, but they are different processes. What am I doing wrong?
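
To make the symptom concrete, here is a minimal Python sketch of the three processes as a plain directed graph. It is not Atlas code. It assumes that lineage is derived by following input and output edges through entities identified by their qualifiedName, so a shared datasystem_computer acts as a single node. The helper names (`build_edges`, `downstream`) and the constants are hypothetical and exist only for this illustration.

```python
from collections import defaultdict, deque

# qualifiedNames taken from the payloads above.
COMPUTER = "21781e1b-4b94-435b-be0a-141776267c4e@datasystem_computer"
SOURCE_1 = "25b60fe5-891c-4c94-87ab-b075d838ec30@datasystem_datasource"
SOURCE_2 = "c72375fb-34a5-4a22-895c-0d55435fdf26@datasystem_datasource"
SOURCE_3 = "5acddaca-6eb8-48f9-be75-fc757e442985@datasystem_datasource"

# Each process reduced to (name, inputs, outputs).
PROCESSES = [
    ("dataEgressUseCase1", [SOURCE_1], [COMPUTER]),
    ("dataEgressUseCase2", [SOURCE_2], [COMPUTER]),
    ("dataEgressUseCase2.5", [COMPUTER], [SOURCE_3]),
]


def build_edges(processes):
    """Return adjacency sets: input -> process and process -> output."""
    edges = defaultdict(set)
    for name, inputs, outputs in processes:
        for entity in inputs:
            edges[entity].add(name)
        for entity in outputs:
            edges[name].add(entity)
    return edges


def downstream(start, edges):
    """Breadth-first walk of everything reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Assumption: only use case 1 exists, as after the first request.
    print(sorted(downstream(SOURCE_1, build_edges(PROCESSES[:1]))))
    # Assumption: all three processes exist, as after the second request.
    print(sorted(downstream(SOURCE_1, build_edges(PROCESSES))))
```

Under these assumptions, the walk from the first data source stops at the computer while only use case 1 exists. Once use case 2.5 is added, the same walk continues through the shared computer node to the third data source, even though the process IDs differ. That matches the change I see in the first lineage.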
