I'm trying to map dependencies between a set of Datasets and a set of Processes in Apache Atlas.
The scenario is as follows:
Process p1
  input datasets: d1, d2, d3
  output datasets: d4, d5
Process p2
  input datasets: d1, d4
  output datasets: d6
Process p3
  input datasets: d6
  output datasets: none
When I look at the lineage of dataset d6, I can clearly see that it is generated by process p2, but I cannot see that it is also an input of process p3.
This seems to be related to the fact that process p3 does not have any output dataset.
However, I would like to be able to see p3 in the lineage of d6, because a relationship between d6 and p3 does exist (d6 is an input of p3).
As I understand it, a process is shown in lineage ONLY if it has at least one input AND at least one output.
IMHO, this condition is too strict. A process should be shown in a lineage graph whenever it has at least one input OR at least one output.
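To make the difference between the two rules concrete, here is a small Python sketch. It is only a toy model of the behaviour I observe, not Atlas code: the names Process, visible_strict, visible_relaxed and neighbours are made up for illustration, and I label the three processes p1, p2 and p3 in the order they are listed above.

```python
from collections import namedtuple

# Toy model of the scenario; this is NOT how Atlas stores or computes lineage.
Process = namedtuple("Process", ["name", "inputs", "outputs"])

PROCESSES = [
    Process("p1", {"d1", "d2", "d3"}, {"d4", "d5"}),
    Process("p2", {"d1", "d4"}, {"d6"}),
    Process("p3", {"d6"}, set()),  # consumes d6, produces nothing
]


def visible_strict(process):
    """Rule I believe is applied: needs an input AND an output."""
    return bool(process.inputs) and bool(process.outputs)


def visible_relaxed(process):
    """Rule I would prefer: needs an input OR an output."""
    return bool(process.inputs) or bool(process.outputs)


def neighbours(dataset, processes, is_visible):
    """Names of visible processes that read or write the given dataset."""
    return sorted(
        p.name
        for p in processes
        if is_visible(p) and (dataset in p.inputs or dataset in p.outputs)
    )


print(neighbours("d6", PROCESSES, visible_strict))   # ['p2']
print(neighbours("d6", PROCESSES, visible_relaxed))  # ['p2', 'p3']
```

Under the strict rule p3 drops out of the d6 lineage, which matches what I see in the UI; under the relaxed rule it would appear next to p2.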
Does anyone know if this is a configurable option in Apache Atlas 1.1.0 or if there's a workaround?
Thank you very much.