Apache Atlas update DELETED entity to ACTIVE?

New Contributor

I am struggling to find a solution for this issue. My team is creating Spark/Presto plugins to create new relationship/lineage metadata in Atlas. After a while, the lineage becomes very cluttered with deleted entities, which remain in the UI. We therefore want to reuse entities. Here is an example:

  • User creates hive_table named fully.qualified.name
  • User updates table
  • User deletes table
  • User creates hive_table named fully.qualified.name

In the above scenario, Atlas is aware of two entities with the qualifiedName 'fully.qualified.name', one in DELETED status and one in ACTIVE status. I want the second creation to simply reuse the DELETED entity and update its status to ACTIVE, along with any other attribute updates. The audit would then show its full lifecycle, including deletes. We can find the deleted entity with a DSL search and run a POST with that entity's GUID, but nothing happens. I can't find any way to update properties as opposed to attributes. Is this possible? Are we going about this the wrong way? After finding no information on what seems to be a simple operation, I'm inclined to believe I'm doing something wrong. Thanks for any help!
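For readers trying to reproduce the scenario, here is a minimal sketch of how the duplicate entities could be located. It assumes the Atlas v2 basic search endpoint (POST /api/atlas/v2/search/basic) and its `excludeDeletedEntities` and `entityFilters` fields; the helper names and the sample response are hypothetical and only illustrate the shape of the data. No request is actually sent.

```python
import json


def build_search_by_qualified_name(type_name, qualified_name):
    """Build a request body for Atlas basic search (assumed v2 payload shape).

    excludeDeletedEntities=False asks Atlas to return DELETED entities too.
    """
    return {
        "typeName": type_name,
        "excludeDeletedEntities": False,
        "entityFilters": {
            "attributeName": "qualifiedName",
            "operator": "eq",
            "attributeValue": qualified_name,
        },
        "limit": 100,
    }


def group_by_status(search_result):
    """Group the GUIDs in a search result by entity status."""
    groups = {}
    for header in search_result.get("entities") or []:
        status = header.get("status", "UNKNOWN")
        groups.setdefault(status, []).append(header["guid"])
    return groups


# Hypothetical response for the scenario above: same qualifiedName, two entities.
sample_response = {
    "entities": [
        {"guid": "guid-old", "typeName": "hive_table", "status": "DELETED"},
        {"guid": "guid-new", "typeName": "hive_table", "status": "ACTIVE"},
    ]
}

body = build_search_by_qualified_name("hive_table", "fully.qualified.name")
print(json.dumps(body, indent=2))
print(group_by_status(sample_response))
```

The grouping makes it easy to see which GUID is the DELETED copy and which is the ACTIVE one before attempting any update.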

1 ACCEPTED SOLUTION

Re: Apache Atlas update DELETED entity to ACTIVE?

Expert Contributor
@Gray Pickney

Filtering in the lineage graph is not currently available. There have been many requests for this feature, and we will work towards including it in our next Atlas release.

8 REPLIES

Re: Apache Atlas update DELETED entity to ACTIVE?

Expert Contributor

There are a few issues with reusing the same entity after a delete:

1. If the previously deleted entity has tags (say, PII) associated with it, should the newly created table inherit those tags?

2. If the new table is created with more or fewer columns than the original, reusing the entity will not help.

If lineage clutter is the issue, we can work towards adding filtering in the lineage graph, such as excluding deleted entities from rendering.

Updating an entity's status from DELETED to ACTIVE does not work, because only entity attribute updates are honored; status, createTime, and updateTime are all treated as system attributes of the entity.
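Until filtering exists in the UI, one possible workaround is to filter a lineage response on the client side before rendering it elsewhere. The sketch below is not an Atlas feature; it assumes the v2 lineage response shape (a `guidEntityMap` of entity headers carrying a `status`, plus a `relations` list with `fromEntityId` and `toEntityId`). The function name and sample data are hypothetical.

```python
def drop_deleted_from_lineage(lineage):
    """Return a copy of a lineage response without DELETED entities.

    Relations that touch a removed entity are dropped as well.
    """
    entity_map = lineage.get("guidEntityMap", {})
    kept = {
        guid: header
        for guid, header in entity_map.items()
        if header.get("status") != "DELETED"
    }
    relations = [
        rel
        for rel in lineage.get("relations", [])
        if rel.get("fromEntityId") in kept and rel.get("toEntityId") in kept
    ]
    filtered = dict(lineage)  # shallow copy; the input is left unmodified
    filtered["guidEntityMap"] = kept
    filtered["relations"] = relations
    return filtered


# Hypothetical lineage: a process wrote to a table that was later dropped and recreated.
sample_lineage = {
    "baseEntityGuid": "src",
    "guidEntityMap": {
        "src": {"guid": "src", "typeName": "hive_table", "status": "ACTIVE"},
        "proc": {"guid": "proc", "typeName": "hive_process", "status": "ACTIVE"},
        "old": {"guid": "old", "typeName": "hive_table", "status": "DELETED"},
        "new": {"guid": "new", "typeName": "hive_table", "status": "ACTIVE"},
    },
    "relations": [
        {"fromEntityId": "src", "toEntityId": "proc"},
        {"fromEntityId": "proc", "toEntityId": "old"},
        {"fromEntityId": "proc", "toEntityId": "new"},
    ],
}

cleaned = drop_deleted_from_lineage(sample_lineage)
print(sorted(cleaned["guidEntityMap"]))
print(cleaned["relations"])
```

One caveat with this approach: if a DELETED entity sits in the middle of a path, removing it also removes the edges through it, so downstream nodes may appear disconnected.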

Re: Apache Atlas update DELETED entity to ACTIVE?

New Contributor

Is there a way to stop lineage from showing in the UI for the same table when it is refreshed daily?

Re: Apache Atlas update DELETED entity to ACTIVE?

Expert Contributor

Currently there is no way to disable lineage rendering in the UI for any table.

Re: Apache Atlas update DELETED entity to ACTIVE?

New Contributor

I see what you're saying about the reuse. Lineage clutter is definitely the main issue we're facing. Regardless, we've tried modifying the queries responsible for displaying the lineage graph to no avail. Are you saying that feature would need to be developed? Are there any features in place that could help with this?

Re: Apache Atlas update DELETED entity to ACTIVE?

Expert Contributor
@Gray Pickney

Filtering in the lineage graph is not currently available. There have been many requests for this feature, and we will work towards including it in our next Atlas release.

Re: Apache Atlas update DELETED entity to ACTIVE?

New Contributor

Thanks for the information. Do you have any suggestions as to an alternative for now?

Re: Apache Atlas update DELETED entity to ACTIVE?

New Contributor

Thanks Sarath.

Is there a sample API to fetch table metadata from Teradata/Oracle and show it in Atlas? Is Sqoop the only possibility for bringing in metadata from an RDBMS, or is there an alternative?

Re: Apache Atlas update DELETED entity to ACTIVE?

Expert Contributor

@Karthikeyan Arjunan, the Sqoop hook is the way to import RDBMS data into Atlas. Currently we don't have hooks specific to an RDBMS such as Oracle or Teradata.