Managing hard deleted records in Hadoop Datalake

Hello Guys,


I would like to know what pattern you follow in your Hadoop data lake when records are hard-deleted at the source.

The problem I am facing right now: the source system hard-deletes or archives some of its data.

If we ingest from that source and do not handle hard deletes, reports built on the data lake differ noticeably from reports built directly on the source. The business does not want this discrepancy.

Please advise on the best practice or approach for managing this in the data lake.



