HDFS Data Recovery

Rising Star

What are the best recovery options if a product such as Ab Initio runs an m_rm command that deletes HDFS data in one of the environments?

This type of low-level execution bypasses the Hadoop dfs -rm command, which moves deleted data to the trash folder for recovery.

The trash interval is configured for 21 days in the Hortonworks environment.
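
For reference, the trash retention is set through fs.trash.interval, which is expressed in minutes rather than days. The short Python sketch below converts the 21-day retention mentioned above and shows where a file removed through the normal trash-aware path would land, assuming the default per-user trash layout. The user name "etl" and the file path are hypothetical, chosen only for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def days_to_trash_interval(days: int) -> int:
    """Convert a retention period in days to an fs.trash.interval value (minutes)."""
    if days < 0:
        raise ValueError("retention must be non-negative")
    return days * MINUTES_PER_DAY


def trash_location(user: str, deleted_path: str) -> str:
    """Where a trash-aware delete moves an absolute path, assuming the
    default per-user layout /user/<user>/.Trash/Current/<original path>."""
    if not deleted_path.startswith("/"):
        raise ValueError("deleted_path must be absolute")
    return f"/user/{user}/.Trash/Current{deleted_path}"


# Hypothetical user and path, for illustration only.
print(days_to_trash_interval(21))
print(trash_location("etl", "/data/prod/orders/part-00000"))
```

A delete that skips the trash never creates the second path, which is why the retention setting does not help in the scenario described here.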

The data had to be recreated from the source files, but if this had been production, what would the best recovery options be?

Accepted Solutions

Re: HDFS Data Recovery

Super Guru
@Kirk Haslbeck

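For production, we can use the protected directories feature to guard against accidental deletion of data from HDFS. Directories listed in fs.protected.directories cannot be deleted, even by the superuser, unless they are empty.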

https://issues.apache.org/jira/browse/HDFS-8983
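
As a rough illustration of the protected directories setting, the Python sketch below builds the property entry that would typically go into core-site.xml. The property name fs.protected.directories comes from HDFS-8983; the directory paths are hypothetical placeholders, and the helper function is just a convenience for this example, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def protected_dirs_property(directories):
    """Build a <property> element listing directories to protect.

    The value is a comma-separated list, as the setting expects.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "fs.protected.directories"
    ET.SubElement(prop, "value").text = ",".join(directories)
    return ET.tostring(prop, encoding="unicode")


# Hypothetical production paths, for illustration only.
print(protected_dirs_property(["/data/prod", "/apps/hive/warehouse"]))
```

Note that this guards the listed directories themselves while they are non-empty; it is a preventive measure and not a way to recover data that has already been removed.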

Re: HDFS Data Recovery

Rising Star

What about Ranger? Can it provide protection at this level? And assuming data does get removed, are there any recovery options?

Re: HDFS Data Recovery

@Kirk Haslbeck

Ranger lets you manage authorization centrally, so you can allow or deny individual users permission to create or delete directories, but it does not provide recovery options.

Please see http://hortonworks.com/apache/ranger/
