HDFS Data Recovery
Labels: Apache Hadoop
Created 11-18-2016 01:23 PM
What are the best recovery options if a product like Ab Initio runs an m_rm command that deletes HDFS data in one of our environments? These low-level deletions bypass the hdfs dfs -rm command, which would otherwise move the deleted data into the trash folder for recovery. The trash interval is configured for 21 days in our Hortonworks environment.
In this case the data had to be recreated from the source files, but if this were production, what would the best recovery options be?
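For reference, the trash behavior described in the question can be sketched as follows (the 21-day interval comes from the question; the directory name is a hypothetical example):

```shell
# A normal delete goes through the trash and stays recoverable for
# fs.trash.interval (21 days here). The data is moved, not destroyed:
hdfs dfs -rm -r /data/example_dir
# -> lands under /user/<user>/.Trash/Current/data/example_dir

# Recovery is just a move back out of the trash:
hdfs dfs -mv /user/$USER/.Trash/Current/data/example_dir /data/example_dir

# A delete with -skipTrash (or a raw delete outside the Hadoop CLI,
# such as m_rm) bypasses the trash entirely, which is the situation
# described here:
hdfs dfs -rm -r -skipTrash /data/example_dir
```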
Created 11-18-2016 01:42 PM
For production, we can use the protected-directories feature to prevent accidental deletion of critical data from HDFS.
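Assuming this refers to the fs.protected.directories property (available in newer Hadoop releases), directories listed there cannot be deleted or renamed while non-empty, even by a superuser. A minimal sketch, with hypothetical paths:

```xml
<!-- core-site.xml: comma-separated directories that cannot be
     deleted while they still contain data. Example paths only. -->
<property>
  <name>fs.protected.directories</name>
  <value>/data/prod,/apps/warehouse</value>
</property>
```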
Created 11-18-2016 02:19 PM
What about Ranger? Can it provide protection at this level? And assuming the data does get removed, are there any recovery options?
Created 11-18-2016 02:41 PM
Ranger lets you authorize access centrally, so you can allow or disallow individual users from creating or deleting directories, but it does not provide recovery options.
Please check http://hortonworks.com/apache/ranger/
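As a complement to the above: HDFS snapshots are one commonly used recovery option for this scenario, since a snapshot is a read-only point-in-time copy of a directory that survives deletes, including deletes that bypass the trash. A minimal sketch, with a hypothetical directory and snapshot name:

```shell
# Enable snapshots on the directory (admin command):
hdfs dfsadmin -allowSnapshot /data/prod

# Take a snapshot, e.g. before risky jobs or on a schedule:
hdfs dfs -createSnapshot /data/prod before_etl

# After an accidental delete, copy the data back out of the snapshot:
hdfs dfs -cp /data/prod/.snapshot/before_etl/table1 /data/prod/table1
```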
