What is the best approach/tool for point in time data recovery on the HDP 2.4 platform? Does WANdisco support point in time recovery?
Labels: Hortonworks Data Platform (HDP)
Created ‎09-30-2016 12:04 AM
Created ‎09-30-2016 05:13 PM
Snapshots do not create extra copies of blocks on the file system. Snapshots are stored along with the NameNode’s file system namespace. What do you mean by "huge size of snapshots and restoring the backups"? The entire point of a snapshot is to avoid creating extra copies of blocks on the file system while still letting you restore a specific file, or everything, to a point in time.
a) There are always many ways to skin a cat, but what test did you run with HDFS snapshots, and how did it fail you? Could you elaborate a little? That would help.
b) "Point in time recovery" is a question for WANdisco. We endorse HDFS snapshots first for this function; WANdisco or another tool is your option.
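To make the snapshot workflow concrete, here is a minimal sketch that only builds the standard `hdfs` CLI invocations for enabling, taking, comparing, and restoring from a snapshot. It does not run anything against a cluster. The directory `/data/warehouse`, the snapshot names, and the file `sales/part-00000` are hypothetical placeholders, and the helper function names are mine, not part of any Hadoop API.

```python
import shlex


def allow_snapshot_cmd(path):
    # An administrator must mark a directory as snapshottable first.
    return ["hdfs", "dfsadmin", "-allowSnapshot", path]


def create_snapshot_cmd(path, name):
    # Records the namespace state of `path`; no data blocks are copied.
    return ["hdfs", "dfs", "-createSnapshot", path, name]


def snapshot_diff_cmd(path, from_name, to_name):
    # Lists what changed between two snapshots of the same directory.
    return ["hdfs", "snapshotDiff", path, from_name, to_name]


def restore_file_cmd(path, name, relative_file):
    # Snapshots are exposed read-only under <path>/.snapshot/<name>/,
    # so restoring a file is a copy back into the live directory.
    base = path.rstrip("/")
    src = f"{base}/.snapshot/{name}/{relative_file}"
    dst = f"{base}/{relative_file}"
    return ["hdfs", "dfs", "-cp", "-f", src, dst]


# Hypothetical directory, snapshot names, and file, for illustration only.
commands = [
    allow_snapshot_cmd("/data/warehouse"),
    create_snapshot_cmd("/data/warehouse", "pit-2016-09-30"),
    snapshot_diff_cmd("/data/warehouse", "pit-2016-09-29", "pit-2016-09-30"),
    restore_file_cmd("/data/warehouse", "pit-2016-09-30", "sales/part-00000"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

The printed lines can be reviewed and then run by hand on an edge node. Restoring a whole directory follows the same pattern, copying from the `.snapshot/<name>` path instead of a single file.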
Created ‎09-30-2016 04:50 AM
Created ‎09-30-2016 10:32 AM
@mqureshi Thank you for your response. Yes, HDFS snapshots are one of the options for point in time recovery. However, it seems there are many implications of using the .snapshot directory, particularly the huge size of the snapshots and the complexity of restoring the backups.
My question is twofold:
a) Is HDFS snapshot the only approach to point in time recovery, or are there other approaches?
b) Does WANdisco Fusion (a DR product endorsed by Hortonworks) provide point in time recovery?
Many thanks.
Created ‎10-05-2016 08:40 PM
A single-tool solution is desirable, but it also comes with a price tag. Look at the link above. You can use a combination of HDFS snapshots and your standard database point in time recovery methods for the database used for the metadata. You can leverage that practice and avoid extra cost for something that is not really Hadoop specific.
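As a rough illustration of combining the two, the sketch below assembles one labeled backup pass: an HDFS snapshot, a dump of the metadata database, and an archive of the configuration directory. It only builds the command lines and does not execute them. It assumes the metastore lives in MySQL (your database may differ), and the database name `hive`, the paths `/data/warehouse`, `/etc/hadoop/conf`, and `/backup`, and the function `backup_plan` are all hypothetical. A dump is only a baseline; true point in time recovery of the database would still rely on that database's own log-based mechanism.

```python
import shlex
from datetime import datetime


def backup_plan(hdfs_path, metastore_db, conf_dir, backup_dir, when):
    """Build the commands for one backup pass that shares a single label."""
    label = when.strftime("pit-%Y%m%d-%H%M%S")
    return [
        # 1. Namespace-level snapshot of the data directory (no block copies).
        ["hdfs", "dfs", "-createSnapshot", hdfs_path, label],
        # 2. Consistent dump of the metastore database (MySQL is an assumption;
        #    connection credentials are deliberately left out of this sketch).
        ["mysqldump", "--single-transaction",
         f"--result-file={backup_dir}/{label}-metastore.sql", metastore_db],
        # 3. Archive of the cluster configuration files.
        ["tar", "-czf", f"{backup_dir}/{label}-conf.tar.gz", conf_dir],
    ]


# Hypothetical values, for illustration only.
plan = backup_plan(
    hdfs_path="/data/warehouse",
    metastore_db="hive",
    conf_dir="/etc/hadoop/conf",
    backup_dir="/backup",
    when=datetime(2016, 9, 30, 17, 13, 0),
)
for cmd in plan:
    print(shlex.join(cmd))
```

Using one label for all three artifacts makes it easier to match a snapshot with the metastore dump and configuration archive taken at roughly the same moment.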
If any response in this thread helped, please vote for or accept the best answer.
Created ‎10-03-2016 08:36 AM
We have the option of using either HDFS snapshots or the WANdisco tool to design point in time recovery for the cluster. However, we want to go with an approach/tool that also covers backup of the Hadoop metastore and configuration files, in addition to backing up blocks on the data nodes.
We look forward to your expert advice on this.
Thanks.
