Created 11-24-2015 03:30 PM
I'm looking into the possibility of performing an 'online' backup of HDFS metadata without having to take down HDFS or NameNodes and wanted to find out if the following plan is doable or not.
General assumptions:
The understanding of how the NameNodes maintain the namespace, in short, is:
The understanding is that both NNs write fsimages to disk in the following sequence:
The above means that:
Based on the above, a proposed, simple procedure that won't affect the availability of the NN is as follows:
Are there any issues or potential pitfalls with this approach that anyone can see?
Created 11-24-2015 06:43 PM
The proposal looks basically sound. Here are a few other factors to consider.
Created 01-27-2016 06:09 PM
You do realize that what you are trying to do is a poor man's HA? I understand that you might have business continuity requirements and cannot bring down the NameNode, but I just wanted to flag it.
Created 11-21-2016 09:42 PM
@Kent Baxley: We have incorporated this and related information into a new chapter in the HDFS Administration Guide, called Backing Up HDFS Metadata. See http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_hdfs-administration/content/back_up_hdfs_...
Created 04-24-2017 04:44 PM
Hi @Kent Baxley, Looks like the doc is missing the plan for backing up the VERSION file.
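For anyone who wants to cover the VERSION file themselves in the meantime, here is a minimal sketch in Python of one way to copy it along with the newest completed fsimage. It is not the procedure from the guide. It assumes the script runs on a NameNode host with read access to the `current` directory under `dfs.namenode.name.dir`, and that completed images follow the usual `fsimage_<txid>` naming with a matching `.md5` file. The function name `backup_namenode_metadata` and the paths in the usage note are hypothetical.

```python
import re
import shutil
import time
from pathlib import Path

# Completed checkpoints are named fsimage_<txid>; this pattern deliberately
# skips in-progress files such as fsimage.ckpt_<txid> and the .md5 sidecars.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def backup_namenode_metadata(current_dir, backup_root):
    """Copy VERSION, the newest fsimage and its .md5 into a timestamped folder.

    current_dir: the 'current' directory under dfs.namenode.name.dir (assumed).
    backup_root: destination directory for backups (hypothetical location).
    Returns (destination path, list of file names actually copied).
    """
    current = Path(current_dir)

    # Collect (txid, path) for every completed fsimage in the directory.
    images = []
    for entry in current.iterdir():
        match = FSIMAGE_RE.match(entry.name)
        if match:
            images.append((int(match.group(1)), entry))
    if not images:
        raise FileNotFoundError(f"no completed fsimage found in {current}")

    # The highest transaction ID is the most recent checkpoint.
    _, newest = max(images, key=lambda item: item[0])

    dest = Path(backup_root) / time.strftime("%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)

    wanted = [
        current / "VERSION",
        newest,
        newest.with_name(newest.name + ".md5"),
    ]
    copied = []
    for src in wanted:
        if src.exists():  # tolerate a missing .md5 or VERSION, just report it
            shutil.copy2(src, dest / src.name)
            copied.append(src.name)
    return dest, copied
```

Calling something like `backup_namenode_metadata("/hadoop/hdfs/namenode/current", "/backup/nn")` (both paths are placeholders) would return the backup folder and the names copied, so a wrapper can alert if `VERSION` is absent from that list.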