Hdfs metadata backup strategy

New Contributor
Hi,
We are running an active NameNode and a standby NameNode. Could you please explain the best practice for an HDFS metadata backup strategy that stays consistent and avoids both service loss and data loss? Can we take a backup while the services are running, and if so, could you please summarize how?
3 REPLIES

Champion

You could write a Python or shell script to back up your fsimage on a regular basis and put it in crontab to automate it.

Yes, you can take a copy of your fsimage without interrupting the service.
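
As a rough illustration of such a script, here is a minimal sketch built around hdfs dfsadmin -fetchImage. The backup location (BACKUP_ROOT), the retention period (RETENTION_DAYS), the function names, and the "backup" argument are all assumptions made for this example and are not taken from any Cloudera tooling, so adjust them to your environment.

```shell
#!/bin/sh
# Sketch only: copy the latest fsimage into a dated directory, then prune old copies.
# BACKUP_ROOT and RETENTION_DAYS are hypothetical defaults, not Hadoop settings.
set -eu

BACKUP_ROOT="${BACKUP_ROOT:-/var/backups/hdfs-fsimage}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"

backup_fsimage() {
    # One directory per run, named after the current timestamp
    target="${BACKUP_ROOT}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$target"

    # Download the most recent fsimage from the NameNode into $target
    hdfs dfsadmin -fetchImage "$target"

    # Treat the run as failed if no fsimage_* file was written
    if ! ls "$target"/fsimage_* >/dev/null 2>&1; then
        echo "no fsimage found in $target" >&2
        return 1
    fi
    echo "$target"
}

prune_old_backups() {
    # Delete backup directories last modified more than RETENTION_DAYS days ago
    find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d \
        -mtime +"$RETENTION_DAYS" -exec rm -rf {} +
}

# Only act when called with the "backup" argument (a convention of this sketch)
if [ "${1:-}" = "backup" ]; then
    backup_fsimage
    prune_old_backups
fi
```

If this were saved as, say, /usr/local/bin/fsimage_backup.sh (a hypothetical path), a crontab entry such as "0 2 * * * /usr/local/bin/fsimage_backup.sh backup" would run it once a day at 02:00. The schedule is only an example, and the script has to run as a user with HDFS superuser privileges, since dfsadmin commands require them.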

New Contributor

Hi Csguna,

Thanks for your help. Please correct me if my understanding is wrong.

As I understand it, we can copy the fsimage without interrupting the service using the command below:

hdfs dfsadmin -fetchImage backup_dir

 

The backup produced by the above command should be consistent because, at startup, the NameNode process reads the fsimage file, loads it into memory, and then applies any edits in the JournalNodes that are newer than the fsimage. If the JournalNodes are not available, however, it is possible to lose the changes that occurred in the interim.

Are there any best practices for how frequently we should take metadata backups on production Hadoop clusters?

Could you also help me understand what happens in the scenario below?

When the standby NameNode is down for a long duration, what performs the checkpointing operations? Is there any risk of data loss or inconsistency if the active NameNode also crashes during this time?

New Contributor

Hi,

Can you help me write a shell script to back up the fsimage?
