04-30-2018 06:55 AM
Why do we need to stop the NameNode service to back up the NameNode metadata? Instead of stopping the NameNode, can't we put it into safe mode so that the cluster becomes read-only, and then take the backup using # tar -cvf backname metadatapath
I am using the Cloudera distribution.
At the URL below, I found instructions to stop the services and then take the backup. Why not use the safe mode option instead? Please help.
04-30-2018 11:10 AM
You can run the command below without stopping the NameNode:
hdfs dfsadmin -fetchImage
Moreover, it's good to have HA configured to avoid a single point of failure.
I will provide a good link that covers metadata backup in more detail; it's a really good blog post.
04-30-2018 12:31 PM - edited 04-30-2018 12:37 PM
The link you are referring to belongs to 5.4.x; please refer to the link below (5.14.x) for a little more detail.
There are two types of backup:
1. HDFS Metadata backup
You need to follow all the steps, including "Stop the cluster. It is particularly important that the NameNode role process is not running so that you can make a consistent backup"
2. NameNode Metadata backup
This can be done using:
$ hdfs dfsadmin -fetchImage backup_dir
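For a regular backup, the fetchImage command above could be wrapped roughly as follows. This is only a sketch: the function name fetch_fsimage, the backup root path, and the timestamped directory layout are assumptions for illustration, not something from the Cloudera docs.

```shell
#!/usr/bin/env bash
# Sketch: download the latest fsimage from the running NameNode into a
# timestamped local directory. fetch_fsimage and the directory layout are
# hypothetical; only "hdfs dfsadmin -fetchImage <local dir>" is the real command.

fetch_fsimage() {
    local backup_root="$1"   # hypothetical local root, e.g. /backup/nn
    local dest
    dest="$backup_root/$(date +%Y%m%d-%H%M%S)"

    # -fetchImage expects an existing local directory
    mkdir -p "$dest" || return 1

    # Send any command output to stderr so stdout carries only the path
    hdfs dfsadmin -fetchImage "$dest" >&2 || return 1

    # Print the directory that now holds the fsimage
    echo "$dest"
}

# Example call (assumed path):
#   fetch_fsimage /backup/nn
```

Run it as a user with HDFS superuser rights; the NameNode stays up the whole time.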
Now to answer your question,
If you look at the first link, it says "Cloudera recommends backing up HDFS metadata before a major upgrade". In a real production cluster, we perform the HDFS metadata backup and the major upgrade during downtime, so the given steps are the recommended way to get a consistent backup.
But if you just need a NameNode backup at regular intervals, then I believe you are correct: you can switch on safe mode, take the backup, and leave safe mode. Alternatively, you can try the option from the 2nd link.
Note: Please make sure to test it in lower environments before applying it in prod.
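The safe mode approach described above could look roughly like this. It is a sketch, not a Cloudera-documented procedure: the function name backup_nn_metadata and both path arguments are hypothetical, and the saveNamespace step is an addition that is not mentioned in this thread (it writes a fresh fsimage before the directory is archived, and it only works while safe mode is on).

```shell
#!/usr/bin/env bash
# Sketch: archive the NameNode metadata directory while HDFS is in safe mode.
# Assumptions: run on the NameNode host as the HDFS superuser; meta_dir is the
# directory configured as dfs.namenode.name.dir; out_tar lives outside meta_dir.

backup_nn_metadata() {
    local meta_dir="$1"   # hypothetical: NameNode metadata directory
    local out_tar="$2"    # hypothetical: destination tar file
    local rc=0

    # Make the namespace read-only
    hdfs dfsadmin -safemode enter || return 1

    # Added step (assumption): checkpoint the namespace into a new fsimage
    hdfs dfsadmin -saveNamespace || rc=1

    # Archive the metadata directory only if the checkpoint succeeded
    if [ "$rc" -eq 0 ]; then
        tar -cf "$out_tar" -C "$meta_dir" . || rc=1
    fi

    # Always leave safe mode, even if a step above failed
    hdfs dfsadmin -safemode leave || rc=1
    return "$rc"
}

# Example call (assumed paths):
#   backup_nn_metadata /dfs/nn /backup/nn-metadata.tar
```

Leaving safe mode is attempted on every path so that a failed backup does not keep the cluster read-only.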
05-01-2018 07:02 AM
Thanks for your response.
So "Stop the cluster" means it will stop all the components, right (including both NameNodes and other components like Sqoop, Hive, YARN, etc.)? Correct me if I am wrong.
05-02-2018 04:44 AM
You can stop the cluster using CM -> top-left Cluster menu -> Stop.
Yes, it will stop all the available services, e.g. Hue, Hive, Spark, Flume, YARN, HDFS, ZooKeeper, etc. It won't affect your hosts or the Cloudera Management Service.
Note: You don't need to handle daemons like the NameNode separately for this.