Support Questions

How to back up data before an OS upgrade

We need to upgrade the OS from SUSE 11 to SUSE 12 and also need to do an HDP upgrade. Please let me know how to back up the data stored in HDFS, and also the procedure for the same.


Super Collaborator

Do you have any backup capability to store your HDFS content? In that case you can use an edge node to back up the data. But maybe you should consider upgrading node by node:
1. Decommission the DataNode (all your files should still be available in HDFS)
2. Upgrade the OS of that node
3. Recommission the DataNode again
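Outside of a cluster manager, the raw HDFS side of steps 1 and 3 could be sketched roughly like this. The exclude-file path and the hostname below are placeholders, not from this thread; in a real cluster use the file configured as `dfs.hosts.exclude` in your hdfs-site.xml and your actual node name (and if Ambari manages your configs, do it through Ambari instead):

```shell
# Hedged sketch: decommission/recommission a DataNode by hand.
# EXCLUDE_FILE and NODE are placeholder values (assumptions).
EXCLUDE_FILE=./dfs.exclude
NODE=worker03.example.com

# Step 1 -- decommission: list the host in the exclude file and refresh.
echo "$NODE" >> "$EXCLUDE_FILE"
if command -v hdfs >/dev/null 2>&1; then   # guard so the sketch runs without Hadoop
  hdfs dfsadmin -refreshNodes
fi

# Step 2 -- upgrade the OS of that node, then...

# Step 3 -- recommission: drop the host from the exclude file and refresh again.
grep -v "^${NODE}\$" "$EXCLUDE_FILE" > "$EXCLUDE_FILE.tmp" || true
mv "$EXCLUDE_FILE.tmp" "$EXCLUDE_FILE"
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -refreshNodes
fi
```

`hdfs dfsadmin -refreshNodes` makes the NameNode re-read the exclude file; the NameNode then re-replicates the node's blocks elsewhere before the node is fully decommissioned.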

If you start with just one node, you can make sure every config change required on the OS is OK before doing the same on the other nodes. It should be possible to upgrade the cluster without downtime.

I would also separate the OS upgrade from the HDP upgrade.
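If you do want an off-cluster copy before touching anything, the edge-node backup mentioned above could look roughly like this. All paths and NameNode hostnames here are placeholders (assumptions), and the guard simply lets the sketch run on a machine without Hadoop:

```shell
# Hedged sketch: pull HDFS data onto the edge node's local storage, or
# copy it to a second cluster with DistCp. Paths/hosts are assumptions.
BACKUP_DIR="./hdfs-backup-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
if command -v hdfs >/dev/null 2>&1; then   # guard so the sketch runs anywhere
  # Copy a directory tree out of HDFS to local disk:
  hdfs dfs -copyToLocal /data/important "$BACKUP_DIR/"
  # Or, if a second cluster is available, copy HDFS-to-HDFS:
  hadoop distcp hdfs://prod-nn:8020/data hdfs://backup-nn:8020/data-backup
fi
echo "$BACKUP_DIR"
```

Note the local-disk route only works if the edge node has enough storage; for 38 nodes' worth of data, DistCp to a second cluster (or cloud storage) is usually the more realistic option.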

Thanks Harald! We have a cluster of 38 nodes, so we need to decommission the nodes one by one? Can you please share a document or link with the step-by-step process?


Super Collaborator

This depends on how you are managing your cluster. E.g. if you are using Ambari, you can do everything from the GUI:
Log in to Ambari, then click 'Hosts' in the main menu (normally on the upper right side of the page). You get a list of the nodes in your cluster, with the HDP version and the number of components installed. Select the node you want to upgrade, click on it, and you can decommission each component.
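The same information the 'Hosts' page shows is available through Ambari's REST API, in case you want to script this across all 38 nodes. A minimal sketch, assuming a default port of 8080 and placeholder host, cluster, and node names (all assumptions, not from this thread):

```shell
# Hedged sketch: the Ambari REST endpoints behind the 'Hosts' page.
# AMBARI_HOST, CLUSTER, NODE, and the admin credentials are placeholders.
AMBARI_HOST=ambari.example.com
CLUSTER=mycluster
NODE=worker03.example.com

# The list the GUI shows under 'Hosts':
HOSTS_URL="http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts"
# The components installed on one particular host:
COMPONENTS_URL="http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/hosts/$NODE/host_components"

# Uncomment to query a real Ambari server:
# curl -s -u admin:admin "$HOSTS_URL"
# curl -s -u admin:admin "$COMPONENTS_URL"
echo "$COMPONENTS_URL"
```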

Here is some more information (with screenshots, but it is only about upgrading the physical disks, so they just decommission HDFS, remount the disks, and recommission HDFS again):

For upgrading the OS you will have to decommission all services on the node. Be careful when there is something like an HDFS NameNode, ZooKeeper Server, or HBase Master on it: make sure you still have enough nodes of those types available (and not just one, or only the one you are about to decommission).

I would consider the HDFS DataNode, YARN NodeManager, and HBase RegionServer OK to decommission, as well as the Ambari Metrics Collector or Log Feeder. You'll have to wait for the decommissioning to succeed; by then all active jobs should be finished, and new jobs don't get started on the decommissioned node. When this is done, you can upgrade the OS.
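For the "wait for decommissioning to succeed" step on the HDFS side, one way to watch progress is to poll `hdfs dfsadmin -report`, whose per-node output includes a "Decommission Status" line. A rough sketch, with a placeholder hostname and a guard so it runs on machines without Hadoop:

```shell
# Hedged sketch: poll until the DataNode reports "Decommissioned".
# NODE is a placeholder hostname (an assumption, not from this thread).
NODE=worker03.example.com
if command -v hdfs >/dev/null 2>&1; then
  # "Decommission Status" appears a few lines after the node's name
  # in the per-node section of the report.
  until hdfs dfsadmin -report | grep -A 5 "$NODE" | grep -q "Decommissioned"; do
    echo "still draining $NODE ..."
    sleep 30
  done
  echo "$NODE is decommissioned"
fi
```

In Ambari the same state is visible in the GUI, so this is only needed if you are scripting the rollout.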

If you are not using Ambari to manage/provision your cluster, let me know what you are actually using.