Best practices to join nodes back into the cluster after they went down due to an OS issue

Expert Contributor

Hi,

 

What are the best practices for the following scenario?

 

1. If a data node goes down, what is the best procedure to add it back once we have identified and fixed the issue?

 

Is the right procedure to decommission the node and then recommission it?

Is there a need to format all data on the local file system on that node before adding it back?


1 ACCEPTED SOLUTION

Master Collaborator

If you decommissioned the node earlier, it is better to recommission it.

There is no need to format any disk on the node if you had a hardware issue. The one exception is a replaced disk: if you swapped a disk on the server, format the new disk.

If you did not remove or decommission the node, there is nothing you need to do; the node will rejoin the cluster.
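
The decision above can be sketched as a small Python helper. This is only an illustration: `NodeState` and `rejoin_steps` are hypothetical names, the "start the DataNode service" step is my assumption, and the exclude-file step assumes a manually administered cluster that uses `dfs.hosts.exclude` together with `hdfs dfsadmin -refreshNodes`. On a cluster run through a management console, the console's recommission action would take the place of those two steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    """Hypothetical description of a repaired DataNode host."""
    was_decommissioned: bool  # host was decommissioned before the repair
    disk_replaced: bool       # a physical disk was swapped during the repair


def rejoin_steps(state: NodeState) -> List[str]:
    """Return the ordered actions suggested by the answer above."""
    steps: List[str] = []
    if state.disk_replaced:
        # Only the new disk is formatted; surviving data disks are untouched.
        steps.append("format the replacement disk")
    if state.was_decommissioned:
        # Assumption: manual recommission through the exclude file.
        steps.append("remove the host from the exclude file")
        steps.append("run: hdfs dfsadmin -refreshNodes")
    # Assumption: the DataNode process has to be started after the repair.
    steps.append("start the DataNode service")
    return steps


if __name__ == "__main__":
    for step in rejoin_steps(NodeState(was_decommissioned=True, disk_replaced=False)):
        print(step)
```

For a node that was never decommissioned and kept all its disks, the helper returns only the final step, which matches the "nothing you need to do" case.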


2 REPLIES

Contributor
After recommissioning, you can just add the DataNode back, and the NameNode will identify all the blocks that were previously present on this DataNode. Once the NameNode has this information, it will delete the extra replica that it created during the DataNode decommission, because those blocks are now over-replicated.
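
To make the over-replication cleanup concrete, here is a toy Python model of what happens when the returning node reports its blocks. It is not NameNode code: `apply_block_report` is a hypothetical function, and the choice of which excess copy to delete (sorted by node name, never the returning node) is a simplification. A real NameNode picks the copy to drop according to its block placement policy.

```python
from typing import Dict, List, Set


def apply_block_report(
    locations: Dict[str, Set[str]],
    node: str,
    reported_blocks: List[str],
    replication: int = 3,
) -> Dict[str, List[str]]:
    """Record a returning node's blocks and list the replicas to delete.

    `locations` maps block id -> set of nodes holding a copy and is
    updated in place. The return value maps block id -> nodes whose
    copy is scheduled for deletion.
    """
    to_delete: Dict[str, List[str]] = {}
    for block in reported_blocks:
        holders = locations.setdefault(block, set())
        holders.add(node)
        excess = len(holders) - replication
        if excess > 0:
            # Toy rule: drop copies from other nodes, in name order.
            victims = sorted(holders - {node})[:excess]
            holders.difference_update(victims)
            to_delete[block] = victims
    return to_delete


if __name__ == "__main__":
    # blk_1 was re-replicated to dn4 while dn3 was away.
    locations = {"blk_1": {"dn1", "dn2", "dn4"}}
    print(apply_block_report(locations, "dn3", ["blk_1"]))
    print(locations)
```

The point of the model is only that each block ends up with the configured number of replicas again, without any manual cleanup on the returning node.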

You may have to run the HDFS balancer if you format the disks and then recommission the node to the cluster, which is not a best practice.
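
As a rough way to see why a freshly formatted node calls for the balancer, the sketch below flags nodes whose utilization is further from the cluster-wide utilization than a threshold, which is the same idea behind `hdfs balancer -threshold 10` (10 percent is the default threshold). The function name `unbalanced_nodes` and the input numbers are made up for illustration; in practice the figures would come from something like `hdfs dfsadmin -report`.

```python
from typing import Dict, List, Tuple


def unbalanced_nodes(
    usage: Dict[str, Tuple[float, float]],  # node -> (used, capacity), same unit
    threshold_pct: float = 10.0,
) -> List[str]:
    """Return nodes whose utilization deviates from the cluster's by more
    than threshold_pct percentage points."""
    total_used = sum(used for used, _ in usage.values())
    total_capacity = sum(capacity for _, capacity in usage.values())
    cluster_pct = 100.0 * total_used / total_capacity
    return sorted(
        node
        for node, (used, capacity) in usage.items()
        if abs(100.0 * used / capacity - cluster_pct) > threshold_pct
    )


if __name__ == "__main__":
    # Nine nodes at 50% used plus one node that came back with empty disks.
    usage = {f"dn{i}": (50.0, 100.0) for i in range(1, 10)}
    usage["dn_new"] = (0.0, 100.0)
    print(unbalanced_nodes(usage))
```

A node that rejoins with its data intact stays close to the cluster utilization, so no rebalancing pass is needed in that case.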