How do I hot-swap a disk on a node without restarting the HDP 2.4 cluster?

Expert Contributor
 
1 ACCEPTED SOLUTION

Guru

You don't need to restart the HDP 2.4 cluster. However, it is recommended to decommission the node with the dead disk, replace the disk, and add the node back to the cluster. This ensures that data is evenly distributed across all the data disks on this node.

1. To decommission, go to Ambari -> Host -> DataNode. It has an option to decommission.

2. Go to NodeManager on the same host and decommission it as well.

3. Once both reach the decommissioned state, stop the DataNode and NodeManager on that host and replace the disk.

4. Start the DataNode and NodeManager again.

5. You will see an option to recommission in the same place. Click it to take the components out of the decommissioned state.
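
If you would rather script steps 1 and 2 than click through the UI, here is a rough sketch of the request body that, as far as I know, Ambari's REST API expects for a decommission (POSTed to `/api/v1/clusters/<cluster>/requests` with an `X-Requested-By` header). The cluster name and host name are made-up placeholders, and the field layout is my assumption from the Ambari API docs, so check it against your Ambari version before using it. The code only builds the JSON; it does not call Ambari.

```python
import json


def decommission_request(cluster, host, slave_type):
    """Build the JSON body for an Ambari DECOMMISSION request (payload only, no HTTP)."""
    # Assumption: the decommission command is addressed to the master component
    # that manages the exclude list for the given slave type.
    masters = {
        "DATANODE": ("HDFS", "NAMENODE"),
        "NODEMANAGER": ("YARN", "RESOURCEMANAGER"),
    }
    service, master = masters[slave_type]
    return {
        "RequestInfo": {
            "context": f"Decommission {slave_type} on {host}",
            "command": "DECOMMISSION",
            "parameters": {"slave_type": slave_type, "excluded_hosts": host},
            "operation_level": {"level": "HOST_COMPONENT", "cluster_name": cluster},
        },
        "Requests/resource_filters": [
            {"service_name": service, "component_name": master}
        ],
    }


if __name__ == "__main__":
    # "mycluster" and "worker05.example.com" are hypothetical names.
    for slave in ("DATANODE", "NODEMANAGER"):
        body = decommission_request("mycluster", "worker05.example.com", slave)
        print(json.dumps(body, indent=2))
```

Building the payload separately from sending it makes it easy to review what would be submitted before anything touches the cluster.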

No other services across the cluster need to be stopped. If you have more than 3 DataNodes and your default replication factor is 3, all services will continue to run.
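
To make the "more than 3 DataNodes" condition concrete, here is a small sketch of the arithmetic: while a node is decommissioned, the remaining DataNodes must still be able to hold a full set of replicas. The function name and parameters are mine, purely for illustration.

```python
def can_decommission(live_datanodes, replication_factor, nodes_to_remove=1):
    """Return True if the remaining DataNodes can still hold every replica."""
    remaining = live_datanodes - nodes_to_remove
    # HDFS places each replica of a block on a different DataNode, so at least
    # `replication_factor` nodes must remain for blocks to reach full replication.
    return remaining >= replication_factor


if __name__ == "__main__":
    print(can_decommission(4, 3))  # 3 nodes remain for 3 replicas
    print(can_decommission(3, 3))  # only 2 nodes remain for 3 replicas
```

With 4 DataNodes and a replication factor of 3, taking one node out still leaves 3 nodes for the 3 replicas. With only 3 DataNodes, the decommission cannot bring every block back to full replication.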

2 REPLIES 2

Master Guru

Or, if you want to do it the quick and dirty way (Ravi's way is obviously cleaner): replace the drive, make sure it works, and restart the DataNode. The effect is the same, but the DataNode will be out of the cluster for a shorter period of time.
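
For the "make sure it works" part, a minimal sketch of a sanity check you could run on the host before restarting the DataNode: it confirms that the new drive is mounted and that a file can be written and synced to it. The path `/grid/0` is a hypothetical mount point; use whatever your `dfs.datanode.data.dir` actually sits on. Passing this check does not guarantee the DataNode will accept the volume (directory ownership and permissions still matter).

```python
import os
import tempfile


def disk_is_usable(mount_point, require_mount=True):
    """Return True if mount_point exists, is mounted, and accepts a synced write."""
    if not os.path.isdir(mount_point):
        return False
    # A replaced drive that was never mounted leaves an ordinary empty directory
    # on the root filesystem, which is easy to miss.
    if require_mount and not os.path.ismount(mount_point):
        return False
    try:
        # The probe file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=mount_point) as probe:
            probe.write(b"probe")
            probe.flush()
            os.fsync(probe.fileno())
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # "/grid/0" is a hypothetical mount point for the replaced drive.
    print(disk_is_usable("/grid/0"))
```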