Created 04-19-2021 01:12 AM
Hi,
I want to remove some old disks and have added new disks with new mount points. The new disks are now part of CDH. I want to move the blocks from the old disk /data01 to the new disk /data05, and then remove /data01 once all blocks have been moved.
Is there a way to do this?
Can I take the DataNode service down and then copy the blocks manually? Will Cloudera be able to find the blocks on /data05 (the new disk)?
Created 04-19-2021 10:17 AM
Hi @Chetankumar
You can perform a disk hot swap on the DataNode (DN).
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_dn_swap.html
If the replication factor is set to 3 for all files, taking down one disk shouldn't be a problem, because the NameNode (NN) will automatically re-replicate the under-replicated blocks. As a small test, first stop the DataNode and wait for some time while the NN copies the blocks to other available DataNodes.
Run fsck to confirm that the HDFS file system is healthy. Once it is healthy, you can safely work on the stopped DataNode. The idea is to keep the replication factor at 3 so that you don't incur any data loss.
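To make the fsck check above concrete, here is a minimal sketch in Python that reads the text output of `hdfs fsck /` and reports whether it looks safe to continue. The function names (parse_fsck_summary, safe_to_proceed, run_fsck) are made up for illustration, and the parser assumes the usual summary lines ("Under-replicated blocks:", "Missing replicas:", "Corrupt blocks:", and the final "The filesystem under path ... is HEALTHY" line). The exact layout can differ between Hadoop versions, so treat this as a starting point rather than a definitive check.

```python
import re
import subprocess

# Counters we look for in the fsck summary (assumed line labels).
_COUNTER = re.compile(
    r"^(Under-replicated blocks|Missing replicas|Corrupt blocks):\s+(\d+)"
)


def parse_fsck_summary(report: str) -> dict:
    """Pull the health status and a few block counters out of fsck text."""
    summary = {"healthy": False}
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("The filesystem under path"):
            summary["healthy"] = line.endswith("is HEALTHY")
            continue
        match = _COUNTER.match(line)
        # Keep only the first occurrence of each counter.
        if match and match.group(1) not in summary:
            summary[match.group(1)] = int(match.group(2))
    return summary


def safe_to_proceed(summary: dict) -> bool:
    """True only if fsck reports HEALTHY and no under-replicated blocks."""
    return summary["healthy"] and summary.get("Under-replicated blocks") == 0


def run_fsck(path: str = "/") -> str:
    """Run `hdfs fsck <path>` and return its output (needs an HDFS client)."""
    result = subprocess.run(
        ["hdfs", "fsck", path], capture_output=True, text=True, check=False
    )
    return result.stdout
```

On a cluster node you could call safe_to_proceed(parse_fsck_summary(run_fsck())) after stopping the DataNode, and repeat until it returns True before touching the disk.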
If the replication factor is set to 1 for some files and those blocks are hosted on the /data01 disk, you could lose data. As long as you have RF=3 you should be fine.
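One way to check for such files before pulling the disk is to scan a recursive listing for low replication factors. The sketch below is only an illustration: the function name find_low_replication is hypothetical, and it assumes the standard `hdfs dfs -ls -R` line layout (permissions, replication, owner, group, size, date, time, path), where directories show "-" in the replication column.

```python
def find_low_replication(ls_output: str, min_rf: int = 3) -> list:
    """Return (path, replication) pairs for files whose RF is below min_rf.

    Expects the text produced by `hdfs dfs -ls -R <path>`.
    """
    low = []
    for line in ls_output.splitlines():
        # maxsplit=7 keeps paths that contain spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # header such as "Found N items", or a blank line
        perms, replication, path = fields[0], fields[1], fields[7]
        if perms.startswith("d") or not replication.isdigit():
            continue  # directories have no replication factor
        if int(replication) < min_rf:
            low.append((path, int(replication)))
    return low
```

Any paths it returns could have their replication raised with `hdfs dfs -setrep` before the disk is removed.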
Does that answer your questions? Let us know.
Regards,
Created 04-20-2021 06:26 AM
Hi @kingpin
Thanks for your reply. Yes, this solution worked: we were able to remove the old disk, and after a rebalance the new disks are in use. We will continue removing the disk from each node, waiting for the rebalance to complete after each one.