I have 4 datanodes in my cluster, each with 6 × 1 TB disks. I wanted to downsize each datanode to 3 × 1 TB, i.e. remove 3 × 1 TB disks per datanode. This is the process I followed. Please tell me whether it is correct or not.
1. On the running cluster, go to DN1.
2. Edit /etc/fstab, remove the disk6 mount point, and save.
3. Reboot DN1.
4. Log back in to DN1 and run "hdfs fsck /".
5. Make sure the report shows:
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Corrupt blocks: 0
6. Repeat steps 2–5 on DN2.
7. Then remove disk5 from DN1, DN2, DN3, and DN4, following steps 2–5 for each datanode.
8. Then remove disk4 from DN1, DN2, DN3, and DN4, following steps 2–5 for each datanode.
9. Everything looked good up to this point.
10. Go to Cloudera Manager and change
dfs.datanode.failed.volumes.tolerated = 1 (from 3)
11. Modify dfs.data.dir and dfs.datanode.data.dir
(remove the three disks you removed)
12. Restart the Hadoop cluster.
13. After this restart, "hdfs fsck /" reported 24 corrupt or missing blocks. Why is this happening?
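For reference, the fsck check I was doing by eye at each step can be scripted. This is a minimal sketch, assuming the summary field names match what "hdfs fsck /" prints on my version; the sample report fed in below is made up for illustration, not real cluster output:

```shell
# check_fsck: read an "hdfs fsck /" summary report on stdin; print CLEAN
# and return 0 only if every problem-block counter is zero.
# Real use would be:  hdfs fsck / | check_fsck
check_fsck() {
  awk '
    /Over-replicated blocks:/  { bad += $3 }   # count after the label
    /Under-replicated blocks:/ { bad += $3 }
    /Mis-replicated blocks:/   { bad += $3 }
    /Corrupt blocks:/          { bad += $3 }
    END {
      if (bad > 0) { printf "NOT CLEAN: %d problem blocks\n", bad; exit 1 }
      print "CLEAN"
    }
  '
}

# Demo against a made-up clean report:
check_fsck <<'EOF'
Over-replicated blocks:  0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks:   0 (0.0 %)
Corrupt blocks:          0
EOF
```

Note that a clean fsck only means the namespace is healthy at that moment; it does not wait for any re-replication that is still in flight.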
Please advise a better process that results in 0 corrupt/missing blocks.
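For completeness, this is the shape of the hdfs-site.xml change behind the Cloudera Manager edits described above. This is a sketch only; /data/1 through /data/6 are placeholder mount points, not my real paths:

```xml
<!-- Sketch only: /data/N mount points are placeholders for illustration. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Previously listed six directories, one per disk; the three
       directories on the removed disks are dropped from the list. -->
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>
```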