
Downsizing the number of mounted disks on each datanode in a Hadoop cluster

Hi guys

 

I have 4 datanodes in my cluster, each with 6 X 1TB disks. I want to downsize each datanode to 3 X 1TB, i.e. remove 3 X 1TB disks per datanode. This is the process I followed. Please tell me if it's correct or not.
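
(Sizing context: 4 datanodes x 6 x 1TB = 24TB raw today; after the change it would be 4 x 3 x 1TB = 12TB raw, i.e. roughly 4TB usable assuming the default replication factor of 3.)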

 

1. On a running cluster, go to DN1

2. Edit /etc/fstab: remove the disk6 mount point entry and save.
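
    For illustration, the kind of fstab entry that gets removed looks like this (device name and mount point are placeholders; the real values differ per node):

    /dev/sdf1   /data/disk6   ext4   defaults   0   0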

3. Reboot DN1

4. Log back in to DN1 and run "hdfs fsck /"
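
    A quick sanity check that the mount is really gone after the reboot (the path here is just an example):

    df -h | grep /data/disk6    # should print nothing once disk6 is unmounted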

5. Make sure the fsck summary shows the following:

    Over-replicated blocks: 0 (0.0 %)

    Under-replicated blocks: 0 (0.0 %)

    Mis-replicated blocks: 0 (0.0 %)  

    Corrupt blocks: 0
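
    A convenient way to pull just those summary lines out of the fsck output:

    hdfs fsck / | grep -E 'Over-replicated|Under-replicated|Mis-replicated|Corrupt blocks'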

6. Repeat steps 1-5 on DN2, DN3, and DN4 (so disk6 is removed from every datanode)

7. Then remove disk5 from DN1, DN2, DN3, and DN4 by following steps 1-5 for each datanode

8. Then remove disk4 from DN1, DN2, DN3, and DN4 by following steps 1-5 for each datanode

9. All good till now.

10. Go to Cloudera Manager and change

    dfs.datanode.failed.volumes.tolerated = 1 (from 3)

11. Modify dfs.data.dir, dfs.datanode.data.dir

      (remove the entries for the three disks you removed)
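
      For illustration, the dfs.datanode.data.dir change looks roughly like this (directory names are placeholders; the real paths come from the existing config):

      before: /data/disk1/dfs/dn,/data/disk2/dfs/dn,/data/disk3/dfs/dn,/data/disk4/dfs/dn,/data/disk5/dfs/dn,/data/disk6/dfs/dn
      after:  /data/disk1/dfs/dn,/data/disk2/dfs/dn,/data/disk3/dfs/dn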

12. Restart the Hadoop cluster.

13. This is where I observed 24 blocks corrupt or missing. Why is this happening?
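
    For reference, the affected files can be listed with:

    hdfs fsck / -list-corruptfileblocks

    hdfs dfsadmin -report | grep -i missing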

 

Please advise a better process that will result in 0 corrupt/missing blocks.

 

warmly

 

sanjay

 
