I have a Hadoop cluster with 6 nodes for HDFS.
Each node has 4 disks (of different capacities), each with 1 partition for HDFS:
<code>$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 219G 40G 168G 20% /
/dev/sdb1 1.8T 1.2T 665G 64% /data3
/dev/sdd1 3.6T 1.2T 2.5T 32% /data4
/dev/sdc1 1.8T 1.2T 671G 64% /data2
/dev/sde1 5.5T 1.2T 4.1T 22% /data1
</code>
Space on the partitions is being used evenly, but because the disks have different sizes, /dev/sdb1 and /dev/sdc1 will run out of space much sooner than the other two partitions.
The question is: what happens if we run out of space on those smaller partitions?
Thanks a lot for any reply! Robert
You are right that you cannot really predict which partition will fill up first, but that is fine. The short answer is that HDFS will simply keep using the partitions that still have free space.
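By default the DataNode picks volumes for new block replicas in a round-robin fashion, which is why your smaller disks fill at the same absolute rate as the larger ones; once a volume cannot hold another block, writes simply go to the remaining volumes. If you would rather bias new blocks toward the disks with more free space, you can switch the volume-choosing policy in hdfs-site.xml (manageable through Ambari). A minimal sketch; the threshold and preference values below are example tunings, not recommendations for your cluster:
<code>
<!-- hdfs-site.xml (DataNode): prefer volumes with more available space -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<!-- Volumes whose free space differs by less than this many bytes are treated as balanced -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value> <!-- 10 GB, the default -->
</property>
<!-- Fraction of new blocks directed to the volumes with more free space -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value> <!-- the default -->
</property>
</code>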
Also, a disk that fills up is not considered failed, so dfs.datanode.failed.volumes.tolerated will not kick in.
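Two related hdfs-site.xml settings are worth checking, sketched below with example values: dfs.datanode.du.reserved keeps some space on each volume free for non-HDFS use (OS, logs), and dfs.datanode.failed.volumes.tolerated only counts volumes that actually fail with I/O errors, not full ones.
<code>
<!-- hdfs-site.xml (DataNode): example values, adjust to your environment -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value> <!-- reserve ~10 GB per volume for non-HDFS use -->
</property>
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>0</value> <!-- default: any truly failed volume stops the DataNode -->
</property>
</code>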
Just make sure the volumes where you write logs have enough space and that the log rotation configs (available via Ambari) are in place.