We found that the DataNode volume choosing policy parameter does not work as we expected. Our DataNodes have varying storage sizes, and due to data growth, the DataNode with less storage than the others now always reports bad health status. It seems HDFS is still using the round-robin behavior, so files written to HDFS are spread evenly across all DataNodes. Below is our configuration related to the DataNodes:
dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 10GB
Is there any other configuration related to HDFS block placement that I missed, which could explain why the DataNode volume choosing policy is not working properly?
It was just a typo: dfs.datanode.fsdataset.volume.choosing.policy was still set to the round-robin policy. The correct value is the available-space policy. :D
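For reference, a minimal hdfs-site.xml sketch of the available-space policy settings. The class name is the stock Hadoop implementation; the threshold and fraction values below are illustrative assumptions, not recommendations:

```xml
<!-- Sketch: enable the available-space volume choosing policy on the DataNode. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <!-- Volumes whose free space differs by less than this many bytes are
       treated as balanced; 10737418240 bytes = 10 GB (example value). -->
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>
<property>
  <!-- Fraction of new block allocations directed to the volumes with more
       free space, between 0.0 and 1.0 (example value). -->
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

Note that the threshold property expects a value in bytes, so a literal string like "10GB" may not be parsed as intended on all Hadoop versions.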
Same situation here: what parameters should be set when DataNodes have different disk sizes?
One DataNode has 7 TB of disk space.
The other DataNode has 15 TB of disk space.
Any help is highly appreciated.
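Not an authoritative answer, but worth noting: the volume choosing policy only balances disks *within* a single DataNode. Spreading blocks across DataNodes of different total sizes is decided by the NameNode's block placement, and uneven utilization across nodes is usually corrected with the HDFS Balancer. A hedged hdfs-site.xml sketch for the DataNode side (values are illustrative examples):

```xml
<!-- Sketch: per-volume balancing plus reserved headroom on each DataNode. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <!-- Reserve some space per volume for non-HDFS use so the smaller node
       does not fill to capacity; 53687091200 bytes = 50 GB (arbitrary example). -->
  <name>dfs.datanode.du.reserved</name>
  <value>53687091200</value>
</property>
```

To even out utilization across the 7 TB and 15 TB nodes themselves, you can also run the balancer periodically, e.g. `hdfs balancer -threshold 10`, which moves blocks between DataNodes until their utilization is within the given percentage of the cluster average.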