HDFS cluster rebalancing algorithm

Hi all, my team is currently testing HDFS rebalancing in an HDFS cluster with two DataNodes. I have read the documentation: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/scaling-namespaces/topics/hdfs-step-1-storage...

We have a few questions about how the amount of data to move is calculated.

1. How is the cluster average utilization calculated? Does the calculation use Configured Capacity or Present Capacity?

[Image: BrianChan_0-1693817076802.png]

2. How is the utilization of an individual DataNode calculated? Does the calculation use Configured Capacity or the sum of DFS Used and DFS Remaining?

[Image: BrianChan_1-1693817221547.png]

3. The balancer log said it needed to move 4.98 GB to make the cluster balanced. However, the log later showed that the balancer decided to move 18.98 GB from one host to another. Why is there a difference?

[Image: Balancer log]

4. We set the DataNode balancing bandwidth to 200MiB, but we have observed bandwidth consumption exceeding 200MiB. Why did this happen?

Please help me with my questions. Thank you in advance.

1 REPLY

@BrianChan 

Cluster Average Utilization Calculation: During HDFS rebalancing, the cluster average utilization is typically calculated against the configured capacity of the cluster, that is, the total storage capacity allocated to HDFS as defined in the cluster's configuration settings.
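To make the Configured Capacity versus Present Capacity distinction concrete, here is a minimal Python sketch with made-up figures. It assumes Present Capacity means DFS Used plus DFS Remaining, as reported by `hdfs dfsadmin -report`. The `DataNodeReport` type and function name are hypothetical and are not part of Hadoop.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class DataNodeReport:
    """Hypothetical per-DataNode figures, all in bytes."""
    name: str
    configured_capacity: int
    dfs_used: int
    dfs_remaining: int

    @property
    def present_capacity(self) -> int:
        # Assumption: Present Capacity = DFS Used + DFS Remaining.
        return self.dfs_used + self.dfs_remaining


def cluster_average_utilization(nodes, use_present_capacity=False):
    """Return cluster-wide DFS Used as a percentage of the chosen capacity."""
    total_used = sum(n.dfs_used for n in nodes)
    if use_present_capacity:
        total_capacity = sum(n.present_capacity for n in nodes)
    else:
        total_capacity = sum(n.configured_capacity for n in nodes)
    return 100.0 * total_used / total_capacity


# Made-up two-DataNode cluster; non-DFS usage makes the two capacities differ.
nodes = [
    DataNodeReport("dn1", 100 * GIB, 60 * GIB, 30 * GIB),
    DataNodeReport("dn2", 100 * GIB, 20 * GIB, 60 * GIB),
]
avg_configured = cluster_average_utilization(nodes)
avg_present = cluster_average_utilization(nodes, use_present_capacity=True)
print(f"vs configured capacity: {avg_configured:.2f}%")
print(f"vs present capacity:    {avg_present:.2f}%")
```

The two denominators give noticeably different averages as soon as non-DFS data occupies the disks, which is why it matters which one the balancer uses.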

Individual Utilization Calculation: The utilization of each DataNode is usually calculated against the sum of its DFS Used and DFS Remaining space. This reflects how much storage the DataNode is currently using for HDFS data and how much is still available for additional data.
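A small sketch of the per-DataNode calculation, showing both candidate denominators from the question side by side. The function name and figures are hypothetical. Which denominator a given Hadoop release actually uses is best confirmed in the balancer source (the `BalancingPolicy` class in `org.apache.hadoop.hdfs.server.balancer`), so treat this as an illustration rather than the definitive formula.

```python
GIB = 1024 ** 3


def node_utilization(dfs_used, dfs_remaining, configured_capacity,
                     use_used_plus_remaining=True):
    """Return a DataNode's utilization in percent, or None if capacity is 0.

    use_used_plus_remaining=True  -> DFS Used / (DFS Used + DFS Remaining)
    use_used_plus_remaining=False -> DFS Used / Configured Capacity
    """
    if use_used_plus_remaining:
        denominator = dfs_used + dfs_remaining
    else:
        denominator = configured_capacity
    if denominator == 0:
        return None
    return 100.0 * dfs_used / denominator


# Made-up DataNode: 100 GiB configured, 60 GiB DFS used, 30 GiB remaining,
# so 10 GiB is taken by non-DFS data.
by_used_plus_remaining = node_utilization(60 * GIB, 30 * GIB, 100 * GIB)
by_configured = node_utilization(60 * GIB, 30 * GIB, 100 * GIB,
                                 use_used_plus_remaining=False)
print(f"{by_used_plus_remaining:.2f}% vs {by_configured:.2f}%")
```

Comparing each result with the utilization the balancer prints in its log for the same DataNode is a quick way to tell which definition is in effect on your cluster.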

Difference in File Moving Size: The amount the balancer initially reports as needing to move and the amount it later decides to move can differ for several reasons. These may include changes in data distribution across DataNodes during the rebalancing process, the way the balancer algorithm sizes each move, or adjustments based on real-time cluster conditions and performance considerations.
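One plausible explanation, based on my reading of the Apache Hadoop balancer and worth checking against your version: the "need to move" figure counts only the bytes that lie outside the threshold band around the cluster average, while the per-pair "decided to move" figure is sized from a node's full distance to the average, capped by a per-iteration maximum. The Python sketch below illustrates that idea with made-up numbers; the function names are hypothetical and it does not reproduce the figures from the log.

```python
GIB = 1024 ** 3


def percentage_to_bytes(percentage, capacity):
    """Convert a percentage of a node's capacity into bytes."""
    return int(percentage * capacity / 100)


def bytes_outside_threshold(utilization, average, threshold, capacity):
    """Bytes by which a node exceeds the allowed band (average +/- threshold)."""
    threshold_diff = abs(utilization - average) - threshold
    if threshold_diff <= 0:
        return 0  # node is already considered balanced
    return percentage_to_bytes(threshold_diff, capacity)


def max_bytes_to_move(utilization, average, capacity, remaining, cap):
    """Bytes a node may send or receive in one iteration (assumed sizing)."""
    diff = utilization - average
    size = percentage_to_bytes(abs(diff), capacity)
    if diff < 0:
        # An under-utilized target cannot receive more than its free space.
        size = min(size, remaining)
    return min(size, cap)  # cap stands in for a per-iteration maximum


# Made-up over-utilized node: 60% used, cluster average 40%, threshold 10%.
needed = bytes_outside_threshold(60.0, 40.0, 10.0, 100 * GIB)
decided = max_bytes_to_move(60.0, 40.0, 100 * GIB, 40 * GIB, 50 * GIB)
print(f"need to move:    {needed / GIB:.2f} GiB")
print(f"decided to move: {decided / GIB:.2f} GiB")
```

Under these assumptions the gap between the two figures equals the threshold percentage times the node's capacity, so comparing that product with the 14 GB difference in the log would be a reasonable sanity check.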

Exceeding DataNode Balancing Bandwidth: The DataNode balancing bandwidth setting limits the amount of data transferred between DataNodes per second during HDFS rebalancing, but actual bandwidth consumption can still exceed this limit under certain circumstances. Contributing factors may include network congestion, variations in data transfer rates, and optimizations performed by the balancer algorithm.
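Before digging deeper, it may be worth ruling out a unit mismatch. The limit (`dfs.datanode.balance.bandwidthPerSec`, or `hdfs dfsadmin -setBalancerBandwidth`) is expressed in bytes per second, while many monitoring tools graph megabits per second and include non-balancer traffic. The helper names below are hypothetical; the sketch only does the unit conversion for the 200 MiB value from the question.

```python
MIB = 1024 * 1024


def mib_per_sec_to_bytes_per_sec(mib_per_sec):
    """Convert MiB/s to the bytes-per-second value the setting expects."""
    return mib_per_sec * MIB


def bytes_per_sec_to_megabits_per_sec(bytes_per_sec):
    """Convert bytes/s to decimal megabits/s, as many network graphs show."""
    return bytes_per_sec * 8 / 1_000_000


limit_bytes = mib_per_sec_to_bytes_per_sec(200)
limit_mbit = bytes_per_sec_to_megabits_per_sec(limit_bytes)
print(f"200 MiB/s = {limit_bytes} bytes/s = {limit_mbit:.1f} Mbit/s")
```

If the observed figure comes from a network interface graph, compare it with the converted value and keep in mind that regular client reads, writes, and replication share the same interface.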


Regards,

Chethan YM