
HDFS rebalance from Ambari isn't working (HDP 2.6.4 + Ambari 2.6.1)

Hi all,

We have a large cluster with the following machines:

3 Kafka machines

3 JournalNode machines (master machines)

180 DataNode machines

Our problem is that on most of the DataNodes the disks do not have the same used size:


/dev/sdf                  3.6T  442G  3.2T  13% /grid/sdf
/dev/sdc                  3.6T  373G  3.3T  11% /grid/sdc
/dev/sde                  3.6T  480G  3.2T  14% /grid/sde
/dev/sdi                  3.6T   89M  3.6T   1% /grid/sdi
/dev/sdg                  3.6T   89M  3.6T   1% /grid/sdg
/dev/sdd                  3.6T  477G  3.2T  13% /grid/sdd
/dev/sdh                  3.6T   89M  3.6T   1% /grid/sdh
/dev/sdb                  3.6T  480G  3.2T  14% /grid/sdb
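
To put a number on the skew in a listing like this, here is a minimal sketch that reads df-style lines and reports the lowest and highest Use% among the data disks. The function name disk_usage_spread is made up for illustration, and the /grid/ mount prefix is assumed from the listing above.

```shell
# Summarize how unevenly the data disks of one DataNode are filled.
# Reads df-style lines on stdin:
#   filesystem  size  used  avail  use%  mountpoint
# Only mount points under /grid/ are counted (an assumption based on this cluster's layout).
disk_usage_spread() {
  awk '$6 ~ /^\/grid\// {
         pct = $5
         sub(/%/, "", pct)      # strip the trailing percent sign
         pct += 0               # force numeric comparison
         if (n == 0 || pct < min) min = pct
         if (n == 0 || pct > max) max = pct
         n++
       }
       END {
         if (n > 0)
           printf "disks=%d min=%d%% max=%d%% spread=%d\n", n, min, max, max - min
       }'
}
```

On a DataNode it could be used as `df -h | disk_usage_spread`. For the eight disks listed above it reports a minimum of 1% and a maximum of 14%, a spread of 13 percentage points inside a single node.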

So we rebalanced HDFS from the Ambari GUI as follows:

HDFS --> Service Actions --> Rebalance HDFS


But afterwards we saw that the disks on all workers still had the same used size as before,

so the rebalance apparently did nothing here.

We do not understand whether the rebalance button should work, and if it does not, what the reasons could be.

Is it a bug, or is there something in the HDFS configuration that we need to verify?


Cloudera Employee

@Michael Bronson

HDFS 2.x provides a "balancer" utility that balances blocks across the DataNodes in the cluster. From HDFS 3.x onwards there is also a disk-level balancer (Disk Balancer) that rebalances data across the disks of a single DataNode. It is useful for correcting the skewed data distribution often seen after adding or replacing disks. Disk Balancer can be enabled by setting dfs.disk.balancer.enabled to true in hdfs-site.xml, and it is invoked by running "hdfs diskbalancer".
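
As a rough sketch of the Disk Balancer workflow on a Hadoop 3.x cluster with dfs.disk.balancer.enabled set to true: the function names below are hypothetical wrappers added only for illustration, while -plan, -execute and -query are subcommands of hdfs diskbalancer. This does not apply to an HDFS 2.x cluster.

```shell
# Hypothetical wrappers around the Hadoop 3.x "hdfs diskbalancer" subcommands.

# Step 1: generate a plan for one DataNode. $1 = DataNode hostname.
plan_disk_balance() {
  hdfs diskbalancer -plan "$1"
}

# Step 2: submit the plan to the DataNode. $1 = path of the plan file created in step 1.
execute_disk_balance() {
  hdfs diskbalancer -execute "$1"
}

# Step 3: check the progress of the running plan. $1 = DataNode hostname.
query_disk_balance() {
  hdfs diskbalancer -query "$1"
}
```

The plan step should report where it wrote the plan file; that path is what gets passed to the execute step. The three steps are run per DataNode, typically as the hdfs user.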


For more detail:

Please accept this answer if you found it helpful.

We can't upgrade the cluster to 3.0 for now, so how do we rebalance HDFS on the current version, 2.6.4? (You mentioned the "balancer" utility: how do we use it, which tool is it, and where is it located?)



@Michael Bronson

The balancer is a subcommand of hdfs; see the usage below:

hdfs balancer
          [-threshold <threshold>]
          [-policy <policy>]
          [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
          [-include [-f <hosts-file> | <comma-separated list of hosts>]]
          [-idleiterations <idleiterations>]
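
A minimal sketch of a typical invocation, based on the usage above. The wrapper names run_balancer and set_balancer_bandwidth are hypothetical; -threshold comes from the usage text, and hdfs dfsadmin -setBalancerBandwidth is the dfsadmin option for the per-DataNode balancing bandwidth in bytes per second. Note that the balancer evens out utilization between DataNodes; it does not move blocks between the disks of one DataNode.

```shell
# Hypothetical wrapper: run the balancer with a utilization threshold in percent.
# $1 = threshold; falls back to 10, which is the balancer's own default.
run_balancer() {
  hdfs balancer -threshold "${1:-10}"
}

# Hypothetical wrapper: set the bandwidth each DataNode may use for balancing.
# $1 = bandwidth in bytes per second.
set_balancer_bandwidth() {
  hdfs dfsadmin -setBalancerBandwidth "$1"
}
```

For example, `run_balancer 5` asks the balancer to bring every DataNode within 5 percentage points of the cluster's average utilization. The commands are normally run as the hdfs user from a node with the HDFS client installed.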

Here is the link to the documentation

OK, give me some time to play with this.

Second, can you help me with my last thread -


New Contributor

The balancer command works between nodes, but is there a command to balance space between partitions on the same node?

We had an issue where a partition was read-only; once it was fixed, its used space started increasing, but the partitions are still not balanced.