HDFS rebalance from Ambari isn't working (HDP 2.6.4 + Ambari 2.6.1)

Hi all,

We have a large cluster with the following machines:

3 Kafka machines

3 JournalNode machines (master machines)

180 DataNode machines

Our problem is that on most of the DataNodes the disks do not have equal used space.

example:

/dev/sdf                  3.6T  442G  3.2T  13% /grid/sdf
/dev/sdc                  3.6T  373G  3.3T  11% /grid/sdc
/dev/sde                  3.6T  480G  3.2T  14% /grid/sde
/dev/sdi                  3.6T   89M  3.6T   1% /grid/sdi
/dev/sdg                  3.6T   89M  3.6T   1% /grid/sdg
/dev/sdd                  3.6T  477G  3.2T  13% /grid/sdd
/dev/sdh                  3.6T   89M  3.6T   1% /grid/sdh
/dev/sdb                  3.6T  480G  3.2T  14% /grid/sdb
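To put a number on the skew in a listing like the one above, a small helper can summarize the Use% column. This is only a sketch: the function name disk_usage_spread is made up for illustration, and it assumes the usual `df` layout with Use% in the fifth column.

```shell
# Summarize per-disk usage skew from `df` output read on stdin.
# Assumes the Use% value is in column 5, as in the listing above.
disk_usage_spread() {
  awk '
    $5 ~ /^[0-9]+%$/ {
      pct = $5 + 0                       # "13%" -> 13
      if (n == 0 || pct < min) min = pct
      if (n == 0 || pct > max) max = pct
      n++
    }
    END {
      if (n > 0)
        printf "disks=%d min=%d%% max=%d%% spread=%d\n", n, min, max, max - min
    }
  '
}
```

On a DataNode it could be fed with something like `df -h /grid/* | disk_usage_spread` (the /grid/* mount points are taken from the listing). A large spread on a single node points to an imbalance between disks inside that node, not between nodes.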

So we rebalanced HDFS from the Ambari GUI as follows:

HDFS --> Service Actions --> Rebalance HDFS

But afterwards we saw that the disks on all workers still have the same used sizes as before, so it seems the rebalance was not performed.

We don't understand whether the rebalance button should work, and if not, what the reasons could be.

Is it a bug, or is there something in the HDFS configuration that we need to verify?

Michael-Bronson
5 REPLIES

Cloudera Employee

@Michael Bronson

HDFS 2.x provides a "balancer" utility that balances blocks across the DataNodes in the cluster. From HDFS 3.x onwards there is also a disk-level balancer (Disk Balancer) that rebalances data across the disks of a single DataNode. It is useful for correcting the skewed data distribution often seen after adding or replacing disks. Disk Balancer can be enabled by setting dfs.disk.balancer.enabled to true in hdfs-site.xml, and it is invoked by running "hdfs diskbalancer".
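As a rough sketch of the Disk Balancer workflow on Hadoop 3.x (it is not available on the 2.x line discussed in this thread), the usual sequence is plan, execute, query. The helper below only prints the commands so they can be run one at a time. The function name and its two arguments are hypothetical, and the plan file path has to be copied from the output of the -plan step.

```shell
# Print the Disk Balancer commands for one DataNode, in the order they are run.
# Hadoop 3.x only; needs dfs.disk.balancer.enabled=true in hdfs-site.xml.
# $1 = DataNode hostname, $2 = plan file path reported by the -plan step.
print_diskbalancer_steps() {
  echo "hdfs diskbalancer -plan $1"      # 1. compute a plan for this node
  echo "hdfs diskbalancer -execute $2"   # 2. submit the plan to the DataNode
  echo "hdfs diskbalancer -query $1"     # 3. check progress of the running plan
}
```

The commands are printed, not executed, because the second step depends on the path that the first step reports.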

JIRA: https://issues.apache.org/jira/browse/HDFS-1312

For more detail: https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html

Please accept this answer if you found it helpful.

We can't upgrade the cluster to 3.0 for now, so for the current version (2.6.4), how do we rebalance HDFS? You mentioned a "balancer" utility: how do we use it, which tool is it, and where is it located?

Michael-Bronson

Mentor

@Michael Bronson

The balancer is a subcommand of hdfs; see the usage below:

hdfs balancer
          [-threshold <threshold>]
          [-policy <policy>]
          [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
          [-include [-f <hosts-file> | <comma-separated list of hosts>]]
          [-idleiterations <idleiterations>]
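A minimal example of calling it, wrapped in a small helper so the command can be previewed first. The function name run_balancer and the DRY_RUN variable are made up for illustration; only the -threshold option from the usage above is used. Keep in mind that the balancer evens out usage between DataNodes, not between the disks inside one DataNode.

```shell
# Run the HDFS balancer with a given threshold (allowed deviation, in percent,
# of each DataNode's utilization from the cluster average).
# With DRY_RUN=1 the command is only printed, not executed.
run_balancer() {
  threshold="${1:-10}"    # 10 is the balancer's default threshold
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hdfs balancer -threshold $threshold"
  else
    hdfs balancer -threshold "$threshold"
  fi
}
```

For example, `run_balancer 5` asks for a tighter balance than the default. The command is normally run as the HDFS superuser, which on many installations means something like `sudo -u hdfs hdfs balancer -threshold 5` (the hdfs user name is an assumption about the setup).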

Here is the link to the documentation

OK, give me some time to play with this.

Second, can you help me with my last thread? https://community.hortonworks.com/questions/231336/kafka-broker-does-not-restart.html

Michael-Bronson

New Contributor

The balancer command works between nodes, but is there any command to balance space between partitions on the same node?

We were facing an issue where a partition was read-only; once it was fixed, its used space started increasing, but the data is still not balanced.