
HDFS Balancer not working


I have three servers in my cluster.

--------------------------------------------------------------

* One server has 500GB capacity and the other two have 1.4TB each.

* host1 runs the NameNode and a DataNode [500GB]

* host2 and host3 run only a DataNode [1.4TB each]

--------------------------------------------------------------


The server with 500GB capacity became full, so I ran the hdfs balancer command, but

it always ends with [No block has been moved for 5 iterations. Exiting...]


Below are the dfsadmin report and the hdfs balancer log.

Any advice would be appreciated!


  • sudo -u hdfs hdfs dfsadmin -report

[root@host1 logs]# sudo -u hdfs hdfs dfsadmin -report

Configured Capacity: 3318966018048 (3.02 TB)

Present Capacity: 3314931861560 (3.01 TB)

DFS Remaining: 1726615451334 (1.57 TB)

DFS Used: 1588316410226 (1.44 TB)

DFS Used%: 47.91%

Replicated Blocks:

Under replicated blocks: 52200

Blocks with corrupt replicas: 0

Missing blocks: 0

Missing blocks (with replication factor 1): 0

Low redundancy blocks with highest priority to recover: 60

Pending deletion blocks: 0

Erasure Coded Block Groups:

Low redundancy block groups: 0

Block groups with corrupt internal blocks: 0

Missing block groups: 0

Low redundancy blocks with highest priority to recover: 0

Pending deletion blocks: 0

-------------------------------------------------

Live datanodes (3):

Name: <host1_ip>:50010 (host1.hadoop.com)

Hostname: host1.hadoop.com

Decommission Status : Normal

Configured Capacity: 508550971392 (473.63 GB)

DFS Used: 508551745536 (473.63 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 0 (0 B)

DFS Used%: 100.00%

DFS Remaining%: 0.00%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 2

Last contact: Tue Jun 18 16:01:18 KST 2019

Last Block Report: Tue Jun 18 11:21:06 KST 2019

Num of Blocks: 240043


Name: <host2_ip>:50010 (host2.hadoop.com)

Hostname: host2.hadoop.com

Decommission Status : Normal

Configured Capacity: 1405207523328 (1.28 TB)

DFS Used: 539944059072 (502.86 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 863245871380 (803.96 GB)

DFS Used%: 38.42%

DFS Remaining%: 61.43%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 36

Last contact: Tue Jun 18 16:01:19 KST 2019

Last Block Report: Tue Jun 18 14:01:58 KST 2019

Num of Blocks: 292247


Name: <host3_ip>:50010 (host3.hadoop.com)

Hostname: host3.hadoop.com

Decommission Status : Normal

Configured Capacity: 1405207523328 (1.28 TB)

DFS Used: 539820605618 (502.75 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 863369579954 (804.08 GB)

DFS Used%: 38.42%

DFS Remaining%: 61.44%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 36

Last contact: Tue Jun 18 16:01:20 KST 2019

Last Block Report: Tue Jun 18 14:03:14 KST 2019

Num of Blocks: 292187
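
To compare the nodes side by side, the per-DataNode figures can be pulled out of the report text. The following is only a minimal sketch: `parse_report`, `WANTED` and `SAMPLE` are hypothetical names, the sample is a shortened excerpt of the report above, and the parser assumes the "Key: value (human readable)" line layout shown there.

```python
# Fields to keep for each DataNode (names as they appear in the report).
WANTED = {"Configured Capacity", "DFS Used", "DFS Remaining"}

# Shortened excerpt of the report above, used only for illustration.
SAMPLE = """\
Configured Capacity: 3318966018048 (3.02 TB)
DFS Used: 1588316410226 (1.44 TB)
Live datanodes (3):
Name: <host1_ip>:50010 (host1.hadoop.com)
Hostname: host1.hadoop.com
Decommission Status : Normal
Configured Capacity: 508550971392 (473.63 GB)
DFS Used: 508551745536 (473.63 GB)
DFS Remaining: 0 (0 B)
DFS Used%: 100.00%
Name: <host2_ip>:50010 (host2.hadoop.com)
Hostname: host2.hadoop.com
Decommission Status : Normal
Configured Capacity: 1405207523328 (1.28 TB)
DFS Used: 539944059072 (502.86 GB)
DFS Remaining: 863245871380 (803.96 GB)
DFS Used%: 38.42%
"""


def parse_report(text):
    """Return {hostname: {field: bytes}} for every DataNode section in the text."""
    nodes = {}
    current = None  # stays None while we are still in the cluster-wide summary
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            current = nodes.setdefault(value, {})
        elif current is not None and key in WANTED:
            # The first token is the raw byte count, e.g. "508550971392 (473.63 GB)".
            current[key] = int(value.split()[0])
    return nodes


nodes = parse_report(SAMPLE)
for host, fields in nodes.items():
    used_pct = 100.0 * fields["DFS Used"] / fields["Configured Capacity"]
    print(f"{host}: {used_pct:.2f}% used, {fields['DFS Remaining']} bytes remaining")
```

On a live cluster the same function could be fed the real output, for example by piping `sudo -u hdfs hdfs dfsadmin -report` into a script that reads standard input.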



  • sudo -u hdfs hdfs balancer

[root@host1 hdfs]# sudo -u hdfs hdfs balancer

WARNING: HADOOP_BALANCER_OPTS has been replaced by HDFS_BALANCER_OPTS. Using value of HADOOP_BALANCER_OPTS.

19/06/18 14:59:22 INFO balancer.Balancer: namenodes = [hdfs://host1.hadoop.com:8020]

19/06/18 14:59:22 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]

19/06/18 14:59:22 INFO balancer.Balancer: included nodes = []

19/06/18 14:59:22 INFO balancer.Balancer: excluded nodes = []

19/06/18 14:59:22 INFO balancer.Balancer: source nodes = []

Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved

19/06/18 14:59:23 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec

19/06/18 14:59:23 INFO block.BlockTokenSecretManager: Setting block keys

19/06/18 14:59:23 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:23 INFO block.BlockTokenSecretManager: Setting block keys

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:23 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:23 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:23 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:23 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:23 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 2:59:23 PM 0 0 B 200.03 GB 10 GB

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:32 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:32 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:32 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:33 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:33 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 2:59:33 PM 1 0 B 200.03 GB 10 GB

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:42 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:42 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:42 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:42 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:42 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 2:59:42 PM 2 0 B 200.03 GB 10 GB

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:51 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:51 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:51 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:51 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:51 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 2:59:51 PM 3 0 B 200.03 GB 10 GB

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 15:00:00 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 15:00:00 INFO balancer.Balancer: 0 underutilized: []

19/06/18 15:00:00 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 15:00:00 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 15:00:00 INFO balancer.Dispatcher: Allocating 50 threads per target.

No block has been moved for 5 iterations. Exiting...

2019. 6. 18 3:00:00 PM 4 0 B 200.03 GB 10 GB

2019. 6. 18 3:00:00 PM Balancing took 37.852 seconds
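
The classification in the log (1 over-utilized, 0 underutilized, about 200 GB to move) can be roughly reproduced from the dfsadmin figures. The sketch below rests on an assumption about how the default node policy works: each DataNode's DFS Used / Configured Capacity is compared with the cluster-wide average, and a node counts as over- or under-utilized only when it differs from that average by more than the threshold. It is not the balancer's actual code, and `NODES`, `THRESHOLD` and `classify` are hypothetical names.

```python
# (configured capacity, DFS used) in bytes, copied from the dfsadmin report.
NODES = {
    "host1": (508550971392, 508551745536),
    "host2": (1405207523328, 539944059072),
    "host3": (1405207523328, 539820605618),
}
THRESHOLD = 10.0  # percent, from "threshold = 10.0" in the balancer parameters


def classify(nodes, threshold):
    """Return (average %, {host: (utilization %, category)}, bytes above the upper bound)."""
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_used = sum(used for _, used in nodes.values())
    average = 100.0 * total_used / total_capacity
    result = {}
    over_bytes = 0.0
    for host, (cap, used) in nodes.items():
        util = 100.0 * used / cap
        diff = util - average
        if diff > threshold:
            category = "over-utilized"
            # Only the part above (average + threshold) has to leave the node.
            over_bytes += (diff - threshold) / 100.0 * cap
        elif diff > 0:
            category = "above-average"
        elif diff >= -threshold:
            category = "below-average"
        else:
            category = "under-utilized"
        result[host] = (util, category)
    return average, result, over_bytes


average, result, over_bytes = classify(NODES, THRESHOLD)
print(f"cluster average: {average:.2f}%")
for host, (util, category) in result.items():
    print(f"{host}: {util:.2f}% -> {category}")
# The report prints 508550971392 bytes as 473.63 GB, so "GB" here means 2**30 bytes.
print(f"bytes above the upper bound: {over_bytes / 2**30:.2f} GB")
```

With these figures host1 is the only over-utilized node, while host2 and host3 sit below the average but within the threshold, which agrees with the "overUtilized => belowAvgUtilized" pairing in the log. The report was taken about an hour after the balancer run, so the computed amount is only close to the logged 200.03 GB. The sketch reproduces the plan only; it does not explain why 0 B were actually moved.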


