HDFS Balancer not working

I have three servers in my cluster.

--------------------------------------------------------------

* One server has 500GB of capacity; the other two have 1.4TB each.

* host1 runs the NameNode and a DataNode [500GB]

* host2 and host3 run only a DataNode [1.4TB each]

--------------------------------------------------------------


The server with 500GB of capacity became full, so I ran the hdfs balancer command, but it always ends with [No block has been moved for 5 iterations. Exiting...]


Below are the dfsadmin report and the hdfs balancer log.

Any advice would be appreciated!


  • sudo -u hdfs hdfs dfsadmin -report

[root@host1 logs]# sudo -u hdfs hdfs dfsadmin -report

Configured Capacity: 3318966018048 (3.02 TB)

Present Capacity: 3314931861560 (3.01 TB)

DFS Remaining: 1726615451334 (1.57 TB)

DFS Used: 1588316410226 (1.44 TB)

DFS Used%: 47.91%

Replicated Blocks:

Under replicated blocks: 52200

Blocks with corrupt replicas: 0

Missing blocks: 0

Missing blocks (with replication factor 1): 0

Low redundancy blocks with highest priority to recover: 60

Pending deletion blocks: 0

Erasure Coded Block Groups:

Low redundancy block groups: 0

Block groups with corrupt internal blocks: 0

Missing block groups: 0

Low redundancy blocks with highest priority to recover: 0

Pending deletion blocks: 0

-------------------------------------------------

Live datanodes (3):

Name: <host1_ip>:50010 (host1.hadoop.com)

Hostname: host1.hadoop.com

Decommission Status : Normal

Configured Capacity: 508550971392 (473.63 GB)

DFS Used: 508551745536 (473.63 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 0 (0 B)

DFS Used%: 100.00%

DFS Remaining%: 0.00%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 2

Last contact: Tue Jun 18 16:01:18 KST 2019

Last Block Report: Tue Jun 18 11:21:06 KST 2019

Num of Blocks: 240043


Name: <host2_ip>:50010 (host2.hadoop.com)

Hostname: host2.hadoop.com

Decommission Status : Normal

Configured Capacity: 1405207523328 (1.28 TB)

DFS Used: 539944059072 (502.86 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 863245871380 (803.96 GB)

DFS Used%: 38.42%

DFS Remaining%: 61.43%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 36

Last contact: Tue Jun 18 16:01:19 KST 2019

Last Block Report: Tue Jun 18 14:01:58 KST 2019

Num of Blocks: 292247


Name: <host3_ip>:50010 (host3.hadoop.com)

Hostname: host3.hadoop.com

Decommission Status : Normal

Configured Capacity: 1405207523328 (1.28 TB)

DFS Used: 539820605618 (502.75 GB)

Non DFS Used: 0 (0 B)

DFS Remaining: 863369579954 (804.08 GB)

DFS Used%: 38.42%

DFS Remaining%: 61.44%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 36

Last contact: Tue Jun 18 16:01:20 KST 2019

Last Block Report: Tue Jun 18 14:03:14 KST 2019

Num of Blocks: 292187



  • sudo -u hdfs hdfs balancer

[root@host1 hdfs]# sudo -u hdfs hdfs balancer

WARNING: HADOOP_BALANCER_OPTS has been replaced by HDFS_BALANCER_OPTS. Using value of HADOOP_BALANCER_OPTS.

19/06/18 14:59:22 INFO balancer.Balancer: namenodes = [hdfs://host1.hadoop.com:8020]

19/06/18 14:59:22 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]

19/06/18 14:59:22 INFO balancer.Balancer: included nodes = []

19/06/18 14:59:22 INFO balancer.Balancer: excluded nodes = []

19/06/18 14:59:22 INFO balancer.Balancer: source nodes = []

Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved

19/06/18 14:59:23 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec

19/06/18 14:59:23 INFO block.BlockTokenSecretManager: Setting block keys

19/06/18 14:59:23 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:23 INFO block.BlockTokenSecretManager: Setting block keys

19/06/18 14:59:23 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:23 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:23 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:23 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:23 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:23 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:23 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:23 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:23 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 PM 2:59:23 0 0 B 200.03 GB 10 GB

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:32 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:32 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:32 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:32 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:32 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:33 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:33 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:33 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 PM 2:59:33 1 0 B 200.03 GB 10 GB

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:42 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:42 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:42 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:42 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:42 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:42 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:42 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:42 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 PM 2:59:42 2 0 B 200.03 GB 10 GB

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 14:59:51 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 14:59:51 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 14:59:51 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 14:59:51 INFO balancer.Balancer: 0 underutilized: []

19/06/18 14:59:51 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 14:59:51 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 14:59:51 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 14:59:51 INFO balancer.Dispatcher: Allocating 50 threads per target.

2019. 6. 18 PM 2:59:51 3 0 B 200.03 GB 10 GB

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)

19/06/18 15:00:00 INFO balancer.Balancer: dfs.blocksize = 134217728 (default=134217728)

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host1_ip>:50010

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host2_ip>:50010

19/06/18 15:00:00 INFO net.NetworkTopology: Adding a new node: /default-rack/<host3_ip>:50010

19/06/18 15:00:00 INFO balancer.Balancer: 1 over-utilized: [<host1_ip>:50010:DISK]

19/06/18 15:00:00 INFO balancer.Balancer: 0 underutilized: []

19/06/18 15:00:00 INFO balancer.Balancer: Need to move 200.03 GB to make the cluster balanced.

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: Decided to move 10 GB bytes from <host1_ip>:50010:DISK to <host2_ip>:50010:DISK

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized

19/06/18 15:00:00 INFO balancer.Balancer: Will move 10 GB in this iteration

19/06/18 15:00:00 INFO balancer.Dispatcher: Limiting threads per target to the specified max.

19/06/18 15:00:00 INFO balancer.Dispatcher: Allocating 50 threads per target.

No block has been moved for 5 iterations. Exiting...

2019. 6. 18 PM 3:00:00 4 0 B 200.03 GB 10 GB

2019. 6. 18 PM 3:00:00 Balancing took 37.852 seconds
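
To make the "1 over-utilized" and "0 underutilized" lines in the log easier to read against the dfsadmin report, here is a minimal Python sketch that recomputes each DataNode's utilization and compares it with the cluster average plus or minus the threshold. The byte counts are copied from the report above and the threshold from the balancer parameters. The function names, the four labels, and the exact comparison rule are assumptions made for illustration, not the balancer's actual code. The sketch divides by configured capacity, so its average differs slightly from the reported DFS Used%, which appears to be based on present capacity.

```python
# Sketch: classify DataNodes the way the balancer log above reports them.
# Figures are copied from the dfsadmin report; the classification rule is an
# assumption based on the usual meaning of the balancer threshold.

THRESHOLD = 10.0  # percent, from "threshold = 10.0" in the balancer parameters

# (configured capacity in bytes, DFS used in bytes) per DataNode
NODES = {
    "host1": (508550971392, 508551745536),
    "host2": (1405207523328, 539944059072),
    "host3": (1405207523328, 539820605618),
}


def utilization(capacity, used):
    """Return DFS used as a percentage of configured capacity."""
    return 100.0 * used / capacity


def classify(nodes, threshold):
    """Label each node relative to the cluster-wide average utilization."""
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_used = sum(used for _, used in nodes.values())
    average = utilization(total_capacity, total_used)
    labels = {}
    for name, (cap, used) in nodes.items():
        pct = utilization(cap, used)
        if pct > average + threshold:
            labels[name] = "over-utilized"
        elif pct < average - threshold:
            labels[name] = "under-utilized"
        elif pct > average:
            labels[name] = "above-average"
        else:
            labels[name] = "below-average"
    return average, labels


if __name__ == "__main__":
    average, labels = classify(NODES, THRESHOLD)
    print(f"cluster average: {average:.2f}%")
    for name, label in labels.items():
        cap, used = NODES[name]
        print(f"{name}: {utilization(cap, used):.2f}% -> {label}")
```

With these figures, host1 comes out as over-utilized, while host2 and host3 fall below the average but stay within the threshold. That is consistent with the log choosing host2 as a belowAvgUtilized target and reporting no underutilized nodes. It does not explain why no block was actually moved.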