Explorer
Posts: 8
Registered: ‎06-14-2016

HDFS re-balance only moves few blocks in one iteration

I have a new CDH 5.7 cluster.

I noticed that the rebalance utility is not functioning right (IMO).

On CDH 4.7, the rebalance moves 10 GB per iteration.

On the new cluster I get something like this:

Decided to move 10 GB bytes from xxx.xxx.xx.17:50010:DISK to xxx.xxx.xx.25:50010:DISK
16/06/13 20:38:19 INFO balancer.Dispatcher: Successfully moved blk_1073795360_54545 with size=92929052 from xxx.xxx.xxx.xxx:50010:DISK to xxx.xxx.xxx.xxx:50010:DISK through xxx.xxx.xxx.24:50010
16/06/13 20:38:21 INFO balancer.Dispatcher: Successfully moved blk_1073795356_54541 with size=90250107 from xxx.xxx.xxx.xxx:50010:DISK to xxx.xxx.xxx.xxx:50010:DISK through xxx.xxx.xxx.20:50010
16/06/13 20:38:21 INFO balancer.Dispatcher: Successfully moved blk_1073795364_54549 with size=97352302 from xxx.xxx.xxx.xxx:50010:DISK to xxx.xxx.xxx.xxx:50010:DISK through xxx.xxx.xxx.22:50010
16/06/13 20:38:22 INFO balancer.Dispatcher: Successfully moved blk_1073795348_54533 with size=134217728 from xxx.xxx.xxx.xxx:50010:DISK to xxx.xxx.xxx.xxx:50010:DISK through xxx.xxx.xxx.20:50010
16/06/13 20:38:22 INFO balancer.Dispatcher: Successfully moved blk_1073795340_54525 with size=134217728 from xxx.xxx.xxx.xxx:50010:DISK to xxx.xxx.xxx.xxx:50010:DISK through xxx.xxx.xxx.22:50010

 

and then a new cycle starts (after a sleep of about 20 seconds).

It moves only about 500 MB per iteration, not 10 GB.
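
To put a number on the "about 500 MB" estimate, here is a minimal sketch that sums the size= fields of the "Successfully moved" lines quoted above. The log lines are trimmed after the size field, and the helper name moved_bytes is only illustrative; it is not part of any Hadoop tool.

```python
import re

# "Successfully moved" lines from the balancer output quoted above,
# trimmed after the size= field (the addresses were redacted anyway).
LOG = """\
16/06/13 20:38:19 INFO balancer.Dispatcher: Successfully moved blk_1073795360_54545 with size=92929052
16/06/13 20:38:21 INFO balancer.Dispatcher: Successfully moved blk_1073795356_54541 with size=90250107
16/06/13 20:38:21 INFO balancer.Dispatcher: Successfully moved blk_1073795364_54549 with size=97352302
16/06/13 20:38:22 INFO balancer.Dispatcher: Successfully moved blk_1073795348_54533 with size=134217728
16/06/13 20:38:22 INFO balancer.Dispatcher: Successfully moved blk_1073795340_54525 with size=134217728
"""


def moved_bytes(log_text):
    """Sum the size= value of every 'Successfully moved' line (hypothetical helper)."""
    sizes = re.findall(r"Successfully moved \S+ with size=(\d+)", log_text)
    return sum(int(size) for size in sizes)


total = moved_bytes(LOG)
print(total, "bytes")                      # 548966917 bytes
print(round(total / 1024**2, 1), "MiB")    # 523.5 MiB
```

The five blocks add up to roughly 524 MiB, which matches the rough figure given for one iteration.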

Other than that, it is working.

 

The only thing I changed is the re-balance bandwidth:

dfs.balance.bandwidthPerSec, dfs.datanode.balance.bandwidthPerSec

Any ideas on how to improve this?

Thank you.

Posts: 1,892
Kudos: 432
Solutions: 302
Registered: ‎07-31-2013

Re: HDFS re-balance only moves few blocks in one iteration

Did you increase the bandwidth or reduce it? If the blocks are not moving fast enough, whether because of bandwidth limits, physical network limits, or low parallelism from the "dfs.datanode.balance.max.concurrent.moves" setting on the DataNodes and the Balancer, then the iteration is designed to end early and retry with newer metadata. This avoids issuing incorrect commands based on stale information after too much time has passed.
Explorer
Posts: 8
Registered: ‎06-14-2016

Re: HDFS re-balance only moves few blocks in one iteration

I increased it to 60 MB/s.
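
For reference, a small sketch of what that setting amounts to. It assumes "60 MB/s" means 60 MiB/s and that the bandwidth property is given in bytes per second. The transfer time is only a lower bound for a single source/target pair running at the full limit, not a prediction of how long the balancer actually takes.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def to_bytes_per_sec(mib_per_sec):
    """Convert a MiB/s figure to the bytes-per-second value a config property would hold."""
    return mib_per_sec * MIB


def min_transfer_seconds(total_bytes, bytes_per_sec):
    """Lower bound on transfer time if the link runs at the bandwidth limit throughout."""
    return total_bytes / bytes_per_sec


bandwidth = to_bytes_per_sec(60)
print(bandwidth)                                            # 62914560
print(round(min_transfer_seconds(10 * GIB, bandwidth), 1))  # 170.7
```

So at this limit, moving the planned 10 GB between one pair of nodes would need close to three minutes at the very least.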

 

I have this:

<property>
<name>dfs.datanode.balance.max.concurrent.moves</name>
<value>300</value>
</property>

in 

Balancer Default Group / Advanced / Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml

Not sure if it's the right place, but this is the only possibility I see.

 

The block movement seems fast. It just moves only a few blocks.

Posts: 1,892
Kudos: 432
Solutions: 302
Registered: ‎07-31-2013

Re: HDFS re-balance only moves few blocks in one iteration

That property needs to go both in the Balancer config (where you have correctly placed it) and in the DataNode config (followed by a restart or rolling restart so that it gets picked up). Changing it in just one place would not help resolve the situation, especially if that place is the Balancer alone, since its requests would then naturally get rejected.
Explorer
Posts: 8
Registered: ‎06-14-2016

Re: HDFS re-balance only moves few blocks in one iteration

Now I have the snippet

<property>
<name>dfs.datanode.balance.max.concurrent.moves</name>
<value>300</value>
</property>

in

DataNode Default Group / Advanced / DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml

and restarted all DataNodes, plus deployed the client configuration.

No effect though...

Explorer
Posts: 8
Registered: ‎06-14-2016

Re: HDFS re-balance only moves few blocks in one iteration

Also, I was expecting this option to appear in hdfs-site.xml, but it didn't. Is this normal?
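
One way to check this without reading the file by eye is a small lookup script. The following is a minimal sketch using only the Python standard library. SAMPLE stands in for the contents of whichever hdfs-site.xml a given process actually reads (its location depends on the deployment and is not assumed here), and get_property is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Stand-in for the contents of an hdfs-site.xml file; in practice, read the
# file that the DataNode or Balancer process was actually started with.
SAMPLE = """<configuration>
  <property>
    <name>dfs.datanode.balance.max.concurrent.moves</name>
    <value>300</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


print(get_property(SAMPLE, "dfs.datanode.balance.max.concurrent.moves"))  # 300
print(get_property(SAMPLE, "dfs.datanode.balance.bandwidthPerSec"))       # None
```

Running the same lookup against the Balancer's and the DataNodes' copies of the file would show whether the property reached both sides.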