Accepted Solution

Can I run the balancer for HDFS?

I use Cloudera CDH 4.0.4.

I run balancing on HBase.

However, I have 10 DataNodes, and only 5 of those servers are used as HBase RegionServers.

As a result, data has become unevenly distributed across the DataNodes.

Could HBase run into problems if I balance with the Hadoop HDFS balancer?

Re: Can I run the balancer for hdfs

Running the HDFS balancer on a cluster with HBase running will not cause
operational problems such as crashes or errors, but it can affect
performance, depending on which blocks the balancer decides to move based
on its space threshold.

The performance impact would come from loss of locality: the blocks of
the HFiles a RegionServer needs may end up on remote DataNodes, so
slightly higher network usage can be observed until the next major
compaction rewrites a block replica locally.

If you'd like to narrow down the time-frame of impact, you can run the HDFS
balancer with the desired balancing threshold, and then once it is
complete, immediately follow up with a major compaction command on your
latency-sensitive HBase tables.
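
As a rough sketch of that sequence (not a tested procedure for your cluster), the two steps could be wrapped in a small bash helper like the one below. The function names, the DRY_RUN switch, the threshold value and the table names are all made up for illustration; the only real commands are `hdfs balancer -threshold` and the HBase shell's `major_compact`. The sketch does not check the balancer's exit status, and `major_compact` only requests the compaction, so the shell returns before the compaction itself has finished.

```shell
#!/usr/bin/env bash
# Sketch: run the HDFS balancer, then request a major compaction on chosen tables.
# All function names and the DRY_RUN switch are hypothetical helpers.

run() {
  # With DRY_RUN=1, print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

compact_table() {
  # Feed a single major_compact command to the HBase shell on stdin.
  local cmd="major_compact '$1'"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "echo \"$cmd\" | hbase shell"
  else
    echo "$cmd" | hbase shell
  fi
}

rebalance_then_compact() {
  # $1 = balancer threshold (percent); remaining args = HBase table names.
  local threshold="$1"
  shift
  # Step 1: balance DataNode usage to within the given threshold.
  run hdfs balancer -threshold "$threshold"
  # Step 2: request a major compaction for each latency-sensitive table.
  local table
  for table in "$@"; do
    compact_table "$table"
  done
}
```

For example, `DRY_RUN=1 rebalance_then_compact 10 my_table` only prints the commands it would run, which is a safe way to review them first; dropping `DRY_RUN=1` runs them for real, as a user allowed to run the balancer.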