Support Questions

avengers · ‎08-23-2018

I use Cloudera CDH 4.0.4.

I run balancing on HBase.

However, I have 10 DataNodes, and only 5 of those servers are used as HBase RegionServers.

As a result, the DataNodes have become imbalanced.

Is there a possibility of problems with HBase when balancing HDFS?

Harsh J · ‎08-23-2018

There will not be any operational problems such as crashes or errors when
running the HDFS balancer on a cluster with HBase running, but there can
be a performance impact, depending on which blocks the balancer decides
to move based on its space thresholds.

The performance impact would come from loss of locality: the blocks of
the HFiles a RegionServer needs may end up on remote DataNodes, so
slightly higher network usage can be observed until the next major
compaction rewrites the data with a local block replica.

If you'd like to narrow down the time-frame of impact, you can run the HDFS
balancer with the desired balancing threshold, and then once it is
complete, immediately follow up with a major compaction command on your
latency-sensitive HBase tables.
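
As a rough sketch of that sequence (not a tested procedure for CDH 4.0.4), the Python helper below builds the two commands and runs them in order. It assumes the `hdfs balancer -threshold` CLI and the HBase shell's `major_compact` command are available on the host; the function names, the default threshold of 10, and the table name `orders` are hypothetical and only for illustration.

```python
import subprocess


def balancer_command(threshold_percent=10.0):
    """Build the HDFS balancer invocation.

    -threshold is the allowed deviation (in percent) of each DataNode's
    utilization from the cluster average. The default here is an assumption.
    """
    return ["hdfs", "balancer", "-threshold", str(threshold_percent)]


def major_compact_script(tables):
    """Build HBase shell input that requests a major compaction per table."""
    lines = ["major_compact '{}'".format(table) for table in tables]
    lines.append("exit")
    return "\n".join(lines) + "\n"


def rebalance_then_compact(tables, threshold_percent=10.0, dry_run=True):
    """Run the balancer, then request major compactions (hypothetical helper).

    With dry_run=True the commands are only printed, never executed.
    """
    steps = [
        (balancer_command(threshold_percent), None),
        (["hbase", "shell"], major_compact_script(tables)),
    ]
    for argv, stdin_text in steps:
        if dry_run:
            print(" ".join(argv))
            continue
        # check=True stops the sequence if a command exits non-zero, so the
        # compaction is skipped when the balancer does not finish cleanly.
        subprocess.run(argv, input=stdin_text, universal_newlines=True, check=True)
    return steps


if __name__ == "__main__":
    # "orders" is a placeholder table name, not one from this thread.
    rebalance_then_compact(["orders"], threshold_percent=10.0, dry_run=True)
```

Note that `major_compact` only requests the compaction; the HBase shell returns before the compaction itself has finished, so locality recovers some time after the script exits.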
