How to keep data locality after a HBase RegionServers rolling restart?

Contributor

Hi,

I noticed that after performing a rolling restart, the data locality for the entire cluster drops to 20%, which is bad; for real-time applications this can be a nightmare.

I've read here that we should switch off the balancer before performing a manual rolling restart on HBase. However, I used the Ambari rolling restart and I didn't see any reference to the balancer in the documentation. Maybe the balancer is not the issue. What is the safest way to perform a rolling restart on all RegionServers while keeping the data locality at least above 75%? Is there any option in Ambari to take care of that before a RegionServer rolling restart?

Another issue I noticed is that some regions split during the rolling restart, but they are a bit far from being full.

Any insights?

Thank you,

Cheers

Pedro

1 ACCEPTED SOLUTION

Master Mentor

@Pedro Gandola Splitting occurs when your regions grow to the maximum size (hbase.hregion.max.filesize) defined in your hbase-site.xml:

http://hbase.apache.org/book.html#disable.splitting
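
The section linked above covers per-table split policies. As a minimal sketch, assuming the HBase 1.x Java client and a hypothetical table called 'mytable', the ConstantSizeRegionSplitPolicy that the follow-up reply below mentions could be applied to an existing table roughly like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy;

public class ApplyConstantSizeSplitPolicy {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            TableName table = TableName.valueOf("mytable"); // hypothetical table name
            // Fetch the current descriptor, switch its split policy, and push the change back.
            HTableDescriptor desc = admin.getTableDescriptor(table);
            desc.setRegionSplitPolicyClassName(ConstantSizeRegionSplitPolicy.class.getName());
            // Depending on your HBase version you may need to disable the table first.
            admin.modifyTable(table, desc);
        }
    }
}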

When you run a major compaction, data locality is restored. Run major compactions on a busy system during off-peak hours.
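
A minimal sketch of kicking that off from the HBase 1.x Java client (the table name 'mytable' is a placeholder); the call just queues the compaction, so the actual work still runs on the RegionServers, which is why off-peak hours matter:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TriggerMajorCompaction {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Asynchronous: queues a major compaction for every region of the table,
            // which rewrites the HFiles locally and thereby restores data locality.
            admin.majorCompact(TableName.valueOf("mytable")); // placeholder table name
        }
    }
}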

The balancer distributes regions across the cluster and runs every 5 minutes by default; do not turn it off. You can implement your own balancer and replace the default StochasticLoadBalancer class, but that is not recommended unless you know what you're doing.
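
The original question mentions switching the balancer off around a manual rolling restart; for completeness, here is a minimal sketch of that switch with the HBase 1.x Admin API (the same toggle the hbase shell's balance_switch command flips). This is only a sketch of the manual approach, not what Ambari does internally, and the balancer should be re-enabled as soon as the restart is finished:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class BalancerSwitchDuringRestart {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Turn the balancer off and wait for any in-flight balancing run to finish.
            boolean wasEnabled = admin.setBalancerRunning(false, true);
            try {
                // ... restart the RegionServers one at a time here ...
            } finally {
                // Restore the balancer to its previous state once the restart is done.
                admin.setBalancerRunning(wasEnabled, true);
            }
        }
    }
}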

Another option is to enable read replicas, so essentially you're duplicating data on a different RegionServer. The secondary replicas are read-only and maximize your data availability.
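
A minimal sketch of what that looks like with the HBase 1.x Java client, assuming a hypothetical table 'mytable' with column family 'cf': the table is created with two region replicas, and a read that is allowed to be served by the read-only secondary uses TIMELINE consistency.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadReplicaSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            TableName name = TableName.valueOf("mytable"); // placeholder table name
            // One primary plus one read-only secondary replica per region.
            HTableDescriptor desc = new HTableDescriptor(name);
            desc.setRegionReplication(2);
            desc.addFamily(new HColumnDescriptor("cf")); // placeholder column family
            admin.createTable(desc);

            // A read that may be served by the secondary replica if the primary is slow or down.
            try (Table table = connection.getTable(name)) {
                Get get = new Get(Bytes.toBytes("row1")); // placeholder row key
                get.setConsistency(Consistency.TIMELINE);
                Result result = table.get(get);
                System.out.println("stale read? " + result.isStale());
            }
        }
    }
}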

All in all, it's more art than science, and you need to experiment with many HBase properties to get the best result.

11 REPLIES

New Contributor

@Pedro Gandola Hi, did you solve this issue using ConstantSizeRegionSplitPolicy?

Contributor

Hi @Minwoo Kang, Yes, that solved the problem.