How to avoid HBase region splits every time a RegionServer is restarted?

I'm the administrator of a cluster with over 100 HBase RegionServers and 50+ tables that receive bulk loads and batch puts from Spark batch, streaming, and MapReduce applications. Every time we do maintenance on the RegionServers, we notice that most of the tables get a new split after the RegionServers start again, which adds to the overall region count. From HBase blogs, I see that having over 200 regions per RegionServer is not recommended. We run HDP 2.6.5, so HBase 1.1.2. My questions are:

1. What can I, as an admin, do to avoid these costly splits?

2. Should I try addressing this at the table property level, for example with the split policy or compression? [Note: even with the constant split policy, I see regions split before they hit the max HFile limit.]

3. Is this addressed in HBase 2.0 or later? Our capacity planning for adding or removing RegionServers depends on the number of regions in the cluster.

Thanks in advance!

Happy Hadooping :)

Maha