
How to avoid HBase region splits every time a RegionServer is restarted?


I administer an HBase cluster with over 100 RegionServers (RS) and 50+ tables. The tables receive bulk loads and batch puts from Spark batch, streaming, and MapReduce applications. Every time we do maintenance on the RegionServers, we notice that most of the tables get a new split after the RegionServers start, which adds to the overall region count. From HBase blogs, I understand that having more than 200 regions per RS is not recommended. We run HDP 2.6.5, so HBase 1.1.2. My questions are:

1. What can I, as an admin, do to avoid these costly splits?

2. Should I try addressing this at the table-property level, for example through the split policy or compression? [Note: even with the constant-size split policy, I see regions split before they reach the max HFile size limit.]

3. Is this addressed in HBase 2.0 or later? Our capacity planning for adding or removing RegionServers depends on the number of regions in the cluster.
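
To make the capacity-planning concern concrete, the regions-per-RS guideline can be turned into a quick check. This is only a minimal sketch: the function names are hypothetical, the total of 25,000 regions is an illustrative figure and not a measurement from this cluster, and the 200-regions-per-RS ceiling is the guideline quoted from the HBase blogs rather than a hard limit.

```python
import math


def regions_per_server(total_regions: int, region_servers: int) -> float:
    """Average number of regions hosted by each RegionServer."""
    if region_servers <= 0:
        raise ValueError("region_servers must be positive")
    return total_regions / region_servers


def min_servers_needed(total_regions: int, max_regions_per_server: int = 200) -> int:
    """Smallest RegionServer count that keeps the average at or below the ceiling."""
    if max_regions_per_server <= 0:
        raise ValueError("max_regions_per_server must be positive")
    return math.ceil(total_regions / max_regions_per_server)


if __name__ == "__main__":
    total_regions = 25_000   # hypothetical cluster-wide region count
    region_servers = 100     # cluster size of roughly 100 RegionServers

    print(regions_per_server(total_regions, region_servers))  # 250.0
    print(min_servers_needed(total_regions))                   # 125
```

With these illustrative numbers, every unplanned split pushes the average further above the guideline, which is why the region count feeds directly into the decision to add or remove RegionServers.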

Thanks in advance..!

Happy Hadooping :)

