Created 02-03-2016 09:27 PM
Our table has one column family with a TTL of 15 days, so rows expire on a consistent basis. We are seeing that the number of regions keeps going up; somehow the regions are not getting re-used. We are currently at over 41k regions, of which 19k are empty. Any pointers as to why regions are not getting reused? Our rowkey design is similar to what you mentioned: a 2-digit salt (0 to 63), followed by the hashcode of a reverse timestamp in seconds, followed by a service id, and finally a counter.
Our cluster has 41 nodes and we are writing rows at a rate of 4k to 6k TPS. The average row size is about 35 KB.
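For reference, here is a minimal sketch of how a rowkey with that layout might be assembled with the HBase `Bytes` utility; the salt derivation, field names, and method signature are assumptions based on the description above, not our actual code:

```java
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyBuilder {

    // Layout described above: 2-digit salt (00-63) + hashcode of reverse
    // timestamp in seconds + service id + counter. The salt derivation here
    // is illustrative only; any stable 0-63 bucketing scheme would do.
    public static byte[] buildRowKey(long epochSeconds, String serviceId, long counter) {
        long reverseTs = Long.MAX_VALUE - epochSeconds;  // reverse timestamp (seconds)
        int tsHash = Long.hashCode(reverseTs);           // hashcode of the reverse timestamp
        int salt = Math.floorMod(tsHash, 64);            // 64 buckets -> "00".."63"
        byte[] prefix = Bytes.toBytes(String.format("%02d", salt));
        return Bytes.add(
                Bytes.add(prefix, Bytes.toBytes(tsHash)),
                Bytes.add(Bytes.toBytes(serviceId), Bytes.toBytes(counter)));
    }
}
```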
Created 02-03-2016 09:42 PM
HBase regions are defined by a start and end rowkey. Only data that falls within that range is stored in that region.
I would guess that you are constantly inserting new data into the beginning of each "bucket" (defined by your 2 salt characters). Eventually, this region will get so large that it will split into two daughters. One will contain the "top" half of the data and the other the "bottom". After that, all of the new data will be inserted into the "upper" daughter region. Then the process repeats. (The same analogy works if you're appending data to the "buckets" instead of inserting at the head.)
You likely will want to use the `merge_region` HBase shell command to merge away some of the older regions. You could also set the split threshold very high and prevent any future splits, but this would likely have a negative impact on performance (or you would need to drastically increase the number of buckets -- more salt digits -- to spread out the load evenly).
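If merging that many regions by hand is impractical, the same operation is exposed through the Java `Admin` API and can be scripted. A rough sketch, assuming an HBase 1.x client and a hypothetical table name (picking out which adjacent regions to merge -- e.g. the empty, aged-out ones -- is left out here):

```java
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MergeOldRegions {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            // Regions come back in rowkey order; merge an adjacent pair.
            // In practice you'd first select the empty/expired regions.
            List<HRegionInfo> regions = admin.getTableRegions(TableName.valueOf("my_table"));
            HRegionInfo a = regions.get(0);
            HRegionInfo b = regions.get(1);
            admin.mergeRegions(a.getEncodedNameAsBytes(), b.getEncodedNameAsBytes(), false);
        }
    }
}
```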
Created 02-03-2016 10:14 PM
Merging 19k regions manually is a PITA @Josh Elser
Created 02-03-2016 10:25 PM
There is a new feature called the "region normalizer" that runs inside the master and can do the merges automatically. It is available (but turned off by default) in HDP-2.3+. However, it is a very new feature, so I would say it is still in tech preview mode.
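Once you are on a version that ships it, the normalizer is toggled from the shell with `normalizer_switch true`, or through the Java `Admin` API. A small sketch, assuming a 1.2-era client that exposes the normalizer methods:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TurnOnNormalizer {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            admin.setNormalizerRunning(true);  // the normalizer chore is off by default
            admin.normalize();                 // optionally kick off a run immediately
        }
    }
}
```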
Created 02-03-2016 10:26 PM
Yep, looking forward to trying it @Enis
Created 02-04-2016 06:58 PM
Would there be a negative impact if the salt was, say, 5 digits?
How does increasing the split file size from 10GB to 40GB, or even 100GB, affect performance?
If you have 12 disks (4TB each) per node in a 40-node cluster and you want to store 400TB, would you say ~1024 regions per node with 10GB regions ((400*1024)/(40*10)) is better, or ~103 regions per node with 100GB regions?
Created 02-04-2016 08:02 PM
You should aim for regions of at most 10-20GB; anything above that causes compactions to increase the write amplification. You should also keep fewer than 1000 regions per server.
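The split size can also be set per table rather than cluster-wide via `hbase.hregion.max.filesize`. A minimal sketch, assuming the 1.x `HTableDescriptor` API and a hypothetical table name (20GB here, the top of that range):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class SetRegionSplitSize {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("my_table");
            HTableDescriptor desc = admin.getTableDescriptor(table);
            desc.setMaxFileSize(20L * 1024 * 1024 * 1024);  // split regions at ~20GB
            admin.modifyTable(table, desc);                  // table-level override of hbase.hregion.max.filesize
        }
    }
}
```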
I suggest looking into your data model and making the necessary adjustments to make sure you are not doing the anti-pattern of incremental writes. See https://hbase.apache.org/book.html#table_schema_rules_of_thumb.