GGJ
New Contributor
Posts: 3
Registered: 03-02-2015

What is the relation between HBase region size and read/write latencies?

I have read online (http://hbase.apache.org/0.94/book/important_configurations.html#bigger.regions) that having too many regions is also a cause of poor latencies. Has anyone seen this?

- As I understand it, hbase.hregion.max.filesize determines how large regions can grow: the smaller the value, the more frequently regions split and the more regions there are. Is this setting used only on the region servers, or also on the master server? What does the HBase master server use hbase.hregion.max.filesize for?

- The HBase documentation points to an online merge utility (online_merge.rb), attached to https://issues.apache.org/jira/browse/HBASE-1621, for doing online merges. Does anyone have experience using this tool?

Posts: 1,894
Kudos: 433
Solutions: 303
Registered: 07-31-2013

Re: What is the relation between HBase region size and read/write latencies?

that having too many regions is also cause of poor latencies. Has anyone seen this?

 

With too many regions, things can get slower because more connection overhead is needed to find and scan data across the table. On the other hand, smaller regions can also give faster random reads. So the first question should be: latency of what? Scans? Gets? Puts?
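
To make the scan-versus-get distinction a bit more concrete, here is a rough toy model in Python, not HBase code. It assumes regions are described only by their sorted start keys and that keys compare like plain ASCII strings; the function name regions_touched and the sample keys are made up for illustration. It counts how many regions a key range overlaps, which is a stand-in for how many regions a scan may have to visit.

```python
from bisect import bisect_left, bisect_right

def regions_touched(region_start_keys, scan_start, scan_stop):
    """Count regions overlapping the key range [scan_start, scan_stop).

    region_start_keys is a sorted list of region start keys. The first
    region starts at the empty key "" and each region ends where the
    next one begins (toy model, hypothetical helper).
    """
    if scan_stop <= scan_start:
        return 0
    # Region that contains scan_start.
    first = bisect_right(region_start_keys, scan_start) - 1
    # Last region that contains some key strictly below scan_stop.
    last = bisect_left(region_start_keys, scan_stop) - 1
    return last - first + 1

# Same key space, split into few large regions or many small ones.
few_regions = ["", "g", "n", "t"]
many_regions = [""] + list("bcdefghijklmnopqrstuvwxyz")

wide_scan_few = regions_touched(few_regions, "a", "z")
wide_scan_many = regions_touched(many_regions, "a", "z")

# A get behaves like a scan over a single row key.
single_get_few = regions_touched(few_regions, "k", "k\x00")
single_get_many = regions_touched(many_regions, "k", "k\x00")

print(wide_scan_few, wide_scan_many, single_get_few, single_get_many)
```

In this model a wide scan visits more regions as the table is split more finely, while a single-row get always lands in exactly one region. That is only the shape of the trade-off; real latencies also depend on caching, store file counts, and cluster load.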

 

Is this setting used only on region server or also on master server?

 

It's used only by the RegionServers (splits happen at the RegionServers). The master does load the property to sanity-check table descriptors, but it does not use the value to carry out splits.
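
For a feel of how the setting drives region count, here is a back-of-the-envelope sketch in Python. It is an assumption-laden estimate, not how HBase computes anything: it assumes a region splits in half once it passes hbase.hregion.max.filesize, so settled regions end up somewhere between half the limit and the full limit, and it ignores pre-splitting, multiple column families, and the configured split policy. The function name and the 1 TiB table size are made up for illustration.

```python
import math

def estimate_region_count(table_size_bytes, max_filesize_bytes):
    """Rough (fewest, most) region counts for a table of the given size.

    Assumption: regions settle between max_filesize / 2 and max_filesize
    in size, because a region splits in two once it passes the limit.
    """
    if table_size_bytes <= 0 or max_filesize_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Fewest regions: every region is as full as the limit allows.
    fewest = max(1, math.ceil(table_size_bytes / max_filesize_bytes))
    # Most regions: every region has just split and is half full.
    most = max(1, math.ceil(table_size_bytes / (max_filesize_bytes / 2)))
    return fewest, most

GIB = 1024 ** 3
table_size = 1024 * GIB  # hypothetical 1 TiB table

for max_gib in (1, 10):
    low, high = estimate_region_count(table_size, max_gib * GIB)
    print(f"max.filesize={max_gib} GiB -> roughly {low} to {high} regions")
```

The point is only the proportionality: cutting the limit by a factor of ten raises the estimated region count by about the same factor.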

 

Does anyone have experience in using this tool?

 

CDH5 has a more direct command you can use: http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#_online_region_merges
