<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Total region count in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123335#M86079</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/133/ledel.html" nodeid="133"&gt;@Laurent Edel&lt;/A&gt; - Thanks, I had not considered that splitting does not always create two 10G regions.&lt;/P&gt;&lt;P&gt;I am using HBase 0.98. So, if I set ConstantSizeRegionSplitPolicy through the HBase shell, can I then assume that regions will always be 10G in size?&lt;/P&gt;</description>
    <pubDate>Sat, 30 Apr 2016 14:02:12 GMT</pubDate>
    <dc:creator>sumit_nigam</dc:creator>
    <dc:date>2016-04-30T14:02:12Z</dc:date>
    <item>
      <title>Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123331#M86075</link>
      <description>&lt;P&gt;I notice the following line in my region server logs:&lt;/P&gt;&lt;P&gt;2016-04-27 12:11:11,924 WARN [MemStoreFlusher.1] regionserver.CompactSplitThread: Total number of regions is approaching the upper limit 1000. Please consider taking a look at &lt;A href="http://hbase.apache.org/book.html#ops.regionmgt" target="_blank"&gt;http://hbase.apache.org/book.html#ops.regionmgt&lt;/A&gt;&lt;/P&gt;&lt;P&gt;And also:&lt;/P&gt;&lt;P&gt;2016-04-27 16:31:47,799 INFO [regionserver54130] regionserver.HRegionServer: Waiting on &lt;STRONG&gt;4007&lt;/STRONG&gt; regions to close&lt;/P&gt;&lt;P&gt;This is surprising because I do not have that much data. Given that the default value of hbase.hregion.max.filesize is 10G, 4007 regions would imply about 40TB of data. That is more than the combined size of all my disks.&lt;/P&gt;&lt;P&gt;Does this mean that many empty regions are being created? If so, why? Is there any performance implication to carrying these empty regions around? One implication, presumably, is that so many file descriptors are used up. Can I get rid of them?&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2016 00:04:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123331#M86075</guid>
      <dc:creator>sumit_nigam</dc:creator>
      <dc:date>2016-04-30T00:04:49Z</dc:date>
    </item>
    <item>
      <title>Re: Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123332#M86076</link>
      <description>&lt;P&gt;You can check how many regions you have from the HBase Master's web UI. A good rule of thumb is to keep the number of regions per RegionServer under 1000. You can also inspect the start and end keys of regions, and the region sizes, from the Master web UI or by going to a RegionServer's web UI and checking the various tabs.&lt;/P&gt;&lt;P&gt;HBase splits regions based on range boundaries of the keyspace. HBase ALWAYS does range-based splitting, not hash-based splitting. This means that, depending on your key design, you may be temporarily hotspotting some parts of the keyspace, causing excessive region splits. It is likely that you have a timeseries-based key design that you need to revisit. You can check out the HBase book, and there are also presentations available that talk about row key and schema design.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2016 01:26:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123332#M86076</guid>
      <dc:creator>Enis</dc:creator>
      <dc:date>2016-04-30T01:26:07Z</dc:date>
    </item>
    <item>
      <title>Re: Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123333#M86077</link>
      <description>&lt;P&gt;Let's add that the 10GB size is not fixed, since the default split policy is IncreasingToUpperBoundRegionSplitPolicy and not ConstantSizeRegionSplitPolicy (but you can set the latter by altering the table from the HBase shell, for example).
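To illustrate, here is a rough Python sketch of how the split threshold grows under IncreasingToUpperBoundRegionSplitPolicy. It assumes the rule described for HBase 0.96 and later (threshold = min(hbase.hregion.max.filesize, 2 x memstore flush size x R^3), with R the number of regions of the same table on one RegionServer) and default values of 128MB for the flush size and 10GB for the max file size. The function name split_threshold is made up for this example, so check the policy source of your version before relying on the numbers.

```python
# Sketch of the split threshold under IncreasingToUpperBoundRegionSplitPolicy.
# Assumption: threshold = min(max_filesize, 2 * flush_size * regions^3),
# where "regions" counts regions of the same table on one RegionServer.
# The defaults below (128MB flush size, 10GB max file size) are assumed too.
MB = 1024 * 1024
GB = 1024 * MB


def split_threshold(regions_on_server, flush_size=128 * MB, max_filesize=10 * GB):
    """Return the store size in bytes at which a region would be split."""
    if regions_on_server == 0:
        # No region of this table on the server yet: fall back to the cap.
        return max_filesize
    return min(max_filesize, 2 * flush_size * regions_on_server ** 3)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "region(s): split at", split_threshold(n) // MB, "MB")
```

With these assumed defaults the first split happens at 256MB, and the 10GB cap only applies once a RegionServer hosts four or more regions of the table.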
This means that you can't estimate the HDFS size with simple maths from the number of regions and the region size parameter.&lt;/P&gt;&lt;P&gt;Regardless of the policy being used, a 10GB region that has just split doesn't give you two 10GB regions.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2016 03:35:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123333#M86077</guid>
      <dc:creator>ledel</dc:creator>
      <dc:date>2016-04-30T03:35:23Z</dc:date>
    </item>
    <item>
      <title>Re: Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123334#M86078</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/372/enis.html" nodeid="372"&gt;@Enis&lt;/A&gt; - I have salted row keys, so I am hopeful that the region servers should not hotspot.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2016 14:00:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123334#M86078</guid>
      <dc:creator>sumit_nigam</dc:creator>
      <dc:date>2016-04-30T14:00:02Z</dc:date>
    </item>
    <item>
      <title>Re: Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123335#M86079</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/133/ledel.html" nodeid="133"&gt;@Laurent Edel&lt;/A&gt; - Thanks, I had not considered that splitting does not always create two 10G regions.&lt;/P&gt;&lt;P&gt;I am using HBase 0.98. So, if I set ConstantSizeRegionSplitPolicy through the HBase shell, can I then assume that regions will always be 10G in size?&lt;/P&gt;</description>
      <pubDate>Sat, 30 Apr 2016 14:02:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123335#M86079</guid>
      <dc:creator>sumit_nigam</dc:creator>
      <dc:date>2016-04-30T14:02:12Z</dc:date>
    </item>
    <item>
      <title>Re: Total region count</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123336#M86080</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/5605/sumitnigam.html" nodeid="5605"&gt;@Sumit Nigam&lt;/A&gt; Yes, well, it will split at 10G. So you won't end up with 10G regions; it just means that a region will split when it reaches 10G.&lt;/P&gt;</description>
      <pubDate>Mon, 02 May 2016 14:41:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Total-region-count/m-p/123336#M86080</guid>
      <dc:creator>ledel</dc:creator>
      <dc:date>2016-05-02T14:41:10Z</dc:date>
    </item>
  </channel>
</rss>

