<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: I'm getting RegionTooBusyException when trying to import data into hbase in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115756#M78550</link>
    <description>&lt;P&gt;Is it possible to do the bulk import if the versions of hbase differ? The old cluster has hbase 0.94 while the new one has 1.1.2&lt;/P&gt;</description>
    <pubDate>Thu, 09 Jun 2016 20:36:05 GMT</pubDate>
    <dc:creator>kljuka</dc:creator>
    <dc:date>2016-06-09T20:36:05Z</dc:date>
    <item>
      <title>I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115749#M78543</link>
      <description>&lt;P&gt;
	When I try to import data using the command:&lt;/P&gt;&lt;PRE&gt; hbase -Dhbase.import.version=0.94 'tablename' /hdfs/exported-data&lt;/PRE&gt;&lt;P&gt;After a while I get a RegionTooBusyException. This is the MapReduce job summary:&lt;/P&gt;&lt;BLOCKQUOTE&gt;Total maps: 14711
Complete: 526
Failed: 130
Killed: 380
Successful: 526

&lt;/BLOCKQUOTE&gt;&lt;P&gt;What I see in the console is&lt;/P&gt;&lt;PRE&gt;2016-06-02 11:05:19,632 INFO  [main] mapreduce.Job: Task Id : attempt_1464792187762_0003_m_000347_0, Status : FAILED
Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: RegionTooBusyException: 1 time,
        at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)

...&lt;/PRE&gt;&lt;P&gt;What is causing this? I suspect that HBase compactions might be making the region servers unresponsive. How do I solve it?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jun 2016 16:25:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115749#M78543</guid>
      <dc:creator>kljuka</dc:creator>
      <dc:date>2016-06-02T16:25:47Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115750#M78544</link>
      <description>&lt;P&gt;Can you share the region server logs so we can check why the RegionTooBusyException occurred? If you suspect major compaction is the cause, you can disable automatic major compactions by configuring the property below.&lt;/P&gt;&lt;PRE&gt; &amp;lt;property&amp;gt;
&amp;lt;name&amp;gt;hbase.hregion.majorcompaction&amp;lt;/name&amp;gt;
&amp;lt;value&amp;gt;0&amp;lt;/value&amp;gt;
&amp;lt;description&amp;gt;The time (in milliseconds) between 'major' compactions of all
HStoreFiles in a region.  Default: 1 day.
Set to 0 to disable automated major compactions.
&amp;lt;/description&amp;gt;
&amp;lt;/property&amp;gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 02 Jun 2016 16:41:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115750#M78544</guid>
      <dc:creator>rchintaguntla</dc:creator>
      <dc:date>2016-06-02T16:41:27Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115751#M78545</link>
      <description>&lt;P&gt;Which version of HDP are you using?&lt;/P&gt;&lt;P&gt;I am currently porting over this JIRA, which would show us more information:&lt;/P&gt;&lt;P&gt;HBASE-15931 Add log for long-running tasks in AsyncProcess&lt;/P&gt;&lt;P&gt;How large is your region size?&lt;/P&gt;&lt;P&gt;Did you monitor your region servers to see which ones were the hot spots during the import?&lt;/P&gt;&lt;P&gt;Please pastebin more of the error / stack trace.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jun 2016 22:52:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115751#M78545</guid>
      <dc:creator>tyu</dc:creator>
      <dc:date>2016-06-02T22:52:43Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115752#M78546</link>
      <description>&lt;P&gt;Please verify that regions of your table are evenly distributed across the servers.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jun 2016 22:53:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115752#M78546</guid>
      <dc:creator>tyu</dc:creator>
      <dc:date>2016-06-02T22:53:32Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115753#M78547</link>
      <description>&lt;P&gt;Two major reasons for &lt;STRONG&gt;RegionTooBusyException&lt;/STRONG&gt;:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Failure to acquire the region lock (look for "failed to get a lock in" in the map task log)&lt;/LI&gt;&lt;LI&gt;The region memstore is above its limit and flushes cannot keep up with the load (look for "Above memstore limit")&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;To mitigate 1, you can increase the maximum busy wait timeout &lt;STRONG&gt;hbase.ipc.client.call.purge.timeout&lt;/STRONG&gt; in ms (default is 120000) directly, but do not forget to increase &lt;STRONG&gt;hbase.rpc.timeout&lt;/STRONG&gt; accordingly (set it to the same value).&lt;/P&gt;&lt;P&gt;To mitigate 2, you can increase &lt;STRONG&gt;hbase.hregion.memstore.block.multiplier&lt;/STRONG&gt; from the default (4) to some higher value.&lt;/P&gt;&lt;P&gt;But the best option for you is to use the bulk import option:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;-Dimport.bulk.output=/path/for/output&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;followed by the &lt;STRONG&gt;completebulkload&lt;/STRONG&gt; tool.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;See: &lt;/STRONG&gt;&lt;A href="https://hbase.apache.org/book.html#arch.bulk.load.complete" target="_blank"&gt;https://hbase.apache.org/book.html#arch.bulk.load.complete&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jun 2016 02:54:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115753#M78547</guid>
      <dc:creator>vrodionov</dc:creator>
      <dc:date>2016-06-03T02:54:43Z</dc:date>
    </item>
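    <!-- Editor's note: the bulk import workflow described in the reply above can be sketched as the following shell session. The paths /tmp/hfiles and /hdfs/exported-data and the table name 'tablename' are hypothetical placeholders carried over from the original question; the class names are the standard HBase 1.x MapReduce tools and must be run against a live cluster.

```shell
# Step 1: run the Import job in bulk-output mode. Instead of issuing puts over
# RPC (the write path that can throw RegionTooBusyException), this writes
# HFiles directly to an HDFS staging directory.
hbase org.apache.hadoop.hbase.mapreduce.Import \
  -Dhbase.import.version=0.94 \
  -Dimport.bulk.output=/tmp/hfiles \
  'tablename' /hdfs/exported-data

# Step 2: hand the generated HFiles to the region servers. This only moves
# files into place, so it bypasses the memstore and compaction pressure
# entirely.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfiles 'tablename'
```

    As noted later in the thread, the hbase user must be able to read and write the staging directory, or the second step will appear to hang. -->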
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115754#M78548</link>
      <description>&lt;P&gt;Totally agree re bulk import. One additional point: you need to ensure the &lt;STRONG&gt;hbase&lt;/STRONG&gt; user has access to read/write the files created by the &lt;STRONG&gt;-Dimport.bulk.output&lt;/STRONG&gt; step. If it doesn't, the &lt;STRONG&gt;completebulkload&lt;/STRONG&gt; step will appear to hang.&lt;/P&gt;&lt;P&gt;The simplest way to achieve this is to do:&lt;/P&gt;&lt;PRE&gt;hdfs dfs -chmod -R 777 &amp;lt;dir containing export files&amp;gt;&lt;/PRE&gt;&lt;P&gt;as the owner of those files. &lt;STRONG&gt;completebulkload&lt;/STRONG&gt;, running as &lt;STRONG&gt;hbase&lt;/STRONG&gt;, simply moves these files to the relevant HBase directories. With the permissions correctly set, this takes fractions of a second.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jun 2016 07:43:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115754#M78548</guid>
      <dc:creator>StevenONeill</dc:creator>
      <dc:date>2016-06-03T07:43:52Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115755#M78549</link>
      <description>&lt;P&gt;It wasn't a problem with compactions but with the &lt;STRONG&gt;number of map tasks&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;The solution was to change the YARN scheduler from Memory (the default) to CPU:
inside the Ambari interface (I'm using Apache Ambari 2.2.2.0): YARN -&amp;gt; Configs -&amp;gt; Enable CPU Node Scheduling.&lt;/P&gt;&lt;P&gt;It's also possible to find that setting in Hadoop's capacity-scheduler.xml:&lt;/P&gt;&lt;PRE&gt; &amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.scheduler.capacity.resource-calculator&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;org.apache.hadoop.yarn.util.resource.DominantResourceCalculator&amp;lt;/value&amp;gt;
  &amp;lt;/property&amp;gt;
&lt;/PRE&gt;&lt;P&gt;What really happened:&lt;/P&gt;&lt;P&gt;The cluster consists of 20 nodes that together have more than 1 TB of RAM.
YARN has 800 GB of RAM available for jobs. Because YARN uses memory to calculate the number of containers, it assigned about 320 containers for map tasks (800 GB / 2.5 GB per MapReduce2 task = 320 tasks!). This effectively flooded our own servers with processes and requests.&lt;/P&gt;&lt;P&gt;After switching to CPU-based capacity scheduling, YARN changed its formula for the number of containers to 20 nodes * 6 virtual cores = 120 processes, which is much more manageable (and works fine for now).&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2016 19:47:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115755#M78549</guid>
      <dc:creator>kljuka</dc:creator>
      <dc:date>2016-06-09T19:47:04Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115756#M78550</link>
      <description>&lt;P&gt;Is it possible to do the bulk import if the versions of hbase differ? The old cluster has hbase 0.94 while the new one has 1.1.2&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2016 20:36:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115756#M78550</guid>
      <dc:creator>kljuka</dc:creator>
      <dc:date>2016-06-09T20:36:05Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115757#M78551</link>
      <description>&lt;P&gt;I believe so, yes. The &lt;STRONG&gt;-Dimport.bulk.output&lt;/STRONG&gt; step can be performed on the target cluster. This will prepare the HBase files according to the target version, number of region servers, etc.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jun 2016 05:03:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115757#M78551</guid>
      <dc:creator>StevenONeill</dc:creator>
      <dc:date>2016-06-10T05:03:01Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115758#M78552</link>
      <description>&lt;P&gt;I have hit the exact same problem before, and it took me a long time to solve.&lt;/P&gt;&lt;P&gt;Basically this error means the HBase region server is overloaded by too many parallel write threads.&lt;/P&gt;&lt;P&gt;A bulk load can also saturate the memstore. Since HBase does not have good back pressure, applications that write into HBase need to control their QPS.&lt;/P&gt;&lt;P&gt;In my scenario, I was using a Spark bulk load to write into HBase, which overloaded the region servers.&lt;/P&gt;&lt;P&gt;There are a few ways that can potentially solve this problem:&lt;/P&gt;&lt;P&gt;1. Pre-split the HBase table so multiple region servers can handle the writes.&lt;/P&gt;&lt;P&gt;2. Tune down the number of RDD partitions in Spark right before calling the bulk load. This reduces the parallel writer threads from the Spark executors.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Mar 2017 23:26:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115758#M78552</guid>
      <dc:creator>linehrr</dc:creator>
      <dc:date>2017-03-22T23:26:06Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115759#M78553</link>
      <description>&lt;P&gt;I have faced the exact same issue when trying to import around 2 TB of data into HBase.&lt;/P&gt;&lt;P&gt;The following steps can solve the issue:&lt;/P&gt;&lt;P&gt;1. Increase &lt;STRONG&gt;hbase.hregion.memstore.block.multiplier&lt;/STRONG&gt; to 8.&lt;/P&gt;&lt;P&gt;2. Increase the percentage of the RegionServer heap allocated to write buffers from 40% to 60%.&lt;/P&gt;&lt;P&gt;3. Pre-split the HBase table using the start keys of the same table that may exist on another cluster, with the command below:&lt;/P&gt;&lt;PRE&gt;create '&amp;lt;HbaseTableName&amp;gt;',{ NAME =&amp;gt; '&amp;lt;ColumnFamily&amp;gt;', COMPRESSION =&amp;gt; '&amp;lt;Compression&amp;gt;'}, SPLITS =&amp;gt; ['&amp;lt;startkey1&amp;gt;','&amp;lt;startkey2&amp;gt;','&amp;lt;startkey3&amp;gt;','&amp;lt;startkey4&amp;gt;','&amp;lt;startkey5&amp;gt;']&lt;/PRE&gt;&lt;P&gt;Pre-splitting enables multiple region servers to handle the writes concurrently.&lt;/P&gt;&lt;P&gt;Note: this issue basically appears due to bulk writes to HBase.&lt;/P&gt;</description>
      <pubDate>Wed, 29 May 2019 20:25:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115759#M78553</guid>
      <dc:creator>rajeshujjain05</dc:creator>
      <dc:date>2019-05-29T20:25:27Z</dc:date>
    </item>
    <item>
      <title>Re: I'm getting RegionTooBusyException when trying to import data into hbase</title>
      <link>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115760#M78554</link>
      <description>&lt;P&gt;Really helpful. Worked for my production system.&lt;/P&gt;</description>
      <pubDate>Fri, 31 May 2019 15:27:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/I-m-getting-RegionTooBusyException-when-trying-to-import/m-p/115760#M78554</guid>
      <dc:creator>Prachi</dc:creator>
      <dc:date>2019-05-31T15:27:21Z</dc:date>
    </item>
  </channel>
</rss>

