<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Why did the HDFS disk utilization skyrocket from a few hundred GB to over 7 TB in a matter of hours while ingesting data into HBase via OpenTSDB? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99892#M12966</link>
    <description>&lt;P&gt;This issue is tracked on &lt;A href="https://issues.apache.org/jira/browse/HBASE-16288" target="_blank"&gt;https://issues.apache.org/jira/browse/HBASE-16288&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 27 Jul 2016 03:21:58 GMT</pubDate>
    <dc:creator>ssingla</dc:creator>
    <dc:date>2016-07-27T03:21:58Z</dc:date>
    <item>
      <title>Why did the HDFS disk utilization skyrocket from a few hundred GB to over 7 TB in a matter of hours while ingesting data into HBase via OpenTSDB?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99890#M12964</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Background:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Customer has an 8-node cluster on AWS with ephemeral storage, 5 of which run HBase.&lt;/P&gt;&lt;P&gt;OpenTSDB and Grafana were installed on the cluster as well.&lt;/P&gt;&lt;P&gt;Customer was ingesting time series data with OpenTSDB at a rate of ~50k records/second.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Symptom:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;In a span of a couple of hours, the HDFS disk utilization skyrocketed from a few hundred GB to over 6 TB, all of it in HBase/OpenTSDB.&lt;/P&gt;&lt;P&gt;While troubleshooting, we turned off all data ingest and stopped OpenTSDB; with only HBase running, the disk utilization continued to grow unabated and out of control, by dozens of GB per minute, even though OpenTSDB was completely shut down.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Dec 2015 21:41:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99890#M12964</guid>
      <dc:creator>adaher</dc:creator>
      <dc:date>2015-12-17T21:41:44Z</dc:date>
    </item>
    <item>
      <title>Re: Why did the HDFS disk utilization skyrocket from a few hundred GB to over 7 TB in a matter of hours while ingesting data into HBase via OpenTSDB?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99891#M12965</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Root Cause -&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1. Millisecond timestamps were used for the OpenTSDB metrics, which can generate over 32,000 columns per row in one hour. Each millisecond-resolution column uses roughly 4 bytes in the row's index entry, so when compacting, the combined entry size may exceed the 128 KB limit (hfile.index.block.max.size).&lt;/P&gt;&lt;P&gt;2. If the size of (rowkey + columnfamily:qualifier) is greater than hfile.index.block.max.size, the memstore flush can enter an infinite loop while writing the hfile index.&lt;/P&gt;&lt;P&gt;…&lt;/P&gt;&lt;P&gt;That is why the compaction hangs, the .tmp folder of the affected regions on HDFS grows continuously, and the region server eventually goes down.&lt;/P&gt;&lt;P&gt;When HBase starts back up, it creates that huge file again in a .tmp directory in one of the subdirectories under the tsdb directory.&lt;/P&gt;
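&lt;P&gt;A quick sanity check on those numbers (our arithmetic, not part of the original report): 32,000 columns x 4 bytes per column = 128,000 bytes, which is right at the 128 KB default of hfile.index.block.max.size, so a single hot row can push an index block past the limit.&lt;/P&gt;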
&lt;P&gt;&lt;STRONG&gt;Solution -&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;1. Shut down HBase.&lt;/LI&gt;&lt;LI&gt;2. Found the .tmp directories HBase had filled under the OpenTSDB regions and deleted them completely.&lt;/LI&gt;&lt;LI&gt;3. If the parent directory contains a directory called recovered.edits, delete the recovered.edits directory or rename it to something like recovered.edits.bak.&lt;/LI&gt;&lt;LI&gt;4. Modified hbase-site.xml in Ambari and increased hfile.index.block.max.size to 1024 KB, from the default of 128 KB (see the snippet below).&lt;/LI&gt;&lt;LI&gt;5. Restarted HBase, followed by OpenTSDB.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;The cluster was immediately stable (no more runaway disk usage), and old data could be seen in OpenTSDB and viewed in Grafana without issue.&lt;/P&gt;&lt;P&gt;Data flow was turned back on and everything appears to be working normally again.&lt;/P&gt;
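&lt;P&gt;For reference, a sketch of how the step 4 change would look in hbase-site.xml (the property name and the 128 KB default come from the notes above; 1048576 is our byte conversion of 1024 KB):&lt;/P&gt;&lt;PRE&gt;&amp;lt;!-- raise the maximum hfile index block size from the 128 KB default --&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;hfile.index.block.max.size&amp;lt;/name&amp;gt;
  &amp;lt;value&amp;gt;1048576&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;&lt;/PRE&gt;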
&lt;P&gt;&lt;STRONG&gt;Further Solution Details -&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Here is what the OpenTSDB structure looked like before the solution was applied …&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;5.7 G  /apps/hbase/data/data/default/tsdb/08bfcc080d15d1127a0ebe664fdb1d80&lt;/P&gt;
&lt;P&gt;5.0 G  /apps/hbase/data/data/default/tsdb/0a55f4589b4e4bc9d7f71957a5795b4f&lt;/P&gt;
&lt;P&gt;7.9 G  /apps/hbase/data/data/default/tsdb/10066f9f83ac300e955ab9d0129ebf22&lt;/P&gt;
&lt;P&gt;3.6 G  /apps/hbase/data/data/default/tsdb/1cbf2fbf04b1e276b3de3615b95dc68f&lt;/P&gt;
&lt;P&gt;5.5 G  /apps/hbase/data/data/default/tsdb/1ed939038b8902a9c391d6d6d5a519f4&lt;/P&gt;
&lt;P&gt;9.2 G  /apps/hbase/data/data/default/tsdb/25b4dadf6621b09a63d2b1b9401203b9&lt;/P&gt;
&lt;P&gt;6.5 G  /apps/hbase/data/data/default/tsdb/2919b649b3b4a027ce8aece9a3e5ffd9&lt;/P&gt;
&lt;P&gt;967.2 G  /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099&lt;/P&gt;
&lt;P&gt;4.6 G  /apps/hbase/data/data/default/tsdb/39ab2524f8aaf2ee5685fc85a8dc1543&lt;/P&gt;
&lt;P&gt;1022.7 M  /apps/hbase/data/data/default/tsdb/39e3ce75e7225805534595c8c7e03305&lt;/P&gt;
&lt;P&gt;4.1 G  /apps/hbase/data/data/default/tsdb/4356d552eacfa526df24b400fd8007c7&lt;/P&gt;
&lt;P&gt;9.9 G  /apps/hbase/data/data/default/tsdb/4f2c1cc6d7c650c3f822136614921076&lt;/P&gt;
&lt;P&gt;5.9 G  /apps/hbase/data/data/default/tsdb/57eb8cbb2099bd6e1746cd4c8e007207&lt;/P&gt;
&lt;P&gt;6.8 G  /apps/hbase/data/data/default/tsdb/5e26da2eacba074a132edefca38017a3&lt;/P&gt;
&lt;P&gt;1.2 T  /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116&lt;/P&gt;
&lt;P&gt;4.1 G  /apps/hbase/data/data/default/tsdb/6ccc256f9721216bc9afa29e7d056bd4&lt;/P&gt;
&lt;P&gt;7.5 G  /apps/hbase/data/data/default/tsdb/6d43c524d221f3f54356e716c8f8849d&lt;/P&gt;
&lt;P&gt;6.1 G  /apps/hbase/data/data/default/tsdb/70f9f9ee045cde2823cf8ab485662a63&lt;/P&gt;
&lt;P&gt;3.2 G  /apps/hbase/data/data/default/tsdb/75f232ce81d5de2efb3f763d09d9c76f&lt;/P&gt;
&lt;P&gt;8.7 G  /apps/hbase/data/data/default/tsdb/7b4f9c05d64151d3f54558c70e5e9811&lt;/P&gt;
&lt;P&gt;5.5 G  /apps/hbase/data/data/default/tsdb/7fa9d913fd9a059733e9bb7a31b03e22&lt;/P&gt;
&lt;P&gt;8.5 G  /apps/hbase/data/data/default/tsdb/840f9f977262f1fbf9f4c06a8014c44b&lt;/P&gt;
&lt;P&gt;3.0 G  /apps/hbase/data/data/default/tsdb/9a3e7dbad294134eae934af980dd8c1c&lt;/P&gt;
&lt;P&gt;6.5 G  /apps/hbase/data/data/default/tsdb/a161383c9ae7df3c0cb15da093312908&lt;/P&gt;
&lt;P&gt;4.0 G  /apps/hbase/data/data/default/tsdb/b5e22e68e8f93ac50c9d9d3a3ca3f029&lt;/P&gt;
&lt;P&gt;7.1 G  /apps/hbase/data/data/default/tsdb/c9a9f105e8c4a44e8fd9172131b929d0&lt;/P&gt;
&lt;P&gt;4.7 G  /apps/hbase/data/data/default/tsdb/cc27e8c5a020291b7b3ac010dc50e25b&lt;/P&gt;
&lt;P&gt;5.6 G  /apps/hbase/data/data/default/tsdb/cc7f6645bac92ec514f8545fdb39b617&lt;/P&gt;
&lt;P&gt;9.3 G  /apps/hbase/data/data/default/tsdb/ce6426bfac53fb06fe8d320f3de150ee&lt;/P&gt;
&lt;P&gt;1.8 G  /apps/hbase/data/data/default/tsdb/d2ee226094556e6a90599e91bcba70f4&lt;/P&gt;
&lt;P&gt;6.8 G  /apps/hbase/data/data/default/tsdb/df6e88e27be5d3f0759d477812ab9277&lt;/P&gt;
&lt;P&gt;3.0 G  /apps/hbase/data/data/default/tsdb/efde9cca49c0f23a4e39e80e4040ac5a&lt;/P&gt;
&lt;P&gt;5.9 G  /apps/hbase/data/data/default/tsdb/f82790d90791b30aacd2bd990a1d4655&lt;/P&gt;
&lt;P&gt;7.5 G  /apps/hbase/data/data/default/tsdb/fcbb1b8f04f4e74a80882ef074244173&lt;/P&gt;
&lt;P&gt;4.8 G  /apps/hbase/data/data/default/tsdb/fece7565715c791028581022b70672e7&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Attacked the two largest offenders and found that all of the space was in the ./tmp folder, and we recovered all of the lost disk space.&lt;/LI&gt;&lt;/UL&gt;
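&lt;P&gt;(Not part of the original post: the largest regions can be spotted with a recursive HDFS usage listing, sorted numerically by byte count, largest last.)&lt;/P&gt;&lt;P&gt;hadoop fs -du /apps/hbase/data/data/default/tsdb | sort -n&lt;/P&gt;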
&lt;P&gt;hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/.tmp&lt;/P&gt;
&lt;P&gt;hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/69fc543c6f94891d3071005294c3c116/recovered.edits&lt;/P&gt;
&lt;P&gt;hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/.tmp&lt;/P&gt;
&lt;P&gt;hadoop fs -rm -R -skipTrash /apps/hbase/data/data/default/tsdb/2bb4cdaaf052abe6eef753470303f099/recovered.edits&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Then went into the Ambari configuration for HBase and added the hfile.index.block.max.size setting shown above, which increases the default from 128 KB to 1024 KB.&lt;/LI&gt;&lt;/UL&gt;&lt;UL&gt;&lt;LI&gt;Will keep monitoring and will report any further anomalies.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Fri, 18 Dec 2015 00:22:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99891#M12965</guid>
      <dc:creator>adaher</dc:creator>
      <dc:date>2015-12-18T00:22:09Z</dc:date>
    </item>
    <item>
      <title>Re: Why did the HDFS disk utilization skyrocket from a few hundred GB to over 7 TB in a matter of hours while ingesting data into HBase via OpenTSDB?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99892#M12966</link>
      <description>&lt;P&gt;This issue is tracked on &lt;A href="https://issues.apache.org/jira/browse/HBASE-16288" target="_blank"&gt;https://issues.apache.org/jira/browse/HBASE-16288&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2016 03:21:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-did-the-hdfs-disk-utilization-skyrocketed-from-a-few/m-p/99892#M12966</guid>
      <dc:creator>ssingla</dc:creator>
      <dc:date>2016-07-27T03:21:58Z</dc:date>
    </item>
  </channel>
</rss>

