<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hbase utilizing more storage space while loading data from oracle using sqoop in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/31241#M6828</link>
    <description>&lt;P&gt;Thanks. Block encoding and compression together helped reduce storage utilization.&lt;/P&gt;</description>
    <pubDate>Wed, 26 Aug 2015 07:05:28 GMT</pubDate>
    <dc:creator>Ashokevarma</dc:creator>
    <dc:date>2015-08-26T07:05:28Z</dc:date>
    <item>
      <title>Hbase utilizing more storage space while loading data from oracle using sqoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/30203#M6826</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I need your expertise to understand how HBase uses storage space.&lt;/P&gt;&lt;P&gt;I am trying to load data from an Oracle table directly into HBase using Sqoop with the command below.&lt;/P&gt;&lt;P&gt;Source table size: 20 GB.&lt;/P&gt;&lt;P&gt;sqoop import --connect 'jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=)(port=))(connect_data=(sid=ORA10G)))' --username --password --query "SELECT /*+ parallel(a,8) full(a) */ * FROM TEST a WHERE \$CONDITIONS" -m 10 --hbase-create-table --hbase-table TEST_HB --column-family cf1 --hbase-row-key IDNUM --hive-drop-import-delims --split-by PARTITION_ID&lt;/P&gt;&lt;P&gt;I am facing the two issues below.&lt;/P&gt;&lt;P&gt;1) The job starts 10 mappers, but only 3 are in Running status and the rest remain Scheduled, so effectively only 3 run at a time. Is there a parameter that limits the number of mappers while loading into HBase?&lt;/P&gt;&lt;P&gt;2) HDFS usage grows to more than 180 GB for 20 GB of Oracle data. It should be no more than 60 GB (assuming a replication factor of 3). When I checked the physical block files in HDFS for this data, every row is stored together with its column names. How can I avoid this column-name overhead, or am I missing something in the Sqoop command above?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:36:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/30203#M6826</guid>
      <dc:creator>Ashokevarma</dc:creator>
      <dc:date>2022-09-16T09:36:20Z</dc:date>
    </item>
    <item>
      <title>Re: Hbase utilizing more storage space while loading data from oracle using sqoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/31240#M6827</link>
      <description>One option is the FAST_DIFF data block encoding, which may reduce the serialisation cost in HBase: &lt;A href="http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#data.block.encoding.enable" target="_blank"&gt;http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#data.block.encoding.enable&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Also consider compressing your table. It will save a lot of space, provided you also use a suitable HFile data block size (not the same as the HDFS block size).</description>
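      <!--
To make the storage overhead discussed in this thread concrete, here is a rough sketch of how much space a single unencoded, uncompressed HBase cell takes. It assumes the basic KeyValue layout without tags: every cell repeats the row key, column family and column qualifier, plus about 20 bytes of fixed fields. The sample row, its values and the column names CUSTOMER_NAME and STATUS are hypothetical; only cf1, IDNUM and PARTITION_ID come from the question.

```python
# Fixed per-cell fields in the basic KeyValue layout (assumed, no tags):
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1)
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def keyvalue_size(row_key, family, qualifier, value):
    """Approximate on-disk size in bytes of one unencoded cell."""
    return FIXED_OVERHEAD + len(row_key) + len(family) + len(qualifier) + len(value)


def row_size(row_key, family, columns):
    """Total size of a row: one cell per column, each repeating the key parts."""
    return sum(keyvalue_size(row_key, family, q, v) for q, v in columns.items())


# Hypothetical row keyed by an IDNUM value, stored in family cf1.
row_key = b"1000001"
family = b"cf1"
columns = {
    b"PARTITION_ID": b"12",
    b"CUSTOMER_NAME": b"ACME",  # hypothetical column
    b"STATUS": b"A",            # hypothetical column
}

raw_bytes = len(row_key) + sum(len(v) for v in columns.values())
stored_bytes = row_size(row_key, family, columns)
print(f"raw payload: {raw_bytes} bytes, stored as cells: {stored_bytes} bytes")
```

With short values and long column names, most of each cell is repeated key material, which is the kind of redundancy that FAST_DIFF and compression are meant to reduce. Actual savings depend on the data and should be measured on the real table.
      -->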
      <pubDate>Wed, 26 Aug 2015 06:50:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/31240#M6827</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2015-08-26T06:50:03Z</dc:date>
    </item>
    <item>
      <title>Re: Hbase utilizing more storage space while loading data from oracle using sqoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/31241#M6828</link>
      <description>&lt;P&gt;Thanks. Block encoding and compression together helped reduce storage utilization.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Aug 2015 07:05:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hbase-utilizing-more-storage-space-while-loading-data-from/m-p/31241#M6828</guid>
      <dc:creator>Ashokevarma</dc:creator>
      <dc:date>2015-08-26T07:05:28Z</dc:date>
    </item>
  </channel>
</rss>

