<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: how to compress the hdfs data using zlib compression ?? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171426#M133723</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/1852/subhashparise36.html" nodeid="1852"&gt;@subhash parise&lt;/A&gt;&lt;P&gt;Here is the link for more information:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/content/kbentry/49252/performance-comparison-bw-orc-snappy-and-zlib-in-h.html" target="_blank"&gt;https://community.hortonworks.com/content/kbentry/49252/performance-comparison-bw-orc-snappy-and-zlib-in-h.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 10 Aug 2016 21:13:19 GMT</pubDate>
    <dc:creator>divakarreddy_a</dc:creator>
    <dc:date>2016-08-10T21:13:19Z</dc:date>
    <item>
      <title>how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171422#M133719</link>
      <description>&lt;P&gt;Hi Community team,&lt;/P&gt;&lt;P&gt;Can anyone help me enable zlib compression in HDP 2.4.2?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 16:47:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171422#M133719</guid>
      <dc:creator>subhash_parise3</dc:creator>
      <dc:date>2016-08-10T16:47:31Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171423#M133720</link>
      <description>&lt;P&gt;Take a look at this article; it covers ways of setting compression in Hive, including zlib. &lt;A href="http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/" target="_blank"&gt;http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It will help if you specify which product you're trying to enable zlib for. Since you categorized the question under data ingestion, I will assume it's for Sqoop. Here's an example of how to use Sqoop with compression; just replace the Snappy codec class with the zlib one: &lt;A href="https://community.hortonworks.com/questions/29648/sqoop-import-to-hive-with-compression.html" target="_blank"&gt;https://community.hortonworks.com/questions/29648/sqoop-import-to-hive-with-compression.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 19:12:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171423#M133720</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-08-10T19:12:46Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171424#M133721</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt;&lt;P&gt;Thank you for replying to my question. I am looking for zlib compression at the HDFS level.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 20:59:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171424#M133721</guid>
      <dc:creator>subhash_parise3</dc:creator>
      <dc:date>2016-08-10T20:59:39Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171425#M133722</link>
      <description>&lt;P&gt;
	&lt;A rel="user" href="https://community.cloudera.com/users/1852/subhashparise36.html" nodeid="1852"&gt;@subhash parise&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As &lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt; shared, you get compression when storing your data in ORC format.  However, if you want to store "raw" data on HDFS and selectively compress it, you can use a simple Pig script: load the data from HDFS and then write it out again with compression enabled.&lt;/P&gt;&lt;PRE&gt;set output.compression.enabled true;
set output.compression.codec org.apache.hadoop.io.compress.BZip2Codec;

inputFiles = LOAD '/input/directory/uncompressed' using PigStorage();
STORE inputFiles INTO '/output/directory/compressed/' USING PigStorage();&lt;/PRE&gt;&lt;P&gt;You can either leave the uncompressed data or remove it, depending on what you are doing.  This is an approach that I've used.&lt;/P&gt;&lt;P&gt;You can use different codecs depending on your needs:&lt;/P&gt;&lt;PRE&gt;set output.compression.codec com.hadoop.compression.lzo.LzopCodec;
set output.compression.codec org.apache.hadoop.io.compress.GzipCodec;
set output.compression.codec org.apache.hadoop.io.compress.BZip2Codec;
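-- Alternative for zlib (DEFLATE), using the DefaultCodec class named later in this thread.
-- Output files written with this codec get a .deflate extension.
set output.compression.codec org.apache.hadoop.io.compress.DefaultCodec;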
&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Aug 2016 21:11:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171425#M133722</guid>
      <dc:creator>myoung</dc:creator>
      <dc:date>2016-08-10T21:11:46Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171426#M133723</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1852/subhashparise36.html" nodeid="1852"&gt;@subhash parise&lt;/A&gt;&lt;P&gt;Here is the link for more information:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/content/kbentry/49252/performance-comparison-bw-orc-snappy-and-zlib-in-h.html" target="_blank"&gt;https://community.hortonworks.com/content/kbentry/49252/performance-comparison-bw-orc-snappy-and-zlib-in-h.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 21:13:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171426#M133723</guid>
      <dc:creator>divakarreddy_a</dc:creator>
      <dc:date>2016-08-10T21:13:19Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171427#M133724</link>
      <description>&lt;P&gt;Sample hive script:&lt;/P&gt;&lt;PRE&gt;CREATE EXTERNAL TABLE test.temp3
(
  cat_0 bigint,
  cat_1 bigint,
  cat_2 bigint,
  cat_3 bigint,
  cat_4 bigint,
  cat_5 bigint,
  cat_6 bigint,
  cat_7 bigint,
  cat_8 bigint,
  cat_9 bigint
)
row format delimited fields terminated by ','
stored as ORC
location '/test/'
tblproperties ("orc.compress"="ZLIB");&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Aug 2016 21:19:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171427#M133724</guid>
      <dc:creator>divakarreddy_a</dc:creator>
      <dc:date>2016-08-10T21:19:46Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171428#M133725</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2695/myoung.html" nodeid="2695"&gt;@Michael Young&lt;/A&gt;: Could you please give me the syntax to set the compression codec to zlib?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 22:05:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171428#M133725</guid>
      <dc:creator>subhash_parise3</dc:creator>
      <dc:date>2016-08-10T22:05:18Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171429#M133726</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2348/divakarreddya.html" nodeid="2348"&gt;@Divakar Annapureddy&lt;/A&gt;: Thank you for replying to my question. My case is a bit different: I need the zlib codec for HDFS data (Hadoop files).&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 22:07:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171429#M133726</guid>
      <dc:creator>subhash_parise3</dc:creator>
      <dc:date>2016-08-10T22:07:27Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171430#M133727</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1852/subhashparise36.html" nodeid="1852"&gt;@subhash parise&lt;/A&gt; &lt;/P&gt;&lt;P&gt;The default codec, DefaultCodec, uses zlib (DEFLATE).  If you want to set it explicitly, use the following:&lt;/P&gt;&lt;PRE&gt;set output.compression.codec org.apache.hadoop.io.compress.DefaultCodec;&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Aug 2016 22:38:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171430#M133727</guid>
      <dc:creator>myoung</dc:creator>
      <dc:date>2016-08-10T22:38:53Z</dc:date>
    </item>
    <item>
      <title>Re: how to compress the hdfs data using zlib compression ??</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171431#M133728</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1852/subhashparise36.html" nodeid="1852"&gt;@subhash parise&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I just posted an article demonstrating a very simple Pig + Hive example showing HDFS compression.&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://community.hortonworks.com/content/kbentry/50921/using-pig-to-convert-uncompressed-data-to-compress.html"&gt;https://community.hortonworks.com/content/kbentry/50921/using-pig-to-convert-uncompressed-data-to-compress.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Aug 2016 21:06:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-compress-the-hdfs-data-using-zlib-compression/m-p/171431#M133728</guid>
      <dc:creator>myoung</dc:creator>
      <dc:date>2016-08-11T21:06:21Z</dc:date>
    </item>
  </channel>
</rss>

