<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Lz0 is enabled now what? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143824#M106410</link>
    <description>&lt;P&gt;What I see is:&lt;/P&gt;&lt;PRE&gt;/usr/hdp/2.2.6.0-2800/knox/samples/hadoop-examples.jar
/usr/hdp/2.4.0.0-169/knox/samples/hadoop-examples.jar
/usr/hdp/2.4.2.0-258/knox/samples/hadoop-examples.jar
/usr/lib/hue/apps/jobsub/data/examples/hadoop-examples.jar
/usr/lib/hue/apps/oozie/examples/lib/hadoop-examples.jar&lt;/PRE&gt;</description>
    <pubDate>Thu, 29 Sep 2016 22:44:33 GMT</pubDate>
    <dc:creator>Eric_Periard</dc:creator>
    <dc:date>2016-09-29T22:44:33Z</dc:date>
    <item>
      <title>Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143818#M106404</link>
      <description>&lt;P&gt;So I've enabled LZO compression as per the Hortonworks guide. I've got 120 TB of storage capacity so far and a de facto replication factor of 3. My data usage is at 75%, and my manager is starting to wonder whether LZO can be used to compress the file system "a la Windows", where the file system is compressed but the data is accessible "as per usual" through the dfs path?&lt;/P&gt;&lt;P&gt;Any hint would be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Sep 2016 02:52:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143818#M106404</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-09T02:52:46Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143819#M106405</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/10999/ericperiard.html" nodeid="10999"&gt;@Eric Periard&lt;/A&gt;&lt;P&gt;I think what I am understanding from your question is your manager wants file blocks compressed at a lower level than HDFS (like at linux level). Is that right? If not, please elaborate your question.&lt;/P&gt;&lt;P&gt;When you enable compression for Hadoop using Lzo, you are compressing files going into HDFS. Remember HDFS splits the files into its blocks and places blocks on different nodes (after all, it's a distributed file system). LZO is one of the compression mechanisms that allows for compressed blocks for files that have been split on different machines. It provides a good balance between read/write speed and compression ratio. &lt;/P&gt;&lt;P&gt;You would have to compress all your files either upon ingestion or later on. At Hadoop level, to enable compression for the output being written by your MapReduce jobs, see the following link.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_hdfs_admin_tools/content/ch04.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_hdfs_admin_tools/content/ch04.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Sep 2016 03:25:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143819#M106405</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2016-09-09T03:25:26Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143820#M106406</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/10999/ericperiard.html"&gt;Eric Periard&lt;/A&gt;&lt;/P&gt;&lt;P&gt;HDFS does not need an ext4, ext3 or xfs file system to function. It can sit on top of raw JBOD disks. If that is the case, there is no opportunity for further compression. If in your case it sits on top of a file system, that is questionable as a best practice. What is your situation?&lt;/P&gt;&lt;P&gt;Anyhow, there are other things you can do to maximize your storage even further, e.g. the ORC format.&lt;/P&gt;&lt;P&gt;Keep in mind that super-compression requires more and more cores available for processing. Storage is usually cheaper, and super-compression can also bring performance problems, CPU bottlenecks, etc. All in moderation.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Sep 2016 06:32:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143820#M106406</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-09-09T06:32:33Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143821#M106407</link>
      <description>&lt;P&gt;Agree with &lt;A rel="user" href="https://community.cloudera.com/users/10969/mqureshi.html" nodeid="10969"&gt;@mqureshi&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486"&gt;@Constantin Stanca&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Would like to add the theme that compression is a strategy and usually not a universal yes or no, or this codec or that. Important questions to ask for your data are: Will it be processed frequently, rarely or never (cold storage)? How critical is performance when it is processed? Which leads to: Which file format/compression codec, if any, for each dataset?&lt;/P&gt;&lt;P&gt;The following are good references for compression and file format strategies (takes some thinking and evaluating):&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;A href="http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2"&gt;http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;A href="http://comphadoop.weebly.com/"&gt;http://comphadoop.weebly.com/&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;A href="http://www.dummies.com/programming/big-data/hadoop/hadoop-for-dummies/"&gt;http://www.dummies.com/programming/big-data/hadoop/hadoop-for-dummies/&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;After formulating a strategy, think about dividing your HDFS file paths into zones in accordance with your strategy.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Sep 2016 01:46:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143821#M106407</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-09-13T01:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143822#M106408</link>
      <description>&lt;P&gt;Basically, I'm looking for "block level" compression of pre-existing data.&lt;/P&gt;&lt;P&gt;I went through all the settings and LZO is now enabled; I'm just not sure how to compress existing data.&lt;/P&gt;&lt;P&gt;Mind you, I'm a SysOps and not a DevOps, so dealing with programming languages is not my forte.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:39:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143822#M106408</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:39:53Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143823#M106409</link>
      <description>&lt;PRE&gt;hadoop-examples-1.1.0-SNAPSHOT.jar&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;I don't seem to have the above file at all on my nn, snn, or other masters?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Option I:&lt;/STRONG&gt; To use &lt;CODE&gt;GzipCodec&lt;/CODE&gt; with a one-time only job:&lt;/P&gt;&lt;PRE&gt;hadoop jar hadoop-examples-1.1.0-SNAPSHOT.jar sort \
  "-Dmapred.compress.map.output=true" \
  "-Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" \
  "-Dmapred.output.compress=true" \
  "-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" \
  -outKey org.apache.hadoop.io.Text \
  -outValue org.apache.hadoop.io.Text input output&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:41:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143823#M106409</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:41:38Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143824#M106410</link>
      <description>&lt;P&gt;What I see is:&lt;/P&gt;&lt;PRE&gt;/usr/hdp/2.2.6.0-2800/knox/samples/hadoop-examples.jar
/usr/hdp/2.4.0.0-169/knox/samples/hadoop-examples.jar
/usr/hdp/2.4.2.0-258/knox/samples/hadoop-examples.jar
/usr/lib/hue/apps/jobsub/data/examples/hadoop-examples.jar
/usr/lib/hue/apps/oozie/examples/lib/hadoop-examples.jar&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:44:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143824#M106410</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:44:33Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143825#M106411</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/55595/lz0-is-enabled-now-what.html#"&gt;@Eric Periard&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You cannot compress pre-existing data simply by enabling compression. It is my understanding that you cannot compress existing data in place. The way to do this is to compress the existing data, which will create new compressed files, and then delete the original uncompressed files.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:45:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143825#M106411</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2016-09-29T22:45:31Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143826#M106412</link>
      <description>&lt;P&gt;&lt;STRONG&gt;So I tried as root:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;[root@nn samples]# hadoop jar hadoop-examples.jar sort sbr"-Dmapred.compress.map.output=true" sbr"-Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec"sbr "-Dmapred.output.compress=true" sbr"-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec"sbr -outKey org.apache.hadoop.io.Textsbr -outValue org.apache.hadoop.io.Text input output &lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;WARNING: Use "yarn jar" to launch YARN applications.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Then I tried with yarn running as root:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Exception in thread "main" java.lang.ClassNotFoundException: sort
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Then I sudo su - yarn&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt; [yarn@nn ~]$ yarn jar hadoop-examples.jar sort sbr"-Dmapred.compress.map.output=true" sbr"-Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec"sbr "-Dmapred.output.compress=true" sbr"-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec"sbr -outKey org.apache.hadoop.io.Textsbr -outValue org.apache.hadoop.io.Text input output &lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Not a valid JAR: /home/yarn/hadoop-examples.jar&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;So far manually trying to run that job is a no-go &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:49:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143826#M106412</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:49:17Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143827#M106413</link>
      <description>&lt;P&gt;I went through the tutorial above for HDP 2.4.2 without success...&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="8070-compression-failed.png" style="width: 878px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21616iE7AE1FB54E504818/image-size/medium?v=v2&amp;amp;px=400" role="button" title="8070-compression-failed.png" alt="8070-compression-failed.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Aug 2019 08:01:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143827#M106413</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2019-08-19T08:01:41Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143828#M106414</link>
      <description>&lt;P&gt;Yeah I've been trying to run the JAR file above... which is essentially running it on pre-existing data but it's failing miserably &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:54:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143828#M106414</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:54:51Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143829#M106415</link>
      <description>&lt;P&gt;I changed to the right directory location, still the same error though &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 22:58:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143829#M106415</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T22:58:40Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143830#M106416</link>
      <description>&lt;P&gt;Is there no actual step-by-step guide on how to do that? I've been searching for weeks, nothing concrete so far.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 23:52:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143830#M106416</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-29T23:52:11Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143831#M106417</link>
      <description>&lt;P&gt;So I got the job to "start"&lt;/P&gt;&lt;P&gt;[ INFO ] [main] Task Id : attempt_1475173438027_0002_m_000000_0, Status : FAILED
Error: Java heap space
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
[ INFO ] [main] Task Id : attempt_1475173438027_0002_m_000000_1, Status : FAILED
Error: Java heap space
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
[ INFO ] [main] Task Id : attempt_1475173438027_0002_m_000000_2, Status : FAILED
Error: Java heap space
[ INFO ] [main]  map 100% reduce 100%
[ INFO ] [main] Job job_1475173438027_0002 failed with state FAILED due to: Task failed task_1475173438027_0002_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0&lt;/P&gt;</description>
      <pubDate>Fri, 30 Sep 2016 01:28:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143831#M106417</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-30T01:28:42Z</dc:date>
    </item>
    <item>
      <title>Re: Lz0 is enabled now what?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143832#M106418</link>
      <description>&lt;P&gt;Changed the YARN Java heap size from 1 GB to 4 GB... it still dies?&lt;/P&gt;</description>
      <pubDate>Fri, 30 Sep 2016 01:29:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Lz0-is-enabled-now-what/m-p/143832#M106418</guid>
      <dc:creator>Eric_Periard</dc:creator>
      <dc:date>2016-09-30T01:29:25Z</dc:date>
    </item>
  </channel>
</rss>

