<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: HDFS Compression vs Performance in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109733#M16165</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1390/michelsumbul.html" nodeid="1390"&gt;@Michel Sumbul&lt;/A&gt; Depending on the compression type, yes.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;If I understand the slides correctly, I should expect CPU usage to rise by between 5% and 60%, depending on the compression algorithm. That can be really significant!&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;HBase - &lt;/I&gt;&lt;A target="_blank" href="https://www.linkedin.com/pulse/importance-compression-hbase-performance-tuning-part-deshpande"&gt;Link&lt;/A&gt; (unofficial)&lt;/P&gt;&lt;P&gt;HBase official guide - Production systems should use compression with their ColumnFamily definitions. See &lt;A href="http://hbase.apache.org/0.94/book/compression.html"&gt;Appendix C, &lt;EM&gt;Compression In HBase&lt;/EM&gt;&lt;/A&gt; for more information.&lt;/P&gt;</description>
    <pubDate>Mon, 25 Jan 2016 07:07:52 GMT</pubDate>
    <dc:creator>nsabharwal</dc:creator>
    <dc:date>2016-01-25T07:07:52Z</dc:date>
    <item>
      <title>HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109729#M16161</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;How can I get a good idea of the performance impact of using compression in HDFS?&lt;/P&gt;&lt;P&gt;This question is important because, if I implement compression, should I plan for 10%, 20%, or 30% more CPU to keep the same performance?&lt;/P&gt;&lt;P&gt;I know that I can gain performance because fewer IOPS will be needed, but what about CPU?&lt;/P&gt;&lt;P&gt;I would also like to ask: what will be the impact on HBase performance?&lt;/P&gt;&lt;P&gt;Many thanks in advance!&lt;/P&gt;</description>
      <pubDate>Sun, 24 Jan 2016 22:31:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109729#M16161</guid>
      <dc:creator>michelsumbul</dc:creator>
      <dc:date>2016-01-24T22:31:12Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109730#M16162</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1390/michelsumbul.html" nodeid="1390"&gt;@Michel Sumbul&lt;/A&gt; Very good question. Please see this to start with: &lt;A target="_blank" href="http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2"&gt;http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Those slides are from 3 years ago. See also: &lt;A href="http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/" target="_blank"&gt;http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 24 Jan 2016 22:50:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109730#M16162</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-01-24T22:50:09Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109731#M16163</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1390/michelsumbul.html" nodeid="1390"&gt;@Michel Sumbul&lt;/A&gt; In terms of HBase, your mileage may vary; it depends on your workloads. In some use cases I've seen much better performance with compression on, and in others not. There are also multiple levels of compression in HBase (per column family; you can compress row keys only, or both).&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2016 01:16:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109731#M16163</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-25T01:16:42Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109732#M16164</link>
      <description>&lt;P&gt;Thanks guys for the fast reply!&lt;/P&gt;&lt;P&gt;@nsabharwal: If I understand the slides correctly, I should expect CPU usage to rise by between 5% and 60%, depending on the compression algorithm. That can be really significant!&lt;/P&gt;&lt;P&gt;@aervits: do you have any benchmarks or test results to give an idea?&lt;/P&gt;&lt;P&gt;Many thanks guys!&lt;/P&gt;&lt;P&gt;Michel&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2016 03:39:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109732#M16164</guid>
      <dc:creator>michelsumbul</dc:creator>
      <dc:date>2016-01-25T03:39:03Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109733#M16165</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1390/michelsumbul.html" nodeid="1390"&gt;@Michel Sumbul&lt;/A&gt; Depending on the compression type, yes.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;If I understand the slides correctly, I should expect CPU usage to rise by between 5% and 60%, depending on the compression algorithm. That can be really significant!&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;I&gt;HBase - &lt;/I&gt;&lt;A target="_blank" href="https://www.linkedin.com/pulse/importance-compression-hbase-performance-tuning-part-deshpande"&gt;Link&lt;/A&gt; (unofficial)&lt;/P&gt;&lt;P&gt;HBase official guide - Production systems should use compression with their ColumnFamily definitions. See &lt;A href="http://hbase.apache.org/0.94/book/compression.html"&gt;Appendix C, &lt;EM&gt;Compression In HBase&lt;/EM&gt;&lt;/A&gt; for more information.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2016 07:07:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109733#M16165</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-01-25T07:07:52Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Compression vs Performance</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109734#M16166</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1390/michelsumbul.html" nodeid="1390"&gt;@Michel Sumbul&lt;/A&gt; I was able to reach 840k reads/sec on AWS (CentOS 7, XFS filesystem, 9 nodes, 12 7200 RPM drives) in non-MapReduce mode. A write-only test on the same hardware resulted in 185k writes/sec. For a mixed workload, I got 148k writes/sec and 270k reads/sec. This is with Snappy compression on.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2016 21:41:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Compression-vs-Performance/m-p/109734#M16166</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-25T21:41:30Z</dc:date>
    </item>
  </channel>
</rss>

