<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: kudu is slower than parquet? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/kudu-is-slower-than-parquet/m-p/56604#M14064</link>
    <description>&lt;P&gt;Make sure you run COMPUTE STATS on each Kudu table after loading the data so that Impala has the statistics it needs to plan the joins.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What is the total size of your data set?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am surprised at the difference in your numbers; I think they should be closer if both setups are tuned correctly. Regardless, if you don't need online inserts and updates, Kudu won't buy you much over the raw scan speed of an immutable on-disk format such as Parquet on HDFS queried through Impala.&lt;/P&gt;</description>
    <pubDate>Tue, 27 Jun 2017 22:06:31 GMT</pubDate>
    <dc:creator>mpercy</dc:creator>
    <dc:date>2017-06-27T22:06:31Z</dc:date>
  </channel>
</rss>

