<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Kudu web ui - cells read in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64183#M74003</link>
    <description>&lt;P&gt;I found that when multiple Spark tasks read the same tablet (partition), each task's reads are counted separately. The total cells read can therefore be much higher than the number of rows in the tablet: roughly the number of tasks multiplied by the number of rows.&lt;/P&gt;</description>
    <pubDate>Wed, 31 Jan 2018 14:31:06 GMT</pubDate>
    <dc:creator>Tomas79</dc:creator>
    <dc:date>2018-01-31T14:31:06Z</dc:date>
    <item>
      <title>Kudu web ui - cells read</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64111#M74001</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I thought that during a scan Kudu reports, in real time, the number of rows read for each column. At least on a small table, that number was roughly equal to the number of rows in the partition.&lt;/P&gt;&lt;P&gt;But now I am scanning a table with 1 billion (1 000 000 000) rows, split into multiple partitions, and the cells read counter shows 2.3 billion, 4.6 billion, etc.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 357px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/3819iADDE2A9A953B79F2/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Can somebody explain why those numbers are so high?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 12:48:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64111#M74001</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2022-09-16T12:48:07Z</dc:date>
    </item>
    <item>
      <title>Re: Kudu web ui - cells read</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64182#M74002</link>
      <description>&lt;P&gt;I found that when multiple Spark tasks read the same tablet (partition), each task's reads are counted separately. The total cells read can therefore be much higher than the number of rows in the tablet: roughly the number of tasks multiplied by the number of rows.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 14:30:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64182#M74002</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2018-01-31T14:30:49Z</dc:date>
    </item>
    <item>
      <title>Re: Kudu web ui - cells read</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64183#M74003</link>
      <description>&lt;P&gt;I found that when multiple Spark tasks read the same tablet (partition), each task's reads are counted separately. The total cells read can therefore be much higher than the number of rows in the tablet: roughly the number of tasks multiplied by the number of rows.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2018 14:31:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64183#M74003</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2018-01-31T14:31:06Z</dc:date>
    </item>
    <item>
      <title>Re: Kudu web ui - cells read</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64192#M74004</link>
      <description>It looks like your screenshot is of the "scans" dashboard on the web UI. This dashboard shows counters for a single scan, and a single scan comes from a single task; the counters are not aggregated across tasks.&lt;BR /&gt;&lt;BR /&gt;I am guessing you're hitting KUDU-2231, a performance bug that was recently fixed. The fix appears in CDH 5.14.0. Since this is a performance issue that is not a regression and does not affect correctness, we have not yet backported it to any prior releases.&lt;BR /&gt;&lt;BR /&gt;-Todd&lt;BR /&gt;</description>
      <pubDate>Wed, 31 Jan 2018 18:05:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Kudu-web-ui-cells-read/m-p/64192#M74004</guid>
      <dc:creator>Todd Lipcon</dc:creator>
      <dc:date>2018-01-31T18:05:17Z</dc:date>
    </item>
  </channel>
</rss>

