<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: impala forces full table scan in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/397855#M250001</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/102590"&gt;@mrblack&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To avoid a full table scan, you can try these tips:&lt;/P&gt;&lt;P&gt;1. Ensure proper partition pruning:&lt;/P&gt;&lt;P&gt;&lt;A href="https://impala.apache.org/docs/build/html/topics/impala_partitioning.html#:~:text=the%20impalad%20daemon.-,Partition%20Pruning%20for%20Queries,-Partition%20pruning%20refers" target="_blank"&gt;https://impala.apache.org/docs/build/html/topics/impala_partitioning.html#:~:text=the%20impalad%20daemon.-,Partition%20Pruning%20for%20Queries,-Partition%20pruning%20refers&lt;/A&gt;&lt;/P&gt;&lt;P&gt;2. Rewrite the query using subqueries.&lt;/P&gt;&lt;P&gt;3. Add explicit hints for join behaviour. Impala supports join hints such as BROADCAST and SHUFFLE that can influence query planning.&lt;/P&gt;&lt;P&gt;After optimising, check the EXPLAIN plan.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Chethan YM&lt;/P&gt;</description>
    <pubDate>Fri, 22 Nov 2024 06:40:16 GMT</pubDate>
    <dc:creator>ChethanYM</dc:creator>
    <dc:date>2024-11-22T06:40:16Z</dc:date>
    <item>
      <title>impala forces full table scan</title>
      <link>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/393080#M248331</link>
      <description>&lt;P&gt;My cluster runs CDH 5.14 with Impala 2.11 and Kudu 1.6.0.&lt;/P&gt;&lt;P&gt;I am running Impala on Kudu. When UNION ALL is used to combine three or more SELECT ... JOIN queries, the third and later SELECTs perform a full table scan. How can I avoid the full table scan?&lt;BR /&gt;A full table scan is not performed when each SELECT ... JOIN statement is executed on its own.&lt;/P&gt;&lt;P&gt;The SQL is as follows:&lt;/P&gt;&lt;P&gt;SELECT&lt;BR /&gt;mb.unitno,&lt;BR /&gt;mb.materialcode,&lt;BR /&gt;mb.starttime,&lt;BR /&gt;mb.length,&lt;BR /&gt;ib.equipmentcode,&lt;BR /&gt;ib.defectclass,&lt;BR /&gt;ib.defectname,&lt;BR /&gt;ib.side&lt;BR /&gt;FROM&lt;BR /&gt;sc1.tba1 mb&lt;BR /&gt;LEFT OUTER JOIN&lt;BR /&gt;sc1.tbb1 ib&lt;BR /&gt;ON&lt;BR /&gt;mb.materialcode = ib.materialcode&lt;BR /&gt;WHERE mb.starttime &amp;gt; '20240827' AND mb.starttime &amp;lt; '20240828'&lt;BR /&gt;UNION ALL&lt;BR /&gt;SELECT&lt;BR /&gt;md.unitno,&lt;BR /&gt;md.materialcode,&lt;BR /&gt;md.starttime,&lt;BR /&gt;md.length,&lt;BR /&gt;id.equipmentcode,&lt;BR /&gt;id.defectclass,&lt;BR /&gt;id.defectname,&lt;BR /&gt;id.side&lt;BR /&gt;FROM&lt;BR /&gt;sc1.tba2 md&lt;BR /&gt;LEFT OUTER JOIN&lt;BR /&gt;sc1.tbb2 id&lt;BR /&gt;ON&lt;BR /&gt;md.materialcode = id.materialcode&lt;BR /&gt;WHERE md.starttime &amp;gt; '20240827' AND md.starttime &amp;lt; '20240828'&lt;BR /&gt;UNION ALL&lt;BR /&gt;SELECT&lt;BR /&gt;me.unitno,&lt;BR /&gt;me.materialcode,&lt;BR /&gt;me.starttime,&lt;BR /&gt;me.length,&lt;BR /&gt;ie.equipmentcode,&lt;BR /&gt;ie.defectclass,&lt;BR /&gt;ie.defectname,&lt;BR /&gt;ie.side&lt;BR /&gt;FROM&lt;BR /&gt;sc1.tba3 me&lt;BR /&gt;LEFT OUTER JOIN&lt;BR /&gt;sc1.tbb3 ie&lt;BR /&gt;ON&lt;BR /&gt;me.materialcode = ie.materialcode&lt;BR /&gt;WHERE me.starttime &amp;gt; '20240827' AND me.starttime &amp;lt; '20240828';&lt;/P&gt;&lt;P&gt;The tables tbb1, tbb2, and tbb3 have 800 million rows, and tba1, tba2, and tba3 have 10,000 rows. When the above SQL is executed, a full table scan is performed on sc1.tbb3.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Sep 2024 01:23:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/393080#M248331</guid>
      <dc:creator>mrblack</dc:creator>
      <dc:date>2024-09-05T01:23:28Z</dc:date>
    </item>
    <item>
      <title>Re: impala forces full table scan</title>
      <link>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/395247#M248903</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/102590"&gt;@mrblack&lt;/a&gt;&amp;nbsp;, how do you know that Impala performs a full table scan?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2024 15:33:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/395247#M248903</guid>
      <dc:creator>zegab</dc:creator>
      <dc:date>2024-10-15T15:33:57Z</dc:date>
    </item>
    <item>
      <title>Re: impala forces full table scan</title>
      <link>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/397855#M250001</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/102590"&gt;@mrblack&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To avoid a full table scan, you can try these tips:&lt;/P&gt;&lt;P&gt;1. Ensure proper partition pruning:&lt;/P&gt;&lt;P&gt;&lt;A href="https://impala.apache.org/docs/build/html/topics/impala_partitioning.html#:~:text=the%20impalad%20daemon.-,Partition%20Pruning%20for%20Queries,-Partition%20pruning%20refers" target="_blank"&gt;https://impala.apache.org/docs/build/html/topics/impala_partitioning.html#:~:text=the%20impalad%20daemon.-,Partition%20Pruning%20for%20Queries,-Partition%20pruning%20refers&lt;/A&gt;&lt;/P&gt;&lt;P&gt;2. Rewrite the query using subqueries.&lt;/P&gt;&lt;P&gt;3. Add explicit hints for join behaviour. Impala supports join hints such as BROADCAST and SHUFFLE that can influence query planning.&lt;/P&gt;&lt;P&gt;After optimising, check the EXPLAIN plan.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Chethan YM&lt;/P&gt;</description>
      <pubDate>Fri, 22 Nov 2024 06:40:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/impala-forces-full-table-scan/m-p/397855#M250001</guid>
      <dc:creator>ChethanYM</dc:creator>
      <dc:date>2024-11-22T06:40:16Z</dc:date>
    </item>
  </channel>
</rss>

