<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Very slow CodeGen taking 80% of runtime - Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67871#M79133</link>
    <description>&lt;P&gt;An Impala query spends ~80% of its runtime (5.25s of ~6s) in CodeGen; the thread discusses DISABLE_CODEGEN and the then-undocumented DISABLE_CODEGEN_ROWS_THRESHOLD query option as remedies.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 13:17:52 GMT</pubDate>
    <dc:creator>mauricio</dc:creator>
    <dc:date>2022-09-16T13:17:52Z</dc:date>
    <item>
      <title>Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67871#M79133</link>
      <description>&lt;P&gt;We recently enabled HDFS caching for two tables to try to speed up a whole class of very similar queries, generally following this pattern:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT x,y,z FROM (&lt;/P&gt;&lt;P&gt;SELECT x,y,z FROM table1 WHERE blah&lt;/P&gt;&lt;P&gt;UNION ALL&lt;/P&gt;&lt;P&gt;SELECT x,y,z FROM table2 WHERE blah&lt;/P&gt;&lt;P&gt;) x&lt;BR /&gt;ORDER BY x DESC, y DESC&lt;/P&gt;&lt;P&gt;LIMIT 20001 OFFSET 0&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;... but we didn't get much runtime improvement. Digging in, it looks like 80% of the time is spent on CodeGen: 5.25s, of which CompileTime is 1.67s and OptimizationTime is 3.51s (see the profile fragment below for this sample run).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;With SET DISABLE_CODEGEN=true the query goes from ~6 seconds to ~1 second, but the docs state this should not be used generally, so I'm hesitant to add it to actual live production reports and would rather understand the root cause.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Both tables are Parquet and fully HDFS-cached. 
Both are wide-ish (253 and 126 columns respectively), but the inner queries project only 20 columns to the outer.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;CDH 5.13 / Impala 2.10.&amp;nbsp; Happy to send the full profile file by direct mail.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;-mauricio&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;78:MERGING-EXCHANGE 1 5s307ms 5s307ms 73 101 0 0 UNPARTITIONED&lt;BR /&gt;49:TOP-N 30 341.689us 880.634us 73 101 873.00 KB 39.28 KB&lt;BR /&gt;00:UNION 30 240.707us 3.190ms 73 1.61K 8.81 MB 0&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;F35:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1&lt;BR /&gt;| Per-Host Resources: mem-estimate=0B mem-reservation=0B&lt;BR /&gt;PLAN-ROOT SINK&lt;BR /&gt;| mem-estimate=0B mem-reservation=0B&lt;BR /&gt;|&lt;BR /&gt;78:MERGING-EXCHANGE [UNPARTITIONED]&lt;BR /&gt;| order by: action_date DESC, action_id ASC&lt;BR /&gt;| limit: 101&lt;BR /&gt;| mem-estimate=0B mem-reservation=0B&lt;BR /&gt;| tuple-ids=47 row-size=398B cardinality=101&lt;BR /&gt;|&lt;BR /&gt;F34:PLAN FRAGMENT [RANDOM] hosts=18 instances=18&lt;BR /&gt;Per-Host Resources: mem-estimate=206.48MB mem-reservation=14.44MB&lt;BR /&gt;49:TOP-N [LIMIT=101]&lt;BR /&gt;| order by: action_date DESC, action_id ASC&lt;BR /&gt;| mem-estimate=39.28KB mem-reservation=0B&lt;BR /&gt;| tuple-ids=47 row-size=398B cardinality=101&lt;BR /&gt;|&lt;BR /&gt;00:UNION&lt;/P&gt;&lt;P&gt;&amp;nbsp;...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt;&amp;gt; F34 Fragment for a sample node (all very similar):&lt;BR /&gt;Hdfs split stats (&amp;lt;volume id&amp;gt;:&amp;lt;# splits&amp;gt;/&amp;lt;split lengths&amp;gt;): 8:1/38.32 MB&lt;BR /&gt;Filter 4 arrival: 5s339ms&lt;BR /&gt;AverageThreadTokens: 1.00&lt;BR /&gt;BloomFilterBytes: 3.0 MiB&lt;BR /&gt;InactiveTotalTime: 0ns&lt;BR /&gt;PeakMemoryUsage: 27.4 MiB&lt;BR 
/&gt;PeakReservation: 14.4 MiB&lt;BR /&gt;PeakUsedReservation: 0 B&lt;BR /&gt;PerHostPeakMemUsage: 58.6 MiB&lt;BR /&gt;RowsProduced: 1&lt;BR /&gt;TotalNetworkReceiveTime: 261.74us&lt;BR /&gt;TotalNetworkSendTime: 313.68us&lt;BR /&gt;TotalStorageWaitTime: 4.96us&lt;BR /&gt;TotalThreadsInvoluntaryContextSwitches: 583&lt;BR /&gt;TotalThreadsTotalWallClockTime: 5.37s&lt;BR /&gt;TotalThreadsSysTime: 53ms&lt;BR /&gt;TotalThreadsUserTime: 5.20s&lt;BR /&gt;TotalThreadsVoluntaryContextSwitches: 169&lt;BR /&gt;TotalTime: 5.43s&lt;BR /&gt;&amp;gt;&amp;gt; Fragment Instance Lifecycle Timings (0ns)&lt;BR /&gt;&amp;gt;&amp;gt; DataStreamSender (dst_id=78) (1ms)&lt;BR /&gt;&amp;gt;&amp;gt; CodeGen (5.25s)&lt;BR /&gt;CodegenTime: 26ms&lt;BR /&gt;CompileTime: 1.67s &amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; ????&lt;BR /&gt;InactiveTotalTime: 0ns&lt;BR /&gt;LoadTime: 0ns&lt;BR /&gt;ModuleBitcodeSize: 1.9 MiB&lt;BR /&gt;NumFunctions: 729&lt;BR /&gt;NumInstructions: 35,078&lt;BR /&gt;OptimizationTime: 3.51s &amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; ????&lt;BR /&gt;PeakMemoryUsage: 17.1 MiB&lt;BR /&gt;PrepareTime: 66ms&lt;BR /&gt;TotalTime: 5.25s&lt;BR /&gt;&amp;gt;&amp;gt; SORT_NODE (id=49) (94ms)&lt;BR /&gt;&amp;gt;&amp;gt; UNION_NODE (id=0) (93ms)&lt;BR /&gt;&amp;gt;&amp;gt; HASH_JOIN_NODE (id=48) (9ms)&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:17:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67871#M79133</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2022-09-16T13:17:52Z</dc:date>
    </item>
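The workaround described in the question can be sketched as a session-level experiment in Impala SQL (table and column names are the placeholders from the post; the timings are the poster's measurements, not general results):

```sql
-- Baseline: the query pattern with codegen enabled (~6s in this thread).
SELECT x, y, z FROM (
  SELECT x, y, z FROM table1 WHERE blah
  UNION ALL
  SELECT x, y, z FROM table2 WHERE blah
) x
ORDER BY x DESC, y DESC
LIMIT 20001 OFFSET 0;

-- Experiment only: disable codegen for this session (~1s in this thread),
-- then re-enable it; the docs advise against leaving it off in general.
SET DISABLE_CODEGEN=true;
-- ... re-run the same query ...
SET DISABLE_CODEGEN=false;
```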
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67894#M79134</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/1843"&gt;@mauricio&lt;/a&gt;&lt;BR /&gt;&lt;BR /&gt;I think the solution is simple: &lt;SPAN&gt;SET DISABLE_CODEGEN to true; it's also suggested in the Cloudera documentation.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;Source:&amp;nbsp;&lt;A href="https://www.cloudera.com/documentation/enterprise/latest/topics/impala_disable_codegen.html" target="_self"&gt;https://www.cloudera.com/documentation/enterprise/latest/topics/impala_disable_codegen.html&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Good luck.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 10:36:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67894#M79134</guid>
      <dc:creator>AcharkiMed</dc:creator>
      <dc:date>2018-06-05T10:36:10Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67905#M79135</link>
      <description>Thanks, right, I know I can do that, but I'm hoping to figure out the root cause rather than paper over it. Plus it makes me nervous to do so for a whole class of queries/reports; that doc page does say "... Do not otherwise run with this setting turned on, because it results in lower overall performance."</description>
      <pubDate>Tue, 05 Jun 2018 17:04:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67905#M79135</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-05T17:04:09Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67913#M79136</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/1843"&gt;@mauricio&lt;/a&gt; I agree it's not great to turn it on globally. I'd be interested in seeing the query profile to understand what happened. We've made some codegen-time improvements, but there are still remaining issues, so it would be good to see whether it's something we've already fixed.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 21:17:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67913#M79136</guid>
      <dc:creator>Tim Armstrong</dc:creator>
      <dc:date>2018-06-05T21:17:48Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67914#M79137</link>
      <description>&lt;P&gt;Yeah, we definitely wouldn't want to do that globally.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We tried to do&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;set DISABLE_CODEGEN=true;&lt;/PRE&gt;&lt;P&gt;right before our SQL in the report, but the driver fails with&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;[Simba][JDBC](11300) A ResultSet was expected but not generated&lt;/PRE&gt;&lt;P&gt;which is unfortunate; I had thought we could specify any of these options right in the SQL.&amp;nbsp; Setting it in the JDBC URL is not an option because the same connection is shared by all of our thousands of reports, only about 10% of which are affected by this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11593"&gt;@Tim Armstrong&lt;/a&gt;&amp;nbsp;I tried to guess your Cloudera email and sent you the profile directly.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 22:09:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67914#M79137</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-05T22:09:01Z</dc:date>
    </item>
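The JDBC failure above comes from sending the option and the query as one semicolon-separated string; a sketch of the split the driver appears to expect, assuming each statement is issued as its own execute() call from the client (SELECT 1 is a stand-in for the report query):

```sql
-- Sent together as one string, this fails with
-- "[Simba][JDBC](11300) A ResultSet was expected but not generated".
-- Instead, send the option and the query as two separate statements:
SET DISABLE_CODEGEN=true;   -- first statement: returns no result set
SELECT 1;                   -- second statement: the actual report query
```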
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67916#M79138</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/1843"&gt;@mauricio&lt;/a&gt; thanks for the profile. I think you might be better off tweaking DISABLE_CODEGEN_ROWS_THRESHOLD instead of using the big hammer of DISABLE_CODEGEN.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The way that option works is that codegen is disabled automatically if the planner detects that no point in the query plan processes at least that many rows per backend. The default is 50,000. E.g. if your query scans 100,000 rows split across three backends (33,333 per backend), codegen will be disabled automatically.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Instead of setting DISABLE_CODEGEN, I'd suggest increasing that value first. Based on the profile you sent me, it looks like something like 400000 might be sufficient, at least for this query.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 23:25:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67916#M79138</guid>
      <dc:creator>Tim Armstrong</dc:creator>
      <dc:date>2018-06-05T23:25:42Z</dc:date>
    </item>
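Tim's alternative above can be tried per session; a sketch (400000 is the estimate from this thread for this particular query, not a general recommendation):

```sql
-- Default threshold is 50000 rows per backend. Raising it makes Impala
-- skip codegen for more small queries while keeping it for big scans.
SET DISABLE_CODEGEN_ROWS_THRESHOLD=400000;
-- Codegen is then disabled automatically whenever no node in the plan is
-- estimated to process at least 400000 rows on any single backend.
```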
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67934#M79139</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11593"&gt;@Tim Armstrong&lt;/a&gt;.&amp;nbsp; Hmm, I can't find that option in the &lt;A href="https://www.cloudera.com/documentation/enterprise/latest/topics/impala_query_options.html" target="_self"&gt;current docs&lt;/A&gt;; is it just undocumented? Or do you mean&amp;nbsp;SCAN_NODE_CODEGEN_THRESHOLD? Because&amp;nbsp;there is at least one node (from an often-used dimension that applies to most queries) where the row estimate is 2.6 million (though after filtering it becomes only a few).&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And even if all scans are under 400K or whatever we set it to, will it help here, considering the slow codegen is in a&amp;nbsp;TOP-N step towards the end?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
...
03:SCAN HDFS 30 48.332ms 103.898ms 17 2.60M 10.93 MB 192.00 MB irdw_prod.media_dim md &lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 16:32:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67934#M79139</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-06T16:32:08Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67944#M79140</link>
      <description>&lt;P&gt;The threshold is actually based on the per-host number of rows, so it's 2.6M / 30 ≈ 86K in the example you provided.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 21:24:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67944#M79140</guid>
      <dc:creator>Tim Armstrong</dc:creator>
      <dc:date>2018-06-06T21:24:09Z</dc:date>
    </item>
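The per-host arithmetic above can be checked with a constant query (Impala's DIV operator does integer division):

```sql
-- 2.6M estimated rows over 30 hosts: 86666 rows per host, which is above
-- the 50000 default threshold but below a 400000 setting.
SELECT 2600000 DIV 30 AS rows_per_host;
```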
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67947#M79141</link>
      <description>Right! OK, will do that then. Thanks Tim.</description>
      <pubDate>Wed, 06 Jun 2018 21:51:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67947#M79141</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-06T21:51:29Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67980#M79142</link>
      <description>&lt;P&gt;&lt;SPAN&gt;FYI&amp;nbsp;&lt;/SPAN&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11593"&gt;@Tim Armstrong&lt;/a&gt;: sadly, setting&amp;nbsp;&lt;SPAN class="s1"&gt;SCAN_NODE_CODEGEN_THRESHOLD to any value had no effect, perhaps because, as I mentioned above, the slow codegen is NOT in a scan node but in a TOP-N towards the end of processing.&amp;nbsp; We are considering setting&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="s1"&gt;DISABLE_CODEGEN=true on the URL for this connection&amp;nbsp;alone (specific to user reports), though we'd need to watch carefully to make sure it doesn't slow down other reports.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="s1"&gt;We'll probably also open a case with our EDH support to try to get to the bottom of why it's slow to begin with.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jun 2018 17:05:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/67980#M79142</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-07T17:05:09Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/69016#M79143</link>
      <description>&lt;P&gt;Never mind my last comment: I was confused because the&amp;nbsp;DISABLE_CODEGEN_ROWS_THRESHOLD setting &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/11593"&gt;@Tim Armstrong&lt;/a&gt; recommended was not yet documented, so I tried using the closest thing I found (SCAN_NODE_CODEGEN_THRESHOLD), which wasn't applicable to our query.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It turns out that, even though not yet documented,&amp;nbsp;&lt;SPAN&gt;DISABLE_CODEGEN_ROWS_THRESHOLD is available in our CDH 5.13 cluster and works as Tim suggested.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jun 2018 23:46:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/69016#M79143</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2018-06-12T23:46:13Z</dc:date>
    </item>
    <item>
      <title>Re: Very slow CodeGen taking 80% of runtime</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/69017#M79144</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/1843"&gt;@mauricio&lt;/a&gt; that's great news! Thanks for the update. We do need to get this documented, though.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 00:12:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Very-slow-CodeGen-taking-80-of-runtime/m-p/69017#M79144</guid>
      <dc:creator>Tim Armstrong</dc:creator>
      <dc:date>2018-06-13T00:12:15Z</dc:date>
    </item>
  </channel>
</rss>

