<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Some nodes are way slower on HDFS scan than the other ones during impala SQL query in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Some-nodes-are-way-slower-on-HDFS-scan-then-the-other-ones/m-p/307885#M223383</link>
    <description>&lt;P&gt;One difference is how fast it's reading from disk, i.e. &lt;SPAN class="md-plain"&gt;TotalRawHdfsReadTime&lt;/SPAN&gt;. In CDH 5.12 that includes both the time spent fetching metadata from the HDFS namenode and the time spent actually reading the data off disk. If it's only slow on one node, that probably rules out HDFS namenode slowness, which is a common cause, so it's probably actually slower doing the I/O. Note: in CDH 5.15 we split the namenode RPC time out into &lt;SPAN class="md-plain"&gt;TotalRawHdfsOpenTime&lt;/SPAN&gt; to make it easier to debug things like this.&lt;BR /&gt;&lt;BR /&gt;I don't know exactly why I/O would be slower on that one node; it might require inspecting the host to see what's happening and whether there's more CPU or I/O load on that host.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We've seen that happen when one node is more heavily loaded than the others because of some kind of uneven data distribution, e.g. one file is very frequently accessed, perhaps a dimension table that is referenced in many queries.
That can sometimes be addressed by setting&amp;nbsp;SCHEDULE_RANDOM_REPLICA as a query hint or query option (see &lt;A href="https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_hints.html" target="_blank" rel="noopener"&gt;https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_hints.html&lt;/A&gt; and &lt;A href="https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_schedule_random_replica.html" target="_blank" rel="noopener"&gt;https://docs.cloudera.com/documentation/enterprise/latest/topics/impala_schedule_random_replica.html&lt;/A&gt;), or even by enabling HDFS caching for the problematic table (HDFS caching spreads load across all cached replicas).&lt;BR /&gt;&lt;BR /&gt;Another possible cause, based on that profile, is that the query is competing for scanner threads with other queries running on the same node: &lt;SPAN class="md-plain"&gt;AverageScannerThreadConcurrency&lt;/SPAN&gt; is lower in the slow case. That can either be because other concurrent queries grabbed scanner threads first (there's a global soft limit of 3x the number of CPUs per node) or because&lt;/P&gt;</description>
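    <!-- Editor's note: a minimal Impala SQL sketch of the remedies described above. The table
         name dim_table and cache pool name hot_pool are hypothetical placeholders. -->

```sql
-- As a query option, applied to every query in the session:
SET SCHEDULE_RANDOM_REPLICA=true;

-- Or as a per-table plan hint, so scan ranges for this table are assigned
-- to a random replica instead of piling onto the same host each time
-- (dim_table is a hypothetical hot dimension table):
SELECT COUNT(*)
FROM dim_table /* +SCHEDULE_RANDOM_REPLICA */;

-- HDFS caching for a hot table spreads reads across all cached replicas
-- (hot_pool is a hypothetical cache pool created with hdfs cacheadmin):
ALTER TABLE dim_table SET CACHED IN 'hot_pool' WITH REPLICATION = 3;
```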
    <pubDate>Wed, 16 Dec 2020 21:47:25 GMT</pubDate>
    <dc:creator>Tim Armstrong</dc:creator>
    <dc:date>2020-12-16T21:47:25Z</dc:date>
  </channel>
</rss>

