<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Some nodes are way slower on HDFS scan than the other ones during Impala SQL query in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Some-nodes-are-way-slower-on-HDFS-scan-then-the-other-ones/m-p/307856#M223371</link>
    <description>&lt;P&gt;Hello guys, we are experiencing a slow HDFS scan issue: a query runs on 17 nodes, and in the query profile some nodes are way slower than the others.&lt;BR /&gt;Switching off the slowest node just causes another node, which used to be average, to become the slowest.&lt;/P&gt;&lt;P&gt;We are using &lt;STRONG&gt;CDH 5.10&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Here are parts of the query profile for the problematic nodes. I've taken the slowest and the fastest one:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;THE SLOWEST NODE&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;THE FASTEST NODE&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="50%"&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;Hdfs Read Thread Concurrency Bucket: 0:56.2% 1:36.5% 2:7.299% 3:0% 4:0% 5:0% 6:0% 7:0%&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;File Formats: PARQUET/SNAPPY:4025&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;AverageHdfsReadThreadConcurrency: 0.51&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;AverageScannerThreadConcurrency: 2.72&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;BytesRead: 2.7 GiB&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;BytesReadDataNodeCache: 0 B&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;BytesReadLocal: 2.7 GiB&lt;/SPAN&gt;&lt;/U&gt; &lt;SPAN class="md-plain"&gt;BytesReadRemoteUnexpected: 0 B&lt;/SPAN&gt; &lt;U&gt;&lt;SPAN class="md-plain"&gt;BytesReadShortCircuit: 2.7 GiB&lt;/SPAN&gt;&lt;/U&gt; &lt;SPAN class="md-plain"&gt;DecompressionTime: 4.25s&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;InactiveTotalTime: 0ns&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;MaxCompressedTextFileLength: 0 
B&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumColumns: 23&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumDisksAccessed: 4&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumRowGroups: 175&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;NumScannerThreadsStarted: 3&lt;/SPAN&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;PeakMemoryUsage: 133.9 MiB&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;PerReadThreadRawHdfsThroughput: 65.6 MiB/s&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;RemoteScanRanges: 0&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;RowBatchQueueGetWaitTime: 1.1m&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;SPAN class="md-plain"&gt;RowBatchQueuePutWaitTime: 0ns&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;RowsRead: 192,701,761&lt;/SPAN&gt;&lt;/U&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;RowsReturned: 18,748&lt;/SPAN&gt;&lt;/U&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;RowsReturnedRate: 294 per second&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;SPAN class="md-plain"&gt;ScanRangesComplete: 175&lt;/SPAN&gt; &lt;SPAN class="md-pair-s md-expand"&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsInvoluntaryContextSwitches: 24,401&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN class="md-pair-s "&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsTotalWallClockTime: 3.1m&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;SPAN class="md-plain"&gt;MaterializeTupleTime(): 1.9m&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;ScannerThreadsSysTime: 17.81s&lt;/SPAN&gt; 
&lt;SPAN class="md-plain"&gt;ScannerThreadsUserTime: 1.6m&lt;/SPAN&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsVoluntaryContextSwitches: 19,141&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;TotalRawHdfsReadTime(): 41.81s&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;TotalReadThroughput: 39.7 MiB/s&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;SPAN class="md-plain"&gt;TotalTime: 1.1m&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD width="50%"&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;Hdfs Read Thread Concurrency Bucket: 0:86.84% 1:13.16% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0%&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;File Formats: PARQUET/SNAPPY:4209&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;AverageHdfsReadThreadConcurrency: 0.13&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;AverageScannerThreadConcurrency: 5.92&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;BytesRead: 2.7 GiB&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-plain"&gt;BytesReadDataNodeCache: 0 B&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;BytesReadLocal: 2.7 GiB&lt;/SPAN&gt;&lt;/U&gt; &lt;SPAN class="md-plain"&gt;BytesReadRemoteUnexpected: 0 B&lt;/SPAN&gt; &lt;U&gt;&lt;SPAN class="md-plain"&gt;BytesReadShortCircuit: 2.7 GiB&lt;/SPAN&gt;&lt;/U&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;DecompressionTime: 3.50s&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;InactiveTotalTime: 0ns&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;MaxCompressedTextFileLength: 0 B&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumColumns: 23&lt;/SPAN&gt; 
&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumDisksAccessed: 4&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-plain"&gt;NumRowGroups: 183&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;NumScannerThreadsStarted: 6&lt;/SPAN&gt; &lt;SPAN class="md-pair-s "&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;PeakMemoryUsage: 329.9 MiB&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;PerReadThreadRawHdfsThroughput: 529.1 MiB/s&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;SPAN class="md-plain"&gt;RemoteScanRanges: 0&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;RowBatchQueueGetWaitTime: 14.64s&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;SPAN class="md-plain"&gt;RowBatchQueuePutWaitTime: 0ns&lt;/SPAN&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;RowsRead: 192,490,029&lt;/SPAN&gt;&lt;/U&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;U&gt;&lt;SPAN class="md-plain"&gt;RowsReturned: 21,148&lt;/SPAN&gt;&lt;/U&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;RowsReturnedRate: 1437 per second&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;SPAN class="md-plain"&gt;ScanRangesComplete: 183&lt;/SPAN&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsInvoluntaryContextSwitches: 7,158&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s 
"&gt;&lt;SPAN class="md-pair-s md-expand"&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsTotalWallClockTime: 1.9m&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;MaterializeTupleTime(): 1.5m&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;ScannerThreadsSysTime: 13.94s&lt;/SPAN&gt; &lt;SPAN class="md-plain"&gt;ScannerThreadsUserTime: 1.2m&lt;/SPAN&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;ScannerThreadsVoluntaryContextSwitches: 47,621&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;TotalRawHdfsReadTime: 5.15s&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;STRONG&gt;&lt;SPAN class="md-plain"&gt;TotalReadThroughput: 143.4 MiB/s&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="md-pair-s"&gt;&lt;SPAN class="md-pair-s "&gt;&lt;SPAN class="md-plain"&gt;TotalTime: 14.71s&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The other nodes took about 20 seconds on average, and the slowest one was far behind all the others.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As we can see, &lt;STRONG&gt;BytesReadLocal&lt;/STRONG&gt; is the same, and it is the same on the rest of the nodes. But &lt;STRONG&gt;PerReadThreadRawHdfsThroughput, RowBatchQueueGetWaitTime, RowsReturnedRate, ScannerThreadsInvoluntaryContextSwitches&lt;/STRONG&gt; differ greatly.&lt;/P&gt;&lt;P&gt;The strange part is that the same node can be fast in terms of query time and throughput until we stop &lt;STRONG&gt;impalad&lt;/STRONG&gt; on the node that is currently the slowest.&lt;BR /&gt;Then, with almost the same amount of data scanned (removing a single node changes &lt;STRONG&gt;BytesReadLocal&lt;/STRONG&gt; from e.g. 
2.5G to 2.7G), its speed degrades dramatically, becoming 2 or 3 times worse.&lt;BR /&gt;&lt;BR /&gt;Does anyone have an idea what could be wrong?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance!&lt;/P&gt;</description>
    <pubDate>Wed, 16 Dec 2020 16:42:53 GMT</pubDate>
    <dc:creator>hl-man</dc:creator>
    <dc:date>2020-12-16T16:42:53Z</dc:date>
  </channel>
</rss>

