<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Meaning of RowBatchQueueGetWaitTime metric in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Meaning-of-RowBatchQueueGetWaitTime-metric/m-p/67546#M78588</link>
    <description>&lt;P&gt;Can anyone explain what is&amp;nbsp;&lt;SPAN&gt;RowBatchQueueGetWaitTime?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I'm looking into a slow-ish query that takes 2 to 3 seconds to do an HDFS scan on most nodes, and I don't see why it should take that long: 3 or so files per node, only a couple K each, cached (and confirmed all read from cache).&amp;nbsp; The only thing that looks odd is this metric.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here's a relevant sample profile fragment (about the same for all executors):&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&amp;gt;&amp;gt;&amp;gt; HDFS_SCAN_NODE (id=0) (1.96s)
Hdfs split stats (&amp;lt;volume id&amp;gt;:&amp;lt;# splits&amp;gt;/&amp;lt;split lengths&amp;gt;): 18:1/69.15 KB 20:2/142.83 KB
ExecOption: PARQUET Codegen Enabled, Codegen enabled: 3 out of 3
Runtime filters: Not all filters arrived (arrived: [1], missing [0]), waited for 352ms
Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0% 8:0% 9:0% 10:0% 11:0% 12:0% 13:0% 14:0% 15:0%
File Formats: PARQUET/SNAPPY:156
AverageHdfsReadThreadConcurrency: 0.00
AverageScannerThreadConcurrency: 1.00
BytesRead: 228.0 KiB
BytesReadDataNodeCache: 228.0 KiB
BytesReadLocal: 228.0 KiB
BytesReadRemoteUnexpected: 0 B
BytesReadShortCircuit: 228.0 KiB
CachedFileHandlesHitCount: 0
CachedFileHandlesMissCount: 159
DecompressionTime: 188.47us
InactiveTotalTime: 0ns
MaxCompressedTextFileLength: 0 B
NumColumns: 52
NumDictFilteredRowGroups: 0
NumDisksAccessed: 0
NumRowGroups: 3
NumScannerThreadsStarted: 1
NumScannersWithNoReads: 0
NumStatsFilteredRowGroups: 0
PeakMemoryUsage: 499.3 KiB
PerReadThreadRawHdfsThroughput: 0 B/s
RemoteScanRanges: 0
RowBatchQueueGetWaitTime: 1.60s
RowBatchQueuePutWaitTime: 0ns
RowsRead: 426
RowsReturned: 2
RowsReturnedRate: 1 per second
ScanRangesComplete: 3
ScannerThreadsInvoluntaryContextSwitches: 8
ScannerThreadsTotalWallClockTime: 1.89s
MaterializeTupleTime(*): 16ms
ScannerThreadsSysTime: 10ms
ScannerThreadsUserTime: 73ms
ScannerThreadsVoluntaryContextSwitches: 393
TotalRawHdfsReadTime(*): 0ns
TotalReadThroughput: 88.4 KiB/s
TotalTime: 1.96s
&amp;gt;&amp;gt;&amp;gt; Filter 0 (1.00 MB) (0ns)
InactiveTotalTime: 0ns
Rows processed: 0
Rows rejected: 0
Rows total: 426
TotalTime: 0ns
&amp;gt;&amp;gt;&amp;gt; Filter 1 (1.00 MB) (0ns)
InactiveTotalTime: 0ns
Rows processed: 426
Rows rejected: 424
Rows total: 426
TotalTime: 0ns&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-m&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 13:15:40 GMT</pubDate>
    <dc:creator>mauricio</dc:creator>
    <dc:date>2022-09-16T13:15:40Z</dc:date>
    <item>
      <title>Meaning of RowBatchQueueGetWaitTime metric</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Meaning-of-RowBatchQueueGetWaitTime-metric/m-p/67546#M78588</link>
      <description>&lt;P&gt;Can anyone explain what is&amp;nbsp;&lt;SPAN&gt;RowBatchQueueGetWaitTime?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I'm looking into a slow-ish query that takes 2 to 3 seconds to do an HDFS scan on most nodes, and I don't see why it should take that long: 3 or so files per node, only a couple K each, cached (and confirmed all read from cache).&amp;nbsp; The only thing that looks odd is this metric.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here's a relevant sample profile fragment (about the same for all executors):&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&amp;gt;&amp;gt;&amp;gt; HDFS_SCAN_NODE (id=0) (1.96s)
Hdfs split stats (&amp;lt;volume id&amp;gt;:&amp;lt;# splits&amp;gt;/&amp;lt;split lengths&amp;gt;): 18:1/69.15 KB 20:2/142.83 KB
ExecOption: PARQUET Codegen Enabled, Codegen enabled: 3 out of 3
Runtime filters: Not all filters arrived (arrived: [1], missing [0]), waited for 352ms
Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0% 8:0% 9:0% 10:0% 11:0% 12:0% 13:0% 14:0% 15:0%
File Formats: PARQUET/SNAPPY:156
AverageHdfsReadThreadConcurrency: 0.00
AverageScannerThreadConcurrency: 1.00
BytesRead: 228.0 KiB
BytesReadDataNodeCache: 228.0 KiB
BytesReadLocal: 228.0 KiB
BytesReadRemoteUnexpected: 0 B
BytesReadShortCircuit: 228.0 KiB
CachedFileHandlesHitCount: 0
CachedFileHandlesMissCount: 159
DecompressionTime: 188.47us
InactiveTotalTime: 0ns
MaxCompressedTextFileLength: 0 B
NumColumns: 52
NumDictFilteredRowGroups: 0
NumDisksAccessed: 0
NumRowGroups: 3
NumScannerThreadsStarted: 1
NumScannersWithNoReads: 0
NumStatsFilteredRowGroups: 0
PeakMemoryUsage: 499.3 KiB
PerReadThreadRawHdfsThroughput: 0 B/s
RemoteScanRanges: 0
RowBatchQueueGetWaitTime: 1.60s
RowBatchQueuePutWaitTime: 0ns
RowsRead: 426
RowsReturned: 2
RowsReturnedRate: 1 per second
ScanRangesComplete: 3
ScannerThreadsInvoluntaryContextSwitches: 8
ScannerThreadsTotalWallClockTime: 1.89s
MaterializeTupleTime(*): 16ms
ScannerThreadsSysTime: 10ms
ScannerThreadsUserTime: 73ms
ScannerThreadsVoluntaryContextSwitches: 393
TotalRawHdfsReadTime(*): 0ns
TotalReadThroughput: 88.4 KiB/s
TotalTime: 1.96s
&amp;gt;&amp;gt;&amp;gt; Filter 0 (1.00 MB) (0ns)
InactiveTotalTime: 0ns
Rows processed: 0
Rows rejected: 0
Rows total: 426
TotalTime: 0ns
&amp;gt;&amp;gt;&amp;gt; Filter 1 (1.00 MB) (0ns)
InactiveTotalTime: 0ns
Rows processed: 426
Rows rejected: 424
Rows total: 426
TotalTime: 0ns&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-m&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:15:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Meaning-of-RowBatchQueueGetWaitTime-metric/m-p/67546#M78588</guid>
      <dc:creator>mauricio</dc:creator>
      <dc:date>2022-09-16T13:15:40Z</dc:date>
    </item>
    <item>
      <title>Re: Meaning of RowBatchQueueGetWaitTime metric</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Meaning-of-RowBatchQueueGetWaitTime-metric/m-p/88205#M78589</link>
      <description>&lt;P&gt;Impala scanners internally have a RowBatch queue that allows Impala to decouple I/O from CPU processing. The I/O threads read data into RowBatches and put them into the queue, and CPU threads asynchronously fetch the batches from the queue and process them.&amp;nbsp;&lt;SPAN&gt;RowBatchQueueGetWaitTime is the amount of time CPU threads spend waiting for data to arrive in the queue. Essentially, a large value means the CPU threads were waiting a long time for the I/O threads to read the data.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 24 Mar 2019 16:57:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Meaning-of-RowBatchQueueGetWaitTime-metric/m-p/88205#M78589</guid>
      <dc:creator>SahilTakiar</dc:creator>
      <dc:date>2019-03-24T16:57:26Z</dc:date>
    </item>
  </channel>
</rss>

