Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

impala slow query caused by scan node

avatar
Explorer

Most of the same Impala queries respond within 20 seconds. But sometimes it takes more than 60 seconds.

I see a hotspot in 'SCAN HDFS' of 'ExecSummary' in slow query profile. (avgTime 6s462ms, maxTime 54s766ms )

 

low response time Query Info:

 - Duration : 13.1s

 - HDFS Scanner Average Read Throughput: 382.2MiB/s

 - Threads: Network Receive Wait Time: 10.0m

 

high response time Query Info:

 - Duration : 1m, 4s

 - HDFS Scanner Average Read Throughput : 988.1 MiB/s

 - Threads: Network Receive Wait Time: 59.5m

 

If HDFS SCAN is the cause of the slow response time, is the 'HDFS Scanner Average Read Throughput' value higher in the slow query?

 

And one more question.

I compared the fast and slow operators of the HDFS_SCAN_NODE stage in the slow query profile.
RowBatchQueueGetWaitTime(54.76s) and ScannerThreadsTotalWallClockTime(54.77s) I wonder why the two metric values are high.

 

HDFS_SCAN_NODE (id=0)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 1:1/4.72 MB
ExecOption: PARQUET Codegen Enabled, Codegen enabled: 1 out of 1
Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0% 8:0% 9:0% 10:0%
File Formats: PARQUET/SNAPPY:1
- AverageHdfsReadThreadConcurrency: 0.00 (0.0)
- AverageScannerThreadConcurrency: 1.00 (1.0)
- BytesRead: 959.6 KiB (982621)
- BytesReadDataNodeCache: 0 B (0)
- BytesReadLocal: 959.6 KiB (982621)
- BytesReadRemoteUnexpected: 0 B (0)
- BytesReadShortCircuit: 959.6 KiB (982621)
- CachedFileHandlesHitCount: 0 (0)
- CachedFileHandlesMissCount: 3 (3)
- CollectionItemsRead: 0 (0)
- DecompressionTime: 2ms (2131729)
- InactiveTotalTime: 0ns (0)
- MaxCompressedTextFileLength: 0 B (0)
- NumColumns: 2 (2)
- NumDictFilteredRowGroups: 0 (0)
- NumDisksAccessed: 1 (1)
- NumRowGroups: 1 (1)
- NumScannerThreadsStarted: 1 (1)
- NumScannersWithNoReads: 0 (0)
- NumStatsFilteredRowGroups: 0 (0)
- PeakMemoryUsage: 3.3 MiB (3430040)
- PerReadThreadRawHdfsThroughput: 1,011.8 MiB/s (1060904048)
- RemoteScanRanges: 0 (0)
- RowBatchQueueGetWaitTime: 54.76s (54763817619)
- RowBatchQueuePutWaitTime: 0ns (0)
- RowsRead: 52,319 (52319)
- RowsReturned: 52,319 (52319)
- RowsReturnedRate: 955 per second (955)
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 1 (1)
- ScannerThreadsTotalWallClockTime: 54.77s (54768078433)
- MaterializeTupleTime(*): 1ms (1634590)
- ScannerThreadsSysTime: 1ms (1897000)
- ScannerThreadsUserTime: 9ms (9870000)
- ScannerThreadsVoluntaryContextSwitches: 14 (14)
- TotalRawHdfsOpenFileTime(*): 3ms (3129744)
- TotalRawHdfsReadTime(*): 926.21us (926211)
- TotalReadThroughput: 10.8 KiB/s (11075)
- TotalTime: 54.77s (54766111629)

HDFS_SCAN_NODE (id=0)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 3:1/4.70 MB
ExecOption: PARQUET Codegen Enabled, Codegen enabled: 1 out of 1
Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0% 4:0% 5:0% 6:0% 7:0% 8:0% 9:0% 10:0%
File Formats: PARQUET/SNAPPY:1
- AverageHdfsReadThreadConcurrency: 0.00 (0.0)
- AverageScannerThreadConcurrency: 0.00 (0.0)
- BytesRead: 956.9 KiB (979887)
- BytesReadDataNodeCache: 0 B (0)
- BytesReadLocal: 956.9 KiB (979887)
- BytesReadRemoteUnexpected: 0 B (0)
- BytesReadShortCircuit: 956.9 KiB (979887)
- CachedFileHandlesHitCount: 0 (0)
- CachedFileHandlesMissCount: 3 (3)
- CollectionItemsRead: 0 (0)
- DecompressionTime: 1ms (1175394)
- InactiveTotalTime: 0ns (0)
- MaxCompressedTextFileLength: 0 B (0)
- NumColumns: 2 (2)
- NumDictFilteredRowGroups: 0 (0)
- NumDisksAccessed: 1 (1)
- NumRowGroups: 1 (1)
- NumScannerThreadsStarted: 1 (1)
- NumScannersWithNoReads: 0 (0)
- NumStatsFilteredRowGroups: 0 (0)
- PeakMemoryUsage: 3.2 MiB (3392266)
- PerReadThreadRawHdfsThroughput: 1.1 GiB/s (1200419705)
- RemoteScanRanges: 0 (0)
- RowBatchQueueGetWaitTime: 8ms (8679355)
- RowBatchQueuePutWaitTime: 0ns (0)
- RowsRead: 52,115 (52115)
- RowsReturned: 52,115 (52115)
- RowsReturnedRate: 4868064 per second (4868064)
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 0 (0)
- ScannerThreadsTotalWallClockTime: 11ms (11603409)
- MaterializeTupleTime(*): 1ms (1465282)
- ScannerThreadsSysTime: 2ms (2006000)
- ScannerThreadsUserTime: 6ms (6044000)
- ScannerThreadsVoluntaryContextSwitches: 14 (14)
- TotalRawHdfsOpenFileTime(*): 2ms (2488801)
- TotalRawHdfsReadTime(*): 816.29us (816287)
- TotalReadThroughput: 0 B/s (0)
- TotalTime: 10ms (10705486)

1 ACCEPTED SOLUTION

avatar
Cloudera Employee
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login
4 REPLIES 4

avatar
Cloudera Employee

All other evidence being equal this scenario suggests very possible disk level problem on the slow host. You could try brute force and restart the Impala Daemon on that host or do some light diagnostics, like hdparm or fio, depending on how much time you want to invest.
host one
- RowsReturned: 52,115 (52115)
- RowsReturnedRate: 4868064 per second (4868064)
host 2
- RowsReturned: 52,319 (52319)
- RowsReturnedRate: 955 per second (955)

 

If these obvious disk checks don't show up anything useful , send full query profile of a fast query and the slow.

avatar
Explorer

@clev 

Is 'RowsReturnedRate' data read from disk?

Compared to the fast 'HDFS Scanner Average Read Throughput' value of 988.1 MiB/s, the 'Threads: Network Receive Wait Time' value was slow at 59.5m, so I did not think it was disk I/O.
Don't you think it's the process of quickly reading from disk and sending it over the network?

 

avatar
Cloudera Employee
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login

avatar
Community Manager

@coco Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks


Regards,

Diana Torres,
Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community: