About clev

clev · 10-04-2022

I think the network time is a symptom of slow I/O. Compare RowBatchQueueGetWaitTime between the two scans: 8 ms in your fast scan versus 56 seconds in the slow one.

Notes on RowBatchQueueGetWaitTime: Impala scanners internally have a RowBatch queue that allows Impala to decouple I/O from CPU processing. The I/O threads read data into RowBatches and put them into a queue; CPU threads asynchronously fetch data from the queue and process it. RowBatchQueueGetWaitTime is the amount of time CPU threads wait for data to arrive in the queue. For your slow scan, it means the CPU threads were waiting a long time for the I/O threads to read the data.
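To make the counter more concrete, here is a rough Python sketch of the same producer/consumer pattern. It is only an illustration, not Impala's actual code: `run_scan`, `io_delay_s`, and the queue size are made-up names and values, and the "disk read" is simulated with a sleep. The time the consumer spends blocked on the queue plays the role of RowBatchQueueGetWaitTime.

```python
import queue
import threading
import time


def run_scan(io_delay_s, num_batches=5):
    """Return the seconds the consumer (CPU thread) spent waiting on the queue."""
    batches = queue.Queue(maxsize=2)  # bounded queue between I/O and CPU sides

    def io_thread():
        for i in range(num_batches):
            time.sleep(io_delay_s)  # simulated disk read (assumption, not real I/O)
            batches.put(i)
        batches.put(None)  # sentinel: no more batches

    threading.Thread(target=io_thread, daemon=True).start()

    wait_s = 0.0
    while True:
        start = time.perf_counter()
        batch = batches.get()  # blocks until the I/O side delivers a batch
        wait_s += time.perf_counter() - start
        if batch is None:
            break
    return wait_s


fast_wait = run_scan(io_delay_s=0.0)
slow_wait = run_scan(io_delay_s=0.02)
print(f"fast: {fast_wait:.3f}s  slow: {slow_wait:.3f}s")
```

The consumer does the same work in both runs; only the simulated read latency changes, and the wait time grows with it. That is the pattern I read into the 8 ms versus 56 seconds difference.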

clev · 10-03-2022

All other evidence being equal, this scenario suggests a likely disk-level problem on the slow host. You could try brute force and restart the Impala Daemon on that host, or do some light diagnostics with a tool like hdparm or fio, depending on how much time you want to invest.

host one
- RowsReturned: 52,115 (52115)
- RowsReturnedRate: 4868064 per second (4868064)

host 2
- RowsReturned: 52,319 (52319)
- RowsReturnedRate: 955 per second (955)

If these obvious disk checks don't turn up anything useful, send the full query profile of a fast query and of the slow one.
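If you want to spot this kind of skew without eyeballing the profile, something like the following Python sketch could pull the RowsReturnedRate counters out of the text and compare them. The regex is written against the two lines quoted in this post, not against the full profile format, and the threshold of 10x is an arbitrary assumption of mine.

```python
import re

# The two counter lines quoted above, pasted as plain text.
PROFILE_SNIPPET = """
host one - RowsReturned: 52,115 (52115) - RowsReturnedRate: 4868064 per second (4868064)
host 2 - RowsReturned: 52,319 (52319) - RowsReturnedRate: 955 per second (955)
"""

# Capture the raw integer in parentheses after each RowsReturnedRate counter.
RATE_RE = re.compile(r"RowsReturnedRate:[^()\n]*\((\d+)\)")

SKEW_THRESHOLD = 10  # hypothetical cutoff for "worth investigating"

rates = [int(value) for value in RATE_RE.findall(PROFILE_SNIPPET)]
ratio = max(rates) / min(rates)

print(f"rates per host: {rates}")
print(f"fastest/slowest ratio: {ratio:.0f}x")
if ratio > SKEW_THRESHOLD:
    print("Scan rate differs heavily between hosts; check the slow host's disks.")
```

Both hosts return about the same number of rows, so a rate gap of this size points at the host rather than at data skew.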

Member Since	‎02-28-2016 11:40 AM
Last Visited	‎10-10-2023 04:26 PM
Posts	5

Cloudera Community

Re: impala slow query caused by scan node
