03-22-2019 10:55 AM
I use the Impala web UI's /queries tab to get runtime metrics for queries. I'm a little confused about the 'duration' and 'waiting time' metrics. What does each of them measure?
I have also timed the query from the client side, from the moment the statement is submitted until all rows are fetched (I use the Ibis framework to submit queries to Impala), and I see a much smaller runtime than the one reported in the UI. Why is that? Are the reported metrics perhaps computing time aggregated across all impalad nodes?
03-25-2019 09:20 AM
What version of Impala are you using? I suspect the meaning of "duration" might have changed in IMPALA-1575 / IMPALA-5397.
In general, it's possible that your definition of duration is different from Impala's. Depending on the version, Impala might include the time until the query has actually been closed (which would include fetching rows and releasing all resources).
I *think* the waiting time is the difference between the current time and the time the query was last actively being processed. So this value could be high if the query has completed but the client has not closed it (which is why "waiting time" shows up in the section "waiting to be closed").
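To make the two definitions concrete, here is a small Python model of them. It only encodes my guess above; it is not Impala's implementation, and the names `QueryTimes`, `start`, `last_active` and `end` are made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryTimes:
    """Hypothetical timestamps (in seconds) for one query."""
    start: float                  # statement submitted
    last_active: float            # last time the query was actively processed
    end: Optional[float] = None   # set only once the client closes the query


def duration(q: QueryTimes, now: float) -> float:
    # Assumption: duration keeps growing until the query is closed.
    end = q.end if q.end is not None else now
    return end - q.start


def waiting_time(q: QueryTimes, now: float) -> float:
    # Assumption: time elapsed since the query was last actively processed.
    return now - q.last_active


# All rows fetched at t=2s, but the client never closed the query.
q = QueryTimes(start=0.0, last_active=2.0)
print(duration(q, now=10.0), waiting_time(q, now=10.0))
```

Under these assumptions, a client that stops its timer at t=2s would see 2 seconds, while the UI at t=10s would show a duration of 10 seconds and a waiting time of 8 seconds.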
03-28-2019 10:19 AM
The Impala version I'm using is 2.11, so I have those changes there.
One thing I noticed: when the query is submitted from impala-shell, the duration shown in the UI seems to match the duration reported once all rows have been fetched. That does not seem to be the case when we time the query ourselves in the client that submitted it.
As you said, Impala is probably still processing, closing, or cleaning up the query at the point where the client has already fetched all the results and printed the elapsed time.
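For anyone who wants to check this on their side, a rough sketch of a client-side timer that records both points separately. The `submit`, `fetch_all` and `close` callables are placeholders for whatever your client library provides (they are not Ibis or Impala API names), and `time_query` is just a helper name I picked.

```python
import time


def time_query(submit, fetch_all, close, clock=time.perf_counter):
    """Return (rows, seconds_until_fetched, seconds_until_closed).

    submit, fetch_all and close are hypothetical callables that wrap
    the client library in use; clock can be swapped out for testing.
    """
    t0 = clock()
    handle = submit()             # send the statement
    rows = fetch_all(handle)      # pull every row back to the client
    t_fetched = clock()
    close(handle)                 # explicitly close / release the query
    t_closed = clock()
    return rows, t_fetched - t0, t_closed - t0


# Stand-in callables so the sketch runs without a cluster.
rows, fetched_s, closed_s = time_query(
    submit=lambda: "handle",
    fetch_all=lambda h: [(1,), (2,)],
    close=lambda h: None,
)
print(len(rows), fetched_s <= closed_s)
```

If the UI duration tracks the second number rather than the first, that would support the explanation that the extra time comes from the query staying open after the fetch.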
Thanks for your answer,