Created on 12-27-2018 06:00 PM - edited 09-16-2022 07:01 AM
We have CDH 5.15 deployed and use Impala for analytic batch jobs.
While using Impala, we found that even a very simple query takes a long time to finish.
For example, we issued "select count(*) from shdata.s76_bat_mg_biz_data", and it ran for about 4.8 hours.
In the query details, the query timeline shows that the "unregister query" step took 4.8h, while all other steps were very fast (milliseconds). How can we fix this issue to make better use of the system?
Created 12-29-2018 09:51 PM
Created 12-28-2018 07:54 AM
It's unlikely that the query is executing that long. Most likely the client you are using is delayed in closing the query.
Created 12-28-2018 11:58 PM
Created 12-29-2018 12:50 AM
Yes, we use Hue as a query interface very often.
What we are concerned about is whether a query that stays open in Hue for that long will occupy the concurrency we have in Impala, since we have admission control enabled.
Created 12-31-2018 01:54 PM
On CDH 5.15, in most cases such queries won't hold onto resources in admission control, unless the query isn't cancelled and the client (i.e. Hue) doesn't fetch all of the results.
Enabling the timeouts suggested by Eric helps ensure that queries get cancelled in a timely manner.
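On the client side, the pattern that avoids a lingering query is to fetch every row and then close the cursor and connection explicitly. Here is a minimal sketch of that pattern. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table name `demo` and the functions `count_rows` and `run_demo` are made up for illustration. A Python DB-API driver for Impala would presumably be used in a similar way, but that is an assumption and is not tested here.

```python
import sqlite3
from contextlib import closing


def count_rows(conn, table):
    """Run a COUNT(*) query, drain the result set, and release the cursor."""
    # closing() guarantees cursor.close() runs even if the query raises.
    with closing(conn.cursor()) as cur:
        # The table name is interpolated only because it is a fixed demo value.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        rows = cur.fetchall()  # fetch everything so no results stay pending
    return rows[0][0]


def run_demo():
    # An in-memory sqlite3 database stands in for the real query engine.
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("CREATE TABLE demo (id INTEGER)")
        conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(5)])
        return count_rows(conn, "demo")
```

The point of the sketch is only the ordering: execute, fetch all rows, close the cursor, close the connection. An interactive tool such as Hue keeps the session open on the user's behalf, which is why the server-side timeouts matter there.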
Created 01-24-2019 10:37 PM
I've set the timeout to 30, but it seems to have had no effect.
On the impalad /sessions page, I found that the query session's idle timeout (s) is 1800.
In /varz, both the idle session timeout and the idle query timeout are 30.
Created 08-16-2019 07:57 AM
It's a bit late in the game, but I'm running into the same problem: the query appears to be running for hours even though the first row is fetched in seconds. This means it is not actually running, although the list of Impala queries says it is. As a previous poster stated, the user did not close the session. I have just noticed that the last query executed in a session is held in a running state. Once another query is executed or the session is closed, the resources are released and the query is marked as finished.
I have another post about this same issue. Neither of these two parameters has helped:
-idle_session_timeout=1500
-idle_query_timeout=1500
So my conclusion is that either the documentation is not accurate in what it says about these parameters, or there's a bug as of 08/16/2019.
If you find out how to close a session on a query automatically using a parameter, let me know...
Created 08-16-2019 10:30 AM
@pollard the documentation is accurate; many people use those flags successfully. I wouldn't want to speculate about what's happening in your case. If you include a query profile, that can help to diagnose the problem.
We've seen things like this happen when there's a client polling the query for status and keeping it alive (the timeout is since the last time the client performed an operation on the query or session).
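To make the "since the last operation" point concrete, here is a small simulation of an idle timer that resets on every client operation. It is only a sketch of the behaviour described above, not Impala's actual implementation; `IdleTimer` and `expires_at` are hypothetical names, and time is modelled in whole seconds.

```python
class IdleTimer:
    """Tracks idleness as seconds since the last client operation."""

    def __init__(self, timeout_s, now=0):
        self.timeout_s = timeout_s
        self.last_activity = now

    def touch(self, now):
        # Any client operation (fetch, status poll, ...) resets the clock.
        self.last_activity = now

    def is_expired(self, now):
        return now - self.last_activity >= self.timeout_s


def expires_at(timeout_s, poll_times, horizon):
    """Return the first whole second at which the timer expires, or None."""
    timer = IdleTimer(timeout_s)
    polls = set(poll_times)
    for t in range(horizon + 1):
        if t in polls:
            timer.touch(t)
        if timer.is_expired(t):
            return t
    return None
```

With a 30-second timeout and no client activity, `expires_at(30, [], 120)` returns 30. With a client polling every 10 seconds, `expires_at(30, range(0, 121, 10), 120)` returns `None`: the timeout never fires for as long as the polling continues.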
Created 08-16-2019 10:30 AM
This can also happen if the query is returning a lot of rows, or if the client is very slow at fetching rows.