This is a continuation of a previous post from 2017 titled "Impala Queries Executing long time", in which a Cloudera employee explained why a query can appear to be running when it is not actually running. In summary, he said the paging function in Hue leaves the query open, so it appears to be running.
He also said to fix this you need to set some parameters that will force a timeout to occur:
"You can also set idle query timeout and idle session timeout in impala advance snippet to force timeout for queries running from hue."
Unfortunately, this does not appear to be true, because the query is not in the state required for the timeout to occur.
In our case, we have the following settings in Impala:
When Hue executes a query, it leaves the query open so that users can page through results at their own pace. (Of course, this behavior isn't very useful for DDL statements.) That means that Impala still considers the query to be executing, even if it is not actively using CPU cycles (keep in mind it is still holding memory!). Hue will close the query if explicitly told to, or when the page/session is closed, e.g. using the hue command:
> build/env/bin/hue close_queries --help
Note that Impala has a query option to automatically 'timeout' queries after a period of time, see query_timeout_s. Hue sets this to 10 minutes by default, but you can override it in the hue.ini settings.
One thing to note is that when queries 'time out', they are cancelled but not closed, i.e. the query will remain "in flight" with a CANCELLED status. The reason for this is so that users (or tools) can continue to observe the query metadata (e.g. query profile, status, etc.), which would not be available if the query is fully closed and thus deregistered from the impalad. Unfortunately these cancelled queries may still hold some non-negligible resources, but this will be fixed with IMPALA-1575.
I believe I can see memory being held in the CM Impala query interface, alongside a supposedly running query. I assume that if a query still shows as executing, memory is definitely being held, even though the query is only waiting to send the next page of results.
First, the query_timeout_s parameter has not been set, so whether it is 10 minutes or the 5 minutes our current configuration says, it is still not working:
# Hue will try to close the Impala query when the user leaves the editor page.
# This will free all the query resources in Impala, but also make its results inaccessible.
## close_queries=true

# If > 0, the query will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that query within QUERY_TIMEOUT_S seconds.
## query_timeout_s=600

# If > 0, the session will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that session within QUERY_TIMEOUT_S seconds (default 1 hour).
## session_timeout_s=3600
This parameter is having no effect, correct? Whatever settings are in use are not working: I can see many related settings, but none of them is set to allow a query to stay active for 15 hours.
Also, what about the # close_queries=true option? Will that do what we need?
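To double-check which of these options are actually in effect, here is a minimal sketch that parses a hue.ini-style snippet and reports which options are set. It relies on my understanding that lines starting with "##" are commented-out defaults. The [impala] section name, the sample text, and the active_options helper are assumptions for illustration; a real hue.ini uses nested [[...]] sections that Python's configparser is not designed for, so this is only suitable for a flat excerpt like the one above.

```python
import configparser

# Sample text mirroring the excerpt above. The [impala] section header is an
# assumption; the "##" lines are treated as comments, i.e. not active.
SAMPLE_HUE_INI = """
[impala]
# Hue will try to close the Impala query when the user leaves the editor page.
## close_queries=true
## query_timeout_s=600
## session_timeout_s=3600
"""

OPTIONS = ("close_queries", "query_timeout_s", "session_timeout_s")


def active_options(ini_text, section="impala", names=OPTIONS):
    """Return {name: value}, with None for options that are missing or commented out."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback=None also covers the case where the section itself is absent.
    return {name: parser.get(section, name, fallback=None) for name in names}


if __name__ == "__main__":
    for name, value in active_options(SAMPLE_HUE_INI).items():
        status = value if value is not None else "not set (commented out or missing)"
        print(f"{name} -> {status}")
```

If every option comes back as not set, then whatever timeout is being applied comes from a default somewhere else, not from this file.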
However, I have a question about the parameter below. Will this kill the query and release the results if we set it to an hour, as we need it to be?
# Users will automatically be logged out after 'n' seconds of inactivity.
# A negative number means that idle sessions will not be timed out.
idle_session_timeout=-1
If not, are there any parameters we haven't talked about that could stop resources from being held for this long?
BTW, I could not find this command. Where should I look?
build/env/bin/hue close_queries --help
Theoretically, I could write a script that checks query durations and kills any queries that have been running too long. In my experience, Hue has no API. I wrote a Python app that took a list of users and removed them automatically from CM and Hue. I had to use the Requests module in Python to load the existing users and compare them against the users to be deleted, then create a POST request that deleted the existing users who had left the company. Quite painful but fun... 🙂
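The selection part of such a script could look roughly like the sketch below. It only decides which queries are old enough to kill; how the list of in-flight queries is fetched and how a query is actually cancelled is left out, since that depends on the environment. The record shape (query_id, state, start_time) and the select_stale_queries name are hypothetical, not taken from any Impala, CM, or Hue API. Queries already in the CANCELLED state are skipped, since a timed-out query is cancelled but stays in flight.

```python
from datetime import datetime, timedelta


def select_stale_queries(queries, now, max_age):
    """Return the IDs of queries that started more than max_age before now.

    Each query is a dict with hypothetical keys:
      query_id   -- string identifier
      state      -- status string, e.g. "EXECUTING" or "CANCELLED"
      start_time -- datetime when the query started
    """
    stale = []
    for query in queries:
        # Already-cancelled queries cannot be cancelled again; skip them.
        if query["state"] == "CANCELLED":
            continue
        if now - query["start_time"] > max_age:
            stale.append(query["query_id"])
    return stale


if __name__ == "__main__":
    now = datetime(2019, 1, 1, 18, 0, 0)
    sample = [
        {"query_id": "a1", "state": "EXECUTING", "start_time": now - timedelta(hours=15)},
        {"query_id": "b2", "state": "EXECUTING", "start_time": now - timedelta(minutes=5)},
        {"query_id": "c3", "state": "CANCELLED", "start_time": now - timedelta(hours=20)},
    ]
    # With a one-hour limit, only the 15-hour-old executing query is selected.
    print(select_stale_queries(sample, now, timedelta(hours=1)))
```

Passing now in as an argument, instead of calling datetime.now() inside the function, keeps the check easy to test against fixed sample data.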
Just so you guys know, I'm here on this forum because all the documentation I have read says the two parameters we have set should close the queries and return the resources after the timeout. Neither of the two works, so either the documentation is inaccurate or something else is wrong.