This is a continuation of a 2017 post titled "Impala Queries Executing long time", in which a Cloudera employee explained why a query can appear to be running when it is not actually doing any work. In summary, he said that the paging function in Hue leaves the query open, so it appears to be running.
He also said that to fix this you need to set some parameters that force a timeout to occur:
"You can also set idle query timeout and idle session timeout in impala advance snippet to force timeout for queries running from hue."
Unfortunately, this does not hold in our case, apparently because the query is not in the state that the timeout requires in order to fire.
In our case, we have the following settings in Impala:
This is set in a field with the label:
"Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)"
I just killed a query that appeared to have been running for 15 hours. The interesting thing about that query is that it had a LIMIT 50 clause.
I can't imagine returning 50 records taking 15 hours in most scenarios...
We regularly see queries exceed the 1 hour we have verbally agreed on as the limit before we kill the job.
The question is rather obvious: since I know these settings do not work in this scenario (and perhaps in similar ones, I'm not sure), what will stop Hue from holding the resources for so long?
When Hue executes a query, it leaves the query open so that users can page through results at their own pace. (Of course, this behavior isn't very useful for DDL statements.) That means that Impala still considers the query to be executing, even if it is not actively using CPU cycles (keep in mind it is still holding memory!). Hue will close the query if explicitly told to, or when the page/session is closed, e.g. using the hue command:
> build/env/bin/hue close_queries --help
Note that Impala has a query option to automatically 'timeout' queries after a period of time, see query_timeout_s. Hue sets this to 10 minutes by default, but you can override it in the hue.ini settings.
One thing to note is that when queries 'time out', they are cancelled but not closed, i.e. the query will remain "in flight" with a CANCELLED status. The reason for this is so that users (or tools) can continue to observe the query metadata (e.g. query profile, status, etc.), which would not be available if the query is fully closed and thus deregistered from the impalad. Unfortunately these cancelled queries may still hold some non-negligible resources, but this will be fixed with IMPALA-1575.
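To make the cancelled-versus-closed distinction above concrete, here is a toy Python model. It is only a sketch of the behavior as described in this thread, not Impala code: ToyQuery, ToyCoordinator and all of their methods are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


@dataclass
class ToyQuery:
    """Toy stand-in for a query registered on an impalad (hypothetical)."""
    query_id: str
    last_activity_s: float  # time of the last compute or fetch, in seconds
    state: State = State.RUNNING


class ToyCoordinator:
    """Toy registry of in-flight queries (hypothetical, for illustration)."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self.in_flight = {}

    def register(self, query):
        self.in_flight[query.query_id] = query

    def expire_idle(self, now_s):
        """Cancel idle queries but keep them registered (still "in flight")."""
        if self.idle_timeout_s <= 0:  # a timeout of 0 or less means "disabled"
            return
        for q in self.in_flight.values():
            idle_for = now_s - q.last_activity_s
            if q.state is State.RUNNING and idle_for > self.idle_timeout_s:
                q.state = State.CANCELLED

    def close(self, query_id):
        """Closing deregisters the query, so its metadata is gone too."""
        self.in_flight.pop(query_id, None)


coord = ToyCoordinator(idle_timeout_s=600)
coord.register(ToyQuery("q1", last_activity_s=0))

coord.expire_idle(now_s=601)
# After the timeout: cancelled, yet still listed as in flight.
after_timeout = (coord.in_flight["q1"].state.value, "q1" in coord.in_flight)

coord.close("q1")
# After an explicit close: no longer listed at all.
after_close = "q1" in coord.in_flight
```

In this model a timed-out query changes state but stays in the registry until something closes it, which is the situation described above where a cancelled query can still be observed (and can still hold resources).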
More information: Hive and Impala queries life cycle
I do see memory being held, I believe, in the CM Impala query interface, alongside a supposedly running query. I assume that if a query still shows as executing, then memory is definitely being held, even though the query is only waiting to send the next page of results.
First, we have not set the query_timeout_s parameter ourselves, so whether the effective value is 10 minutes or the 5 minutes our current configuration shows, it is still not working:
# Hue will try to close the Impala query when the user leaves the editor page.
# This will free all the query resources in Impala, but also make its results inaccessible.
# If > 0, the query will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that query within QUERY_TIMEOUT_S seconds.
# If > 0, the session will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that session within QUERY_TIMEOUT_S seconds (default 1 hour).
This parameter is having no effect, correct? Whatever settings are in use are not working: I can see many related settings, but none of them is set to allow a query to stay active for 15 hours.
Also, what about the # close_queries=true option? Will that do what we need?
However, I have a question about this parameter... Will this kill the query and release the results if we set it to an hour as we need it to be?
# Users will automatically be logged out after 'n' seconds of inactivity.
# A negative number means that idle sessions will not be timed out.
If not, are there any parameters we haven't talked about that could stop resources from being retained for so long?
BTW, I could not find this command. Where should I look?
build/env/bin/hue close_queries --help
In theory, I could write a script that checks query durations and kills any query that has been running too long. In my experience, Hue has no API. I once wrote a Python app that took a list of users and removed them automatically from CM and Hue. I had to use the Requests module in Python to load the existing users and compare them against the list of users to be deleted, then create a POST request that deleted the accounts of users who had left the company. Quite painful but fun...
Just so you guys know, I'm here on this forum because all the documentation I have read says the 2 parameters we have set should close the queries and return the resources after the timeout. Neither of the 2 works. So either the documentation is not accurate or something else is wrong.
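A minimal sketch of the duration check such a script would need, in Python. It only covers the selection logic: how the query list is fetched and how a query is actually cancelled are deliberately left out, since that depends on the deployment. The record layout (the "query_id", "start_time" and "in_flight" keys) and the function name find_long_running are assumptions made for illustration, not a format that CM or Hue is known to return.

```python
from datetime import datetime, timedelta, timezone


def find_long_running(queries, max_age, now=None):
    """Return the IDs of in-flight queries that started more than max_age ago.

    `queries` is a list of dicts with the hypothetical keys "query_id" (str),
    "start_time" (timezone-aware datetime) and "in_flight" (bool).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return [
        q["query_id"]
        for q in queries
        if q["in_flight"] and now - q["start_time"] > max_age
    ]


# Made-up sample data: one fresh query, one 15-hour-old query, one finished query.
now = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    {"query_id": "a", "start_time": now - timedelta(minutes=5), "in_flight": True},
    {"query_id": "b", "start_time": now - timedelta(hours=15), "in_flight": True},
    {"query_id": "c", "start_time": now - timedelta(hours=20), "in_flight": False},
]

# With a 1-hour limit, only the in-flight 15-hour query is selected.
to_kill = find_long_running(sample, max_age=timedelta(hours=1), now=now)
```

Passing `now` explicitly keeps the check repeatable; a scheduled job would normally omit it and let the function use the current time.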