04-30-2018 04:14 PM - edited 04-30-2018 04:54 PM
I get this message when trying to download the results of a query that has been sitting for a while (probably more than 10 minutes):
Query db41f6674e5688ca:cc35f3ad00000000 expired due to client inactivity (timeout is 10m)
What setting controls how long these results stick around? We have the following set on our Impala daemons, which I thought would prevent sessions and results from expiring, but that doesn't appear to be the case:
|idle_session_timeout (int32)||The time, in seconds, that a session may be idle for before it is closed (and all running queries cancelled) by Impala. If 0, idle sessions are never expired.||0||0|
|idle_query_timeout (int32)||The time, in seconds, that a query may be idle for (i.e. no processing work is done and no updates are received from the client) before it is cancelled. If 0, idle queries are never expired. The query option QUERY_TIMEOUT_S overrides this setting, but, if set, --idle_query_timeout represents the maximum allowable timeout.||0||0|
We'd like to keep query results around for a long long time if possible.
Do we have to enable proxy load balancing to get Impala to keep query results accessible for longer times?
Version: Cloudera Enterprise 5.13.0
05-01-2018 10:42 AM - edited 05-01-2018 10:46 AM
The QUERY_TIMEOUT_S query option also controls the idle query timeout. Its default can be set as an impalad startup option (--default_query_options='key=value'), or the value can be set via the SET command, as a parameter during session creation, or as a default attached to an admission control resource pool.
Note that resources used to execute the query will be reserved until all the query results are fetched (since Impala is a streaming engine, final results are computed "on the fly" while results are fetched), so it's good practice to fetch the results promptly, and set a timeout to help free resources in case results are not fetched.
05-01-2018 06:21 PM
I've set QUERY_TIMEOUT_S to 604800 (1 week) in the Impala configuration safety valve and restarted Impala and Hue, but it doesn't seem to matter. My query results show up for a while in Hue, then after a few minutes change to 'query failed'.
Looking at the impala queries in CM, they show as
I must be missing something. What is the default behavior for Impala queries run from Hue? Do they expire after 10 minutes?
05-02-2018 10:48 AM
There are also --idle_query_timeout and --idle_session_timeout startup flags that set an upper bound on the expiration. They might also be set.
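To make the interaction between the startup flag and the query option concrete, here is a minimal Python sketch of how the effective idle timeout could be derived. It is only a model of the flag description quoted earlier in this thread (the option overrides the flag, the flag caps the option, 0 means "never expire"), not Impala's actual code, and the function name effective_idle_timeout_s is made up for illustration.

```python
def effective_idle_timeout_s(idle_query_timeout_flag: int,
                             query_timeout_s_option: int) -> int:
    """Model of the idle timeout (in seconds) applied to one query.

    Assumption: this follows the wording of the --idle_query_timeout flag
    description, where 0 means "never expire" and the flag, if set, is the
    maximum allowable timeout. It is an illustration, not Impala source code.
    """
    if idle_query_timeout_flag > 0 and query_timeout_s_option > 0:
        # Both are set: the flag acts as an upper bound on the option.
        return min(idle_query_timeout_flag, query_timeout_s_option)
    # At most one is set: use whichever is non-zero (0 if neither is set).
    return max(idle_query_timeout_flag, query_timeout_s_option)


# Daemon flag left at 0, but the client sets a 10-minute option per query.
print(effective_idle_timeout_s(0, 600))
# Daemon flag of 5 minutes caps a one-week option.
print(effective_idle_timeout_s(300, 604800))
```

Under this model, leaving both daemon flags at 0 does not protect a query from expiring if the client itself sets a non-zero QUERY_TIMEOUT_S when it submits the query.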
05-02-2018 12:39 PM
Yes, indeed Hue sets its own timeout at the query level. You can disable it via:
Note: we strongly recommend keeping it at the 10-minute default, because in certain cases Hue won't close all the queries it sends to Impala. Is the use case that a user runs a long-running query whose results they want to download, is not notified when it finishes, and then finds it has expired when they come back to download it? (https://issues.cloudera.org/browse/HUE-2142 is aimed at improving this experience.)
05-02-2018 02:16 PM - edited 05-02-2018 02:16 PM
You pointed us in the right direction. I was setting it in several incorrect places in CM Impala and Hue. The place that finally worked was:
Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini