11-15-2018 07:29 AM
We currently have several classes of users leveraging Impala. We're running into issues where some users create queries that exhaust available Impala memory and then impact other queries, potentially taking them down as we encounter OOM errors.
Our goal is to leverage a default memory limit per query (potentially across pools) to prevent rogue queries from running unchecked. However, when we set this value to some fraction of total Impala memory per node (say, 128GB) we end up with Impala attempting to RESERVE that amount of memory for each query, which is not the effect we're looking for.
Since we're using resource management (not Llama, but Admission Control), I believe the following is relevant:
When resource management is enabled, the mechanism for this option changes. If set, it overrides the automatic memory estimate from Impala. Impala requests this amount of memory from YARN on each node, and the query does not proceed until that much memory is available. The actual memory used by the query could be lower, since some queries use much less memory than others. With resource management, the MEM_LIMIT setting acts both as a hard limit on the amount of memory a query can use on any node (enforced by YARN) and a guarantee that that much memory will be available on each node while the query is being executed.
I guess my question is: can we use resource management for Impala AND have mem_limit actually be a simple limit-per-query and NOT a reservation-per-query? Do we absolutely need to turn off resource management for Impala if we want mem_limit to behave as a limit and not a reservation? If so, what exactly do we need to turn off? Since Llama is no longer relevant in the Impala universe, the actual setting I'm supposed to toggle is a little obscure. Is it actually Admission Control we need to turn off, or something else?
11-15-2018 07:48 AM
Actually, in re-reading what I wrote, I'm further confused...
When resource management is enabled, the mechanism for this option changes. If set, it overrides the automatic memory estimate from Impala. Impala requests this amount of memory from YARN
We are not using Impala on YARN (i.e., Llama). I don't think it's physically possible to use this feature anymore (we're running CDH 5.11). So how can setting the default Impala mem_limit be causing a reservation of memory versus a simple over-the-limit check during execution?
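To make the distinction concrete, here is a toy Python model of the behavior we seem to be observing. It is only a sketch under my own assumptions, not Impala's actual code: the `Node` class, its method names, and the 256GB node capacity are all hypothetical. It assumes that the full mem_limit is counted against each node when a query is admitted (the "reservation"), and that the same value is then used as a hard cap while the query runs (the "limit").

```python
from dataclasses import dataclass, field

GB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical per-node bookkeeping; not an Impala data structure."""
    capacity: int  # memory available to Impala on this node (assumed value)
    admitted: dict = field(default_factory=dict)  # query id -> mem_limit counted at admission

    def reserved(self) -> int:
        # Total memory counted against this node by admitted queries.
        return sum(self.admitted.values())

    def try_admit(self, query_id: str, mem_limit: int) -> bool:
        # Admission-time check: the full mem_limit is counted against the node,
        # regardless of how much memory the query will really use.
        if self.reserved() + mem_limit > self.capacity:
            return False  # the query would be queued or rejected
        self.admitted[query_id] = mem_limit
        return True

    def exceeds_limit(self, query_id: str, actual_usage: int) -> bool:
        # Runtime check: the same mem_limit also acts as a hard cap.
        return actual_usage > self.admitted[query_id]


node = Node(capacity=256 * GB)
print(node.try_admit("q1", 128 * GB))    # True
print(node.try_admit("q2", 128 * GB))    # True
print(node.try_admit("q3", 128 * GB))    # False: node is fully "reserved"
print(node.exceeds_limit("q1", 4 * GB))  # False: q1 is well under its cap
```

In this model only two queries are admitted even if each one really uses a few GB, which is the effect we want to avoid. What we are after is the `exceeds_limit` check alone, without the `try_admit` accounting.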
11-15-2018 10:18 AM
11-15-2018 10:49 AM