The current version of Impala doesn't reliably protect against out-of-memory errors.
In several cases I've seen users run into this out-of-memory scenario unpredictably.
Sometimes it's because their queries are inefficient or stats have not been computed. Sometimes it's because multiple queries are running in parallel and one must be sacrificed. The point is that an out-of-memory error is unpredictable and can happen at any time.
I have some high-priority jobs executing queries via service accounts.
My need is conceptually basic: if/when memory limits are exceeded, I want to ensure that the queries executed by the service accounts are not the ones that get killed.
According to the documentation, the default placement rules put users running queries into resource pools that match their user names, and if no such resource pool exists a default one is created. I created a resource pool for a service account and ran an expensive query as that service account, but the dynamic resource pool management page doesn’t show any usage for the query. It only states “This table and the charts on the right contain metrics from YARN only.” That leaves me guessing whether this is properly configured.
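One check that doesn't depend on that page would be to read the pool name out of the query profile. The following is only a minimal sketch, not a confirmed procedure. It assumes the impyla client is installed, that the daemon accepts HiveServer2 connections on port 21050 without authentication (a Kerberized cluster would need extra connect arguments), and that the text profile contains a "Request Pool:" line. The host name and pool name in the usage note are hypothetical.

```python
import re


def request_pool_from_profile(profile_text):
    """Return the pool named on the profile's 'Request Pool:' line, or None.

    Assumption: the text profile has a line of the form 'Request Pool: <name>'.
    """
    match = re.search(r"^\s*Request Pool:\s*(\S+)", profile_text, re.MULTILINE)
    return match.group(1) if match else None


def run_and_report_pool(host, sql, pool=None, port=21050):
    """Run sql and return the pool the query was admitted to, per its profile.

    If pool is given, it is passed as the REQUEST_POOL query option; otherwise
    the placement rules decide. Connection details here are assumptions.
    """
    from impala.dbapi import connect  # impyla, imported lazily

    conn = connect(host=host, port=port)
    try:
        cur = conn.cursor()
        config = {"REQUEST_POOL": pool} if pool else None
        cur.execute(sql, configuration=config)
        cur.fetchall()  # drain the results so the profile is complete
        return request_pool_from_profile(cur.get_profile())
    finally:
        conn.close()
```

Calling something like run_and_report_pool("impalad.example.com", "SELECT count(*) FROM some_table") while connected as the service account, and comparing the result with the pool I created, would at least show whether placement works, even though it says nothing about which query gets killed under memory pressure.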
It is hard for me to verify whether my configuration and my understanding are correct. Can anyone share screenshots of a properly configured dynamic resource pool? I think I have something roughly correct, but testing such scenarios is elusive and the result is difficult to prove, unless I am missing something.