02-07-2019 09:47 AM
I have an issue with admission control. During heavy loads on Impala I have many queries in "CREATED" (waiting state) in a particular pool but my measurements indicates that the Impala daemon is not running on Max speed - i.e. not hitting the max concurrency.
The setup is:
One application submitting all the impala queries to one of the total of 10 daemons.
The impala has three pools, small medium and large. The large has 3 concurrency, the medium 6 and small 9.
During the heavy load I made snapshots of currently running queries and I have found out, that the large pool was utilized 100% i.e. 3 queries were in running in a given timeframe, and the rest in "CREATED" state.
But then I checked the snapshot for queries submitted to medium pool, I have found out that only two out of 10 queries were running - and there should be 6 running. So all the 8 queries in medium pool was sitting for 15min+ in CREATED state and then slowly one by one finished.
The small pool does not have a problem, those queries are so quick that they never hit 9 concurrency.
Could somehow Impala throttle the overall number of concurrent queries? Why the biggest pool is correctly utilized and the medium pool just partially? Could be the reason of this limit that every query is handled by one Impala Daemon? If yes, then could HA Proxy help to solve this issue?
I have also thought about that maybe the Impala daemon is not reporting correctly the query state (that in the Web UI it appears as CREATED but maybe it is already running?)
My measurements were made from 1-minute snapshots from the daemon's host:25000/queries page.
Thanks for the tips.
02-07-2019 04:02 PM
The observability should get better in CDH6.1 - queued queries are cancellable and have more information available in the profile about the cause of being queued.
There isn't an aggregate limit on concurrency across pools. There is a limit on the number of connections to each impalad - --fe_service_threads. But since the queries were submitted, that proves that the client was at least connected!!
One possibility is if you have "Maximum memory" set on your resource pools, memory-based admission control may be limiting admission based on available memory.
Another possibility is that the queries in the CREATED state aren't queued in admission control, but are rather in planning, e.g. blocked waiting to load metadata.
Do you have any profiles from the queries that were queued for a long time?
02-07-2019 11:58 PM
My answers: the pools are sliced by memory and the concurrency is calculated form the max memory per node. So for example the medium pool have 120GB(10nodes 12GB per node) and the default query memory limit is 2G (so 20GB overall) thus the max running queries is 6. The application itself does not change the memory limit through set, so I assume all the submitted query are running in this case at most 2G per node, so 6x2GB is 12GB per node.
This observation was made during heavy load on the cluster.
I just tested today it on a different condition, when almost nothing ran, and the Impala was able to put all those 6 queries into running state. The test pack contained selected production queries, and it was finished so quickly (under 2 minutes instead of 10-15+ minutes) that I was barely able to see any queue.
So could it be the conclusion that if the node (the coordinator) is under heavy stress, that placement/planning is slowed down significantly?
And if this occurs again, how can I collect the profiles after the query is finished? The profile seems to change during the executions (latencies, record counts) so how can I "catch" the latest version?