Understanding: How does Impala handle concurrent queries?

Explorer

My cluster configuration is as follows: 

  1. 3-node cluster
  2. 128 GB RAM per cluster node
  3. Processor: 16 cores, hyper-threaded, per cluster node


My questions are:

1) If I submit 5 queries, how does Impala execute them concurrently? (How does Impala distribute the load across the cluster?)

2) If the number of queries increases above 10, how is query execution load-balanced?

I am observing unexpected behaviour from Impala when I run queries concurrently, so I want to understand how Impala distributes queries and resources.

4 REPLIES

Guru
1) Whichever query reaches Impala first takes the resources it needs. If those resources are already occupied, subsequent queries will fail with OOM (out-of-memory) errors, assuming Admission Control is not enabled.

2) When you start impala-shell, it connects to one Impala daemon, which we call the coordinator. The coordinator is responsible for planning the query and deciding how to distribute the work to the other Impala daemons. How the work is distributed depends on the query itself. Again, without Admission Control, if you push Impala to its limit, queries will start to fail with OOM errors.

With Admission Control, Impala queries wait in a queue when there are not enough resources to run them, up to a specified timeout.
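
To make the difference concrete, here is a toy Python model of the two behaviours described above. It is only a sketch and not how Impala is implemented: the `Query` fields, the tick-based clock, the single shared memory pool, and the immediate "oom" failure are all simplifying assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    mem_gb: int     # memory the query needs (hypothetical estimate)
    arrival: int    # arrival time, in arbitrary ticks
    duration: int   # run time, in ticks


def simulate(queries, capacity_gb, admission_control, queue_timeout=0):
    """Toy model: returns {query name: 'ok' | 'oom' | 'timeout'}."""
    result = {}
    running = []     # (finish_tick, query)
    queue = deque()  # (enqueue_tick, query); only used with admission control
    pending = sorted(queries, key=lambda q: q.arrival)
    tick = 0
    while pending or queue or running:
        # Release memory held by queries that have finished.
        for item in [r for r in running if r[0] <= tick]:
            running.remove(item)
            result[item[1].name] = "ok"
        used = sum(q.mem_gb for _, q in running)

        # Queued queries get the first chance at freed memory (FIFO).
        while queue:
            enqueued_at, q = queue[0]
            if tick - enqueued_at > queue_timeout:
                queue.popleft()
                result[q.name] = "timeout"
            elif used + q.mem_gb <= capacity_gb:
                queue.popleft()
                running.append((tick + q.duration, q))
                used += q.mem_gb
            else:
                break

        # Handle queries arriving at this tick.
        while pending and pending[0].arrival <= tick:
            q = pending.pop(0)
            if used + q.mem_gb <= capacity_gb and not queue:
                running.append((tick + q.duration, q))
                used += q.mem_gb
            elif admission_control:
                queue.append((tick, q))   # wait instead of failing
            else:
                result[q.name] = "oom"    # no admission control: fail
        tick += 1
    return result


workload = [Query("a", 80, 0, 5), Query("b", 80, 1, 5), Query("c", 40, 1, 2)]
no_ac = simulate(workload, capacity_gb=128, admission_control=False)
with_ac = simulate(workload, capacity_gb=128, admission_control=True, queue_timeout=10)
short_wait = simulate(workload, capacity_gb=128, admission_control=True, queue_timeout=2)
```

In this model, query "b" fails when there is no admission control, completes when it is allowed to queue long enough, and times out when the queue timeout is shorter than the wait for memory.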

Hope that answers some, if not all of your questions.

Explorer

Hi @EricL

Thank you for your response. I want to know how Impala distributes resources when it has multiple concurrent queries to execute. In my case, query performance slows down under concurrent execution. Is there any property I need to set, or anything else I need to do, to get high concurrent performance from Impala?

Regards,

Bishnu

New Contributor

Hi,

If you have not set up a load balancer, you need to set one up. Once it is in place, the queries you run go to the load balancer first, and it distributes them across all the nodes based on the available resources, so the load balancer takes on that responsibility.
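
As a rough illustration of the idea, the Python sketch below rotates client connections across Impala daemons and skips any that do not accept a TCP connection. It is a minimal sketch under stated assumptions, not a production load balancer: the host names and the `RoundRobinPicker` class are hypothetical, and the health check only tests reachability, not available memory or CPU. Port 21050 is Impala's default HiveServer2 (JDBC/ODBC) port. In practice a proxy such as HAProxy is commonly placed in front of the daemons for this purpose.

```python
import itertools
import socket


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobinPicker:
    """Cycle through coordinators, skipping any that fail the health check."""

    def __init__(self, backends, check=is_reachable):
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)
        self._check = check

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            host, port = next(self._cycle)
            if self._check(host, port):
                return host, port
        raise RuntimeError("no reachable Impala coordinator")


# Hypothetical host names for the three cluster nodes.
picker = RoundRobinPicker([
    ("impala-node1", 21050),
    ("impala-node2", 21050),
    ("impala-node3", 21050),
])
```

Each call to `picker.next_backend()` returns the next reachable daemon, so each new client session lands on a different coordinator. Note that this spreads the coordinator role across nodes; how much memory each query may use is still governed by Admission Control.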

Regards

Satish

Explorer

Hi @Burle

Can you suggest a load balancer that can check Impala's health or available resources and distribute the load to the Impala nodes based on that?

Regards,

Bishnu