We recently installed HAProxy as a load balancer for Impala. We have 5 worker/data nodes, each running the Impala daemon.
After four days, when I look at the workload summary, I can only see 4 coordinators/nodes that have run any queries. One node doesn't appear to be serving/accepting queries, but the health of the node/role is good.
Also, the queries don't seem to be evenly distributed across the other four nodes. One has run over 50% of the queries, one about 30%, and the other two about 10% each.
Is there any way for us to identify the cause of these issues?
1. Why doesn't the fifth node run/accept queries?
2. Why is the distribution so uneven?
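
For reference, this is roughly how the per-node shares above could be tallied. It is only a minimal sketch: the node names and the sample list are hypothetical stand-ins (not our real hostnames or an actual export from the workload summary), and it assumes you can get one coordinator name per query from somewhere.

```python
from collections import Counter


def coordinator_shares(expected_nodes, query_coordinators):
    """Return {node: percent of queries it coordinated}, including idle nodes."""
    counts = Counter(query_coordinators)  # missing nodes count as 0
    total = sum(counts[node] for node in expected_nodes)
    return {
        node: (100.0 * counts[node] / total if total else 0.0)
        for node in expected_nodes
    }


def idle_nodes(shares):
    """Nodes behind the proxy that coordinated no queries at all."""
    return sorted(node for node, pct in shares.items() if pct == 0.0)


# Hypothetical data: one entry per query, naming the node that coordinated it.
nodes = ["worker1", "worker2", "worker3", "worker4", "worker5"]
sample = ["worker1"] * 5 + ["worker2"] * 3 + ["worker3"] + ["worker4"]

shares = coordinator_shares(nodes, sample)
for node in nodes:
    print(f"{node}: {shares[node]:.0f}%")
print("idle:", idle_nodes(shares))
```

With the made-up sample it reproduces the pattern we are seeing: one node carrying half the queries and one node with none.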