
CLOSE_WAIT status choking Impala connection over 21050 port

Hi Community,

We are using an AWS Network Load Balancer (NLB) to balance traffic across 6 Impala daemons.

Recently we started facing an issue where 2 of the Impala daemons stop receiving queries and connections to port 21050 simply hang.

On further investigation we found roughly 5,000 connections in the CLOSE_WAIT state between the LB and the Impala daemon:

tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:64135     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:58169     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:64652     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:62075     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:52393     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:41447     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:47034     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:49452     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:28327     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:52498     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:21168     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:40079     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:35664     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:4191      CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:14935     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:63036     CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:5158      CLOSE_WAIT  11084/impalad       
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:60134     CLOSE_WAIT  11084/impalad  
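
For anyone repeating this check: CLOSE_WAIT means the remote end (here the LB address) has already closed its side of the connection and the local process (impalad) has not yet closed its own socket. The sketch below tallies such sockets per remote address from `netstat -antp`-style output like the lines above. It is only an illustration; the function name, the `SAMPLE` text, and the extra ESTABLISHED line are assumptions added for the example, not part of the original report.

```python
from collections import Counter


def count_close_wait(netstat_output: str, local_port: int = 21050) -> Counter:
    """Count CLOSE_WAIT sockets on local_port, grouped by remote IP."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local remote state [pid/program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "CLOSE_WAIT":
            continue
        # Keep only sockets bound to the port of interest.
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts


# Two lines copied from the output above, plus one made-up ESTABLISHED
# line (hypothetical) to show that other states are ignored.
SAMPLE = """\
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:64135     CLOSE_WAIT  11084/impalad
tcp        1      0 10.XXX.XXX.68:21050     10.XXX.XXX.80:58169     CLOSE_WAIT  11084/impalad
tcp        0      0 10.XXX.XXX.68:21050     10.XXX.XXX.81:40000     ESTABLISHED 11084/impalad
"""

if __name__ == "__main__":
    for remote_ip, n in count_close_wait(SAMPLE).most_common():
        print(remote_ip, n)
```

Running it periodically against live `netstat` output would show how quickly the count grows after a restart and whether every stuck socket points at the same LB address.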

Every time I restart the daemons, the CLOSE_WAIT connections disappear and new connections are established successfully, but after a few minutes the CLOSE_WAIT connections pile up and choke the port again.

Our cluster is Kerberized and TLS/SSL-enabled. The NLB is internal, and only one user connects through it, using the JDBC driver.

I have been stuck on this issue for over a week now, so any suggestions would be very helpful.

Thank you.

3 REPLIES

Hi, what was the fix for this issue?

Community Manager

@MahendraDevu As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks!


Regards,

Diana Torres,
Community Moderator



Hi,

For me, changing both the load balancer's listener protocol and the target group's protocol to TCP did the trick.
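
The reply does not say how the change was made or which protocol was configured before. As a rough sketch only, assuming the NLB is managed with boto3's `elbv2` client, the request parameters for a TCP target group and a TCP listener could be built as below. With a TCP listener the NLB forwards bytes without terminating TLS, so impalad keeps handling TLS itself. The helper names, the resource name, and the ARNs are hypothetical placeholders.

```python
def tcp_target_group_params(name: str, vpc_id: str, port: int = 21050) -> dict:
    """Build keyword arguments for elbv2.create_target_group with Protocol=TCP."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "instance",
    }


def tcp_listener_params(lb_arn: str, target_group_arn: str, port: int = 21050) -> dict:
    """Build keyword arguments for elbv2.create_listener with Protocol=TCP."""
    return {
        "LoadBalancerArn": lb_arn,
        "Protocol": "TCP",
        "Port": port,
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn},
        ],
    }


# Hypothetical usage (requires AWS credentials; all identifiers are placeholders):
#
#   import boto3
#   elbv2 = boto3.client("elbv2")
#   tg = elbv2.create_target_group(**tcp_target_group_params("impala-tcp", "vpc-PLACEHOLDER"))
#   tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]
#   elbv2.create_listener(**tcp_listener_params("LB-ARN-PLACEHOLDER", tg_arn))
```

The Impala daemons would still need to be registered with the new target group before the listener can route traffic to them.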

 

If that does not work for you, could you please share more details about your setup so that the community can help?
