Member since
06-03-2016
06-03-2016
05:14 AM
We already shared the exception from the Impala logs of the affected node. The "ThriftServer" error, however, was logged about 90 minutes after the connection to the node was lost. There is no other error or exception in the Impala log. Please let me know if there is another log we should check. This has happened multiple times. On one node the service is reported as down, but we still see the process running (see attachment); even kill -9 <pid> does not terminate it. Also, when one node is affected, my application running on WebLogic 11g loses all its connections, and we have to restart the application servers to get them back. This is becoming annoying. Please let me know what we can do to get to the root cause of the problem.
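For anyone debugging the "kill -9 does nothing" symptom: on Linux, a process that ignores SIGKILL is often in uninterruptible sleep (state D, usually blocked on I/O) or is already a zombie (state Z). A minimal sketch for reading that state from /proc follows. The helper names are hypothetical, and it assumes a Linux host where /proc/<pid>/stat is readable.

```python
from pathlib import Path

# Single-letter process states as reported in /proc/<pid>/stat (Linux).
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually blocked on I/O)",
    "Z": "zombie",
    "T": "stopped",
}


def parse_state(stat_text: str) -> str:
    """Return the state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so we split after the LAST ')'.
    """
    after_comm = stat_text[stat_text.rfind(")") + 1:]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the state letter of a live process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())


def describe(state: str) -> str:
    """Map a state letter to a readable description."""
    return STATE_NAMES.get(state, f"other ({state})")
```

Calling describe(process_state(pid)) with the impalad PID would show whether the process is stuck in state D; if so, the cause is more likely in the kernel or storage layer than in Impala itself.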
06-03-2016
03:56 AM
1- When we started the Impala service it started fine and was working normally; please see the statestored.info excerpt below (I have removed my actual server name). Suddenly, on 02-Jun at 07:53:09, the statestore was unable to send heartbeats to the node. Then at 9:30 we see the ThriftServer error I posted earlier.
2- So nothing was running on port 22000.

namenode->statestored.info
I0506 14:46:35.929142 4306 statestore.cc:370] Registering: impalad@<node name which got impacted>:22000
I0506 14:46:35.929231 4306 statestore.cc:393] Subscriber 'impalad@<node name which got impacted>:22000' registered (registration id: fc490552ddd59468:113b9a75e4b55787)
I0602 07:53:09.326907 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:13.325565 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:17.321043 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:21.320971 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:25.320952 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:29.320972 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:33.320968 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:37.320973 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:41.316970 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:45.316970 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:49.316972 4153 statestore.cc:690] Unable to send heartbeat message to subscriber impalad@<node name which got impacted>:22000, received error: RPC timed out
I0602 07:53:49.316988 4153 statestore.cc:702] Subscriber 'impalad@<node name which got impacted>:22000' has failed, disconnected or re-registered (last known registration ID: impal$
I0602 10:09:16.522661 17361 statestore.cc:370] Registering: impalad@<node name which got impacted>:22000
I0602 10:09:16.522711 17361 statestore.cc:393] Subscriber 'impalad@<node name which got impacted>:22000' registered (registration id: f45a9be0d38a32c:63a32de772688380)
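To line up the heartbeat failures with other logs (OS, network, impalad), it can help to extract the exact failure window from statestored.info. Below is a minimal sketch, assuming the log lines follow the layout shown above (severity letter, MMDD, time, thread id, source file and line, message). The function name is hypothetical, and the year is an assumption because the log prefix does not carry one.

```python
import re
from datetime import datetime

# Layout seen in the excerpt: "I0602 07:53:09.326907 4153 statestore.cc:690] message"
LOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)

HEARTBEAT_ERROR = "Unable to send heartbeat message"


def heartbeat_failure_window(lines, year=2016):
    """Return (first, last, count) for heartbeat failures, or None if there are none.

    `year` is an assumption: the log prefix only contains month and day.
    """
    stamps = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and HEARTBEAT_ERROR in match.group("msg"):
            text = f"{year}{match.group('mmdd')} {match.group('time')}"
            stamps.append(datetime.strptime(text, "%Y%m%d %H:%M:%S.%f"))
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)
```

Passing the lines of statestored.info to heartbeat_failure_window gives the first and last failed heartbeat and how many there were, which narrows the time range to search in the other logs.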
06-03-2016
01:29 AM
Hi, We are facing the same exception on one of the nodes of a running cluster. We are running Impala version 2.3. As per the update above, this issue should have been resolved in Impala 1.4. Please help, as it is impacting the stability of the cluster: all nodes stop responding even though the problem is with only one node.
Exception:
E0602 09:30:30.544390 18138 logging.cc:120] stderr will be logged to this file.
E0602 09:30:39.605379 18220 thrift-server.cc:160] ThriftServer 'backend' (on port: 22000) exited due to TException: Could not $
E0602 09:30:39.605533 18138 thrift-server.cc:149] ThriftServer 'backend' (on port: 22000) did not start correctly
E0602 09:30:39.605597 18138 impalad-main.cc:76] ThriftServer 'backend' (on port: 22000)
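The TException text above is truncated, so the actual cause is not visible in this excerpt. One thing that may be worth ruling out (an assumption, not something the log confirms) is that something else was already listening on port 22000 when impalad tried to start its backend server. A minimal sketch of such a check, with a hypothetical helper name:

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 22000 is the backend port named in the log above; "localhost" is an assumption.
    print(port_is_listening("localhost", 22000))
```

If this returns True while the impalad service is reported as down, a leftover process is probably still holding the port, which would be consistent with the stuck process mentioned in the other post.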