Resource manager was getting stopped automatically

Explorer

My ResourceManager stops automatically some time after we start it, whether we start it from the Ambari dashboard or from the command prompt. The log shows:

2016-07-27 13:36:44,294 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@77091c2f
2016-07-27 13:36:44,315 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:44,321 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:36:47,657 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3335ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:36:48,229 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:48,230 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:36:48,230 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:36:48,379 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node6.exp.himalaya.xyz.io/127.0.0.8:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:48,379 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:36:48,381 WARN  zookeeper.ClientCnxn (ClientCnxn.java:run(1146)) - Session 0x0 for server node6.exp.himalaya.xyz.io/127.0.0.8:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2016-07-27 13:36:50,266 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:50,267 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:36:53,604 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3337ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:36:54,252 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:54,252 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:36:54,252 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:36:54,311 ERROR ha.ActiveStandbyElector (ActiveStandbyElector.java:waitForZKConnectionEvent(1104)) - Connection timed out: couldn't connect to ZooKeeper in 10000 milliseconds
2016-07-27 13:36:54,355 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x0 closed
2016-07-27 13:36:54,355 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2016-07-27 13:36:54,361 WARN  ha.ActiveStandbyElector (ActiveStandbyElector.java:reEstablishSession(801)) - org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2016-07-27 13:36:59,362 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@62e5f08b
2016-07-27 13:36:59,363 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node6.exp.himalaya.xyz.io/127.0.0.8:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:36:59,364 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:36:59,364 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:00,276 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:00,277 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:03,614 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3337ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:37:03,778 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:03,778 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:03,778 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:05,439 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:37:05,439 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:06,527 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:06,528 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:09,363 ERROR ha.ActiveStandbyElector (ActiveStandbyElector.java:waitForZKConnectionEvent(1104)) - Connection timed out: couldn't connect to ZooKeeper in 10000 milliseconds
2016-07-27 13:37:09,962 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x0 closed
2016-07-27 13:37:09,962 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2016-07-27 13:37:09,962 WARN  ha.ActiveStandbyElector (ActiveStandbyElector.java:reEstablishSession(801)) - org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2016-07-27 13:37:14,963 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2dd48f21
2016-07-27 13:37:14,965 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:14,965 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:14,965 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:15,273 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:15,274 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:18,611 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3337ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:37:19,184 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node6.exp.himalaya.xyz.io/127.0.0.8:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:19,185 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:37:19,186 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:21,181 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:21,181 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:21,181 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:22,081 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:22,082 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:24,965 ERROR ha.ActiveStandbyElector (ActiveStandbyElector.java:waitForZKConnectionEvent(1104)) - Connection timed out: couldn't connect to ZooKeeper in 10000 milliseconds
2016-07-27 13:37:25,516 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x0 closed
2016-07-27 13:37:25,516 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2016-07-27 13:37:25,516 WARN  ha.ActiveStandbyElector (ActiveStandbyElector.java:reEstablishSession(801)) - org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2016-07-27 13:37:30,517 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@339934ae
2016-07-27 13:37:30,518 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:30,519 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:33,856 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3337ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:37:34,690 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:34,690 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:34,691 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:35,046 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node6.exp.himalaya.xyz.io/127.0.0.8:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:35,047 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:37:35,048 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:36,508 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:36,509 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:39,844 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3335ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:37:40,276 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:40,277 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:40,277 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:40,518 ERROR ha.ActiveStandbyElector (ActiveStandbyElector.java:waitForZKConnectionEvent(1104)) - Connection timed out: couldn't connect to ZooKeeper in 10000 milliseconds
2016-07-27 13:37:40,763 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x0 closed
2016-07-27 13:37:40,763 WARN  ha.ActiveStandbyElector (ActiveStandbyElector.java:reEstablishSession(801)) - org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2016-07-27 13:37:45,764 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@fa07419
2016-07-27 13:37:45,766 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:45,767 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:49,102 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3335ms for sessionid 0x0, closing socket connection and attempting reconnect
2016-07-27 13:37:49,929 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node6.exp.himalaya.xyz.io/127.0.0.8:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:49,930 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node6.exp.himalaya.xyz.io/127.0.0.8:2181, initiating session
2016-07-27 13:37:49,930 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:50,680 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node3.exp.himalaya.xyz.io/127.0.0.249:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:50,680 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node3.exp.himalaya.xyz.io/127.0.0.249:2181, initiating session
2016-07-27 13:37:50,681 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-07-27 13:37:52,450 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:37:52,451 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:37:55,765 ERROR ha.ActiveStandbyElector (ActiveStandbyElector.java:waitForZKConnectionEvent(1104)) - Connection timed out: couldn't connect to ZooKeeper in 10000 milliseconds
2016-07-27 13:37:55,885 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x0 closed
2016-07-27 13:37:55,885 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down
2016-07-27 13:37:55,886 WARN  ha.ActiveStandbyElector (ActiveStandbyElector.java:reEstablishSession(801)) - org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
2016-07-27 13:38:00,887 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=node3.exp.himalaya.xyz.io:2181,node5.exp.himalaya.xyz.io:2181,node6.exp.himalaya.xyz.io:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@6e0e2eec
2016-07-27 13:38:00,888 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server node5.exp.himalaya.xyz.io/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-27 13:38:00,889 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to node5.exp.himalaya.xyz.io/127.0.0.1:2181, initiating session
2016-07-27 13:38:04,226 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3337ms for sessionid 0x0, closing socket connection and attempting reconnect

Thanks.
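
One way to read a log like this is to tally, per ZooKeeper server, how each connection attempt ended. The sketch below is only an illustration: the function name summarize and the outcome labels are made up here, and the patterns are based solely on the message text in the excerpt above, so they may not match other ZooKeeper client versions.

```python
import re
from collections import Counter

# Lines that name the server the client is currently talking to.
SERVER_RE = re.compile(
    r"(?:Opening socket connection to server|Socket connection established to) ([^/\s]+)/"
)

# Message fragments (taken from the excerpt) mapped to hypothetical labels.
OUTCOMES = {
    "Client session timed out": "timed out",
    "Unable to read additional data": "closed by server",
    "unexpected error": "unexpected error",
}


def summarize(lines):
    """Count how connection attempts ended, keyed by (server, outcome)."""
    tally = Counter()
    current = None
    for line in lines:
        match = SERVER_RE.search(line)
        if match:
            # Remember which server the following outcome line refers to.
            current = match.group(1)
            continue
        if current is None:
            continue
        for marker, label in OUTCOMES.items():
            if marker in line:
                tally[(current, label)] += 1
                break
    return tally
```

Applied to the excerpt above (for example summarize(open(path)) with path pointing at a saved copy of the log), the tally shows that every completed attempt against node5 ends in a session timeout, while node3 and node6 drop the connection almost immediately.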

Re: Resource manager was getting stopped automatically

Mentor

@Dinesh E your logs contain server names which may be considered sensitive information. Please edit your post to mask the hostnames.

Re: Resource manager was getting stopped automatically

Explorer

Thanks for letting me know, Artem.

Re: Resource manager was getting stopped automatically

Contributor

Hi @Dinesh E,

It looks like nothing is listening on port 2181 on any of your ZooKeeper nodes. I just stopped my ZooKeeper instances and can see a similar message in my YARN logs:

2016-07-28 04:30:08,097 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server sebstack0101/X.X.X.X:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-28 04:30:08,097 WARN  zookeeper.ClientCnxn (ClientCnxn.java:run(1146)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect

Could you verify that ZooKeeper is running and listening on port 2181?
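
Beyond checking that the port is open, one option is to send ZooKeeper's four-letter command ruok, to which a healthy server normally replies imok. The following is a minimal sketch under some assumptions: the helper name zk_four_letter is made up for illustration, and newer ZooKeeper releases may only answer four-letter commands that are whitelisted in the server configuration, so an empty reply does not by itself prove the server is down.

```python
import socket


def zk_four_letter(host, port=2181, command=b"ruok", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return the raw reply.

    Raises OSError (including a timeout) if the connection cannot be
    established or the server does not answer in time.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        chunks = []
        while True:
            # The server closes the connection after replying, which
            # shows up here as an empty read.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, zk_four_letter("node3.exp.himalaya.xyz.io") would be expected to return b"imok" on a healthy node. A timeout or an empty reply suggests the port accepts connections but the server is not serving requests.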

Re: Resource manager was getting stopped automatically

Explorer

Hi @Sebastian Carroll, thanks for your reply.

ZooKeeper is running fine and listening on port 2181.

[node3.exp ~]$ sudo netstat -nlp| grep 2181
tcp        0      0 :::2181                     :::*                        LISTEN      23954/java
[node3.exp ~]$ ps 23954
  PID TTY      STAT   TIME COMMAND
23954 ?        Sl     0:42 /usr/lib/jvm/jre/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.log.file=zookeeper-zookeeper-server-node3.exp.himalaya.xyz.io.log -Dzookeeper.root.logger=INFO,ROLLIN
[node3.exp ~]$

However, my ZooKeeper Failover Controller (ZKFC) is not running.

Please see: https://community.hortonworks.com/questions/47695/zookeeper-failover-controller-failed-to-start.html...

Thanks

Re: Resource manager was getting stopped automatically

Contributor

The ZKFC issue looks related to this problem. It looks like ZooKeeper has bound to an IPv6 address while your other services are trying to connect over IPv4. When I run the same netstat command, I get:

[root@sebstack0102 ~]# netstat -nlp| grep 2181
tcp        0      0 0.0.0.0:2181   0.0.0.0:*       LISTEN      12573/java

However, I have IPv6 disabled on my test cluster, so I'm not sure this is the cause: by default ZooKeeper should bind in such a way that "any connection to the clientPort for any address/interface/nic on the server will be accepted" (see clientPortBindAddress in the ZooKeeper documentation). The zoo.cfg in your other post isn't forcing IPv6 binding either. There is a bug listed on the Ubuntu site that seems to describe this issue, but unfortunately it doesn't give a resolution beyond a possible permissions problem on /var/lib/zookeeper.

Can you connect with telnet?

$ telnet 127.0.0.1 2181
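
To test the IPv4 versus IPv6 theory more directly, a small script can try a TCP connection to every address a hostname resolves to and report the result per address family. This is only a sketch: probe_families is a hypothetical helper name, and it checks nothing more than whether the TCP connect succeeds. On many Linux systems a listener shown as :::2181 also accepts IPv4 connections, so a successful IPv4 connect would suggest the binding is not the problem.

```python
import socket

FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def probe_families(host, port=2181, timeout=3.0):
    """Try a TCP connect to each address of host and report the outcome.

    Returns a list of (family, address, status) tuples.
    """
    results = []
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        name = FAMILY_NAMES.get(family, str(family))
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((name, sockaddr[0], "connected"))
        except OSError as exc:
            results.append((name, sockaddr[0], f"failed: {exc}"))
    return results
```

Calling probe_families("localhost") on a ZooKeeper node typically tries both 127.0.0.1 and ::1, depending on how the hosts file resolves the name.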

Re: Resource manager was getting stopped automatically

Explorer

When I tried to connect to localhost on port 2181:

[node3.exp ~]$ telnet 127.0.0.1 2181
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

The session connects and then hangs. I get the same behavior when using the IP address and port of node3.

Re: Resource manager was getting stopped automatically

Hi Dinesh,

Were you able to solve this issue?