Created 04-16-2018 09:41 AM
No, I do not have ResourceManager HA enabled.
This is the content of "yarn-yarn-resourcemanager-eureambarislave2.local.eurecat.org.log":
2018-04-15 23:13:33,362 INFO resourcemanager.ResourceManager (LogAdapter.java:info(45)) -
STARTUP_MSG: /************************************************************
STARTUP_MSG: Starting ResourceManager
STARTUP_MSG: user = yarn
STARTUP_MSG: host = eureambarislave2.local.eurecat.org/192.168.0.15
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.6.4.0-91
And this is the content of "yarn-yarn-resourcemanager-eureambarislave2.local.eurecat.org.out":
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
(-i) 64019
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 65536
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Created 04-16-2018 09:41 AM
Was that the entire content? Can you upload the file yarn-yarn-resourcemanager-eureambarislave2.local.eurecat.org.log?
Created 04-16-2018 09:54 AM
Sorry, yes, there is more content in the log file. Please see the error messages below. There seems to be a problem with ZooKeeper, but I do not have any alerts for ZooKeeper.
ZooKeeper seems to be running fine:
# jps -l | grep -i zookeeper
5043 org.apache.zookeeper.server.quorum.QuorumPeerMain
# netstat -anp | grep 2181
tcp6 0 0 :::2181 :::* LISTEN 5043/java
Error log:
2018-04-16 09:16:23,821 ERROR resourcemanager.ResourceManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2018-04-16 09:16:25,315 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server eureambarislave1.local.eurecat.org/192.168.0.10:2181. Will not attempt to authenticate using SASL (unknown error)
2018-04-16 09:16:25,316 INFO zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established, initiating session, client: /192.168.0.15:53808, server: eureambarislave1.local.eurecat.org/192.168.0.10:2181
2018-04-16 09:16:25,316 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(1142)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2018-04-16 09:16:25,417 INFO recovery.ZKRMStateStore (ZKRMStateStore.java:runWithRetries(1227)) - Exception while executing a ZK operation.
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /rmstore
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$1.run(ZKRMStateStore.java:326)
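The jps and netstat checks above show that the ZooKeeper process is up and listening on 2181, but not that the server has joined a working quorum. A minimal sketch of a further check, assuming `nc` (netcat) is installed and that the ZooKeeper version in use answers the `srvr` four-letter command. The function names `zk_srvr` and `zk_mode` are hypothetical helpers, not part of ZooKeeper:

```shell
#!/usr/bin/env bash

# Hypothetical helper: send the "srvr" four-letter command to one
# ZooKeeper server. $1 is the host, $2 the client port (default 2181).
zk_srvr() {
  echo srvr | nc -w 2 "$1" "${2:-2181}"
}

# Hypothetical helper: read "srvr" output on stdin and print only the
# value of its "Mode:" line (for example leader, follower, standalone).
zk_mode() {
  awk -F': ' '/^Mode:/ {print $2}'
}

# Example call, using the hostname from the log above:
# zk_srvr eureambarislave1.local.eurecat.org | zk_mode
```

A server that belongs to a healthy ensemble should report a mode such as leader or follower. Empty output would suggest that the server is not currently serving requests, in which case the connectivity between the ZooKeeper nodes themselves may be worth checking.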
Created 04-16-2018 09:13 AM
By the way, I opened port 8088 as follows:
iptables -I INPUT 1 -p tcp --dport 8088 -j ACCEPT
Is it correct?
Created 04-16-2018 10:13 AM
I solved this problem by opening ports 2888 and 3888, which the ZooKeeper nodes use to communicate with each other.
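For anyone hitting the same issue, the fix can be sketched in the same style as the 8088 rule earlier in the thread. This assumes iptables is the firewall in use and that the ensemble uses ports 2888 and 3888 (they should match the `server.N` lines in zoo.cfg). The function name `zk_quorum_rules` is a hypothetical helper, and it only prints the rules so they can be reviewed before being applied:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print (do not apply) one iptables rule per
# ZooKeeper quorum port. The port numbers are the ones named in the
# post above; adjust them if zoo.cfg uses different values.
zk_quorum_rules() {
  for port in 2888 3888; do
    echo "iptables -I INPUT 1 -p tcp --dport ${port} -j ACCEPT"
  done
}

# Show the rules for review.
zk_quorum_rules
```

After reviewing the output, the rules can be applied as root on each ZooKeeper node, for example with `zk_quorum_rules | sh`. Rules inserted this way are not persistent across reboots unless they are saved with the mechanism your distribution provides.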