If we start ZooKeeper from the Hortonworks Ambari GUI, it shows ZooKeeper Server as green and running, run by the zookeeper user, but in the log file millions of warnings are triggered, such as:
WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
This is a big issue because the service does not actually work. If we kill the process and start it manually as the root user with the zkServer.sh start command, it runs without problems and works perfectly.
This matters because ZooKeeper must be run by the zookeeper user. I have checked the permissions and everything looks OK.
I really cannot understand why it works when run as root but not when run as zookeeper. All other services run by specific users (yarn, hdfs, oozie, etc.) work correctly; the only issue is with ZooKeeper.
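For anyone who wants to repeat the ownership check, here is a minimal sketch of what I mean by it. It assumes the directory to inspect is whatever dataDir points to in zoo.cfg; the helper name find_foreign_owned is made up for this post and is not part of ZooKeeper or Ambari.

```python
import os
import pwd


def uid_of(user):
    """Look up the numeric uid of a local account name, e.g. "zookeeper"."""
    return pwd.getpwnam(user).pw_uid


def find_foreign_owned(root, uid):
    """Return every path under root (root included) not owned by uid.

    Hypothetical helper for illustration only. Files left behind by a
    process started as another user (for example root) would show up here.
    """
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are reported by their own owner
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched
```

Calling find_foreign_owned(data_dir, uid_of("zookeeper")) with data_dir set to the dataDir value should return an empty list if ownership is consistent. It would make sense to run the same check on the log directory.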
Does anyone have a clue? Thanks! This is the full log when running in the foreground:
[zookeeper@testnn2 bin]$ ./zkServer.sh start-foreground
ZooKeeper JMX enabled by default
Using config: /usr/hdp/current/zookeeper-server/bin/../conf/zoo.cfg
2019-03-26 17:58:07,869 - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /usr/hdp/current/zookeeper-server/bin/../conf/zoo.cfg
2019-03-26 17:58:07,876 - WARN [main:QuorumPeerConfig@291] - No server failure will be tolerated. You need at least 3 servers.
2019-03-26 17:58:07,876 - INFO [main:QuorumPeerConfig@338] - Defaulting to majority quorums
2019-03-26 17:58:07,883 - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 30
2019-03-26 17:58:07,884 - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 24
2019-03-26 17:58:07,890 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2019-03-26 17:58:07,906 - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2019-03-26 17:58:07,908 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2019-03-26 17:58:07,928 - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2019-03-26 17:58:07,947 - INFO [main:QuorumPeer@992] - tickTime set to 3000
2019-03-26 17:58:07,947 - INFO [main:QuorumPeer@1012] - minSessionTimeout set to -1
2019-03-26 17:58:07,948 - INFO [main:QuorumPeer@1023] - maxSessionTimeout set to -1
2019-03-26 17:58:07,948 - INFO [main:QuorumPeer@1038] - initLimit set to 10
2019-03-26 17:58:08,262 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.150.220:44292
2019-03-26 17:58:08,272 - INFO [Thread-2:QuorumCnxManager$Listener@506] - My election bind port: testnn2.local/192.168.150.220:3888
2019-03-26 17:58:08,295 - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@747] - LOOKING
2019-03-26 17:58:08,301 - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@815] - New election. My id = 2, proposed zxid=0x4000015ec
2019-03-26 17:58:08,317 - INFO [WorkerReceiver[myid=2]:FastLeaderElection@597] - Notification: 1 (message format version), 2 (n.leader), 0x4000015ec (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x4 (n.peerEpoch) LOOKING (my state)
2019-03-26 17:58:08,391 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.150.210:44640
2019-03-26 17:58:08,393 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
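Since the process is up and port 2181 accepts connections while clients still get "ZooKeeperServer not running", it may help to ask the server itself whether it is serving. Below is a minimal sketch that sends one of ZooKeeper's four-letter commands (ruok, stat) over a plain socket. The function name four_letter_word is made up for this post, the host and port are whatever your node uses, and it assumes the four-letter commands are enabled on the server.

```python
import socket


def four_letter_word(host, port, cmd="stat", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply.

    Hypothetical helper for illustration. The server is expected to close
    the connection after answering, so we read until end of stream.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, four_letter_word("testnn2.local", 2181, "stat") should return the server mode and client list on a healthy node, and on a peer that is still in LOOKING state it typically returns a message saying the instance is not currently serving requests.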
Also, after starting it with the start-foreground option, we are suddenly no longer able to start ZooKeeper, even as root.
Our whole cluster is down.
Can anyone please help us understand why we cannot start the ZooKeeper service?