Member since: 07-05-2019
Posts: 6
Kudos Received: 0
Solutions: 0
08-13-2019
03:55 PM
Wow thanks @Geoffrey Shelton Okot and sorry for the late response. Changing the maximum memory value did the job. Now we're checking that it stays stable. So far so good!
07-31-2019
07:21 PM
After checking today's log file I found this. Will google it to see what it means:

2019-07-31 07:57:58,187 - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /usr/hdp/current/zookeeper-server/conf/zoo.cfg
2019-07-31 07:57:58,191 - WARN [main:QuorumPeerConfig@291] - No server failure will be tolerated. You need at least 3 servers.
2019-07-31 07:57:58,191 - INFO [main:QuorumPeerConfig@338] - Defaulting to majority quorums
2019-07-31 07:57:58,196 - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 30
2019-07-31 07:57:58,197 - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 24
2019-07-31 07:57:58,198 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2019-07-31 07:57:58,210 - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2019-07-31 07:57:58,219 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2019-07-31 07:57:58,223 - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@992] - tickTime set to 2000
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1012] - minSessionTimeout set to -1
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1023] - maxSessionTimeout set to -1
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1038] - initLimit set to 10
2019-07-31 07:57:58,245 - INFO [main:FileSnap@83] - Reading snapshot /hadoop/zookeeper/version-2/snapshot.8600bc40ab
2019-07-31 07:58:41,800 - ERROR [main:NIOServerCnxnFactory$1@44] - Thread Thread[main,5,main] died
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:97)
    at org.apache.zookeeper.server.DataNode.deserialize(DataNode.java:158)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
    at org.apache.zookeeper.server.DataTree.deserialize(DataTree.java:1194)
    at org.apache.zookeeper.server.util.SerializeUtils.deserializeSnapshot(SerializeUtils.java:127)
    at org.apache.zookeeper.server.persistence.FileSnap.deserialize(FileSnap.java:127)
    at org.apache.zookeeper.server.persistence.FileSnap.deserialize(FileSnap.java:87)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:483)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:473)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
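The entry that matters in this log is the ERROR at 07:58:41: the JVM ran out of heap (GC overhead limit exceeded) while loading the snapshot at startup. When a log gets pasted as one long line, a small script can pull out just the WARN/ERROR entries. This is only a rough sketch: the regex assumes the timestamp/level layout seen in this log, and `split_entries` and `problems` are made-up helper names, not part of ZooKeeper or Ambari.

```python
import re

# Prefix of each entry as it appears in this log, e.g.
# "2019-07-31 07:58:41,800 - ERROR [main:...] - message"
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(DEBUG|INFO|WARN|ERROR|FATAL)\s+\[[^\]]*\] - "
)


def split_entries(text):
    """Split a log dump (even a single-line paste) into (timestamp, level, message)."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry prefix, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((m.group(1), m.group(2), text[m.end():end].strip()))
    return entries


def problems(text):
    """Keep only the entries logged at WARN level or above."""
    return [e for e in split_entries(text) if e[1] in ("WARN", "ERROR", "FATAL")]


# Three entries copied from the log in this post, joined on one line.
sample = (
    "2019-07-31 07:57:58,187 - INFO [main:QuorumPeerConfig@103] - Reading configuration "
    "from: /usr/hdp/current/zookeeper-server/conf/zoo.cfg "
    "2019-07-31 07:57:58,191 - WARN [main:QuorumPeerConfig@291] - No server failure will "
    "be tolerated. You need at least 3 servers. "
    "2019-07-31 07:58:41,800 - ERROR [main:NIOServerCnxnFactory$1@44] - Thread "
    "Thread[main,5,main] died java.lang.OutOfMemoryError: GC overhead limit exceeded"
)

for timestamp, level, message in problems(sample):
    print(timestamp, level, message)
```

To run it against a real file, read the file contents into a string and pass that to `problems` instead of `sample`.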
07-31-2019
07:21 PM
Thanks @Geoffrey Shelton Okot for your help! I did restart all services manually, but it seems that ZK still fails. As the screenshot I posted shows, one of my ZK servers is always down. Since ZK needs to be up and running before anything else, I'd like to fix this issue first. I checked the error message from Ambari and it says:

Connection failed: [Errno 111] Connection refused to ip_zookeeper_server2:2181

What else can I do to fix this?

Update: this is what the log file shows on that machine:

2019-07-31 07:57:58,187 - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /usr/hdp/current/zookeeper-server/conf/zoo.cfg
2019-07-31 07:57:58,191 - WARN [main:QuorumPeerConfig@291] - No server failure will be tolerated. You need at least 3 servers.
2019-07-31 07:57:58,191 - INFO [main:QuorumPeerConfig@338] - Defaulting to majority quorums
2019-07-31 07:57:58,196 - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 30
2019-07-31 07:57:58,197 - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 24
2019-07-31 07:57:58,198 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2019-07-31 07:57:58,210 - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2019-07-31 07:57:58,219 - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2019-07-31 07:57:58,223 - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@992] - tickTime set to 2000
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1012] - minSessionTimeout set to -1
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1023] - maxSessionTimeout set to -1
2019-07-31 07:57:58,233 - INFO [main:QuorumPeer@1038] - initLimit set to 10
2019-07-31 07:57:58,245 - INFO [main:FileSnap@83] - Reading snapshot /hadoop/zookeeper/version-2/snapshot.8600bc40ab
2019-07-31 07:58:41,800 - ERROR [main:NIOServerCnxnFactory$1@44] - Thread Thread[main,5,main] died
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:97)
    at org.apache.zookeeper.server.DataNode.deserialize(DataNode.java:158)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
    at org.apache.zookeeper.server.DataTree.deserialize(DataTree.java:1194)
    at org.apache.zookeeper.server.util.SerializeUtils.deserializeSnapshot(SerializeUtils.java:127)
    at org.apache.zookeeper.server.persistence.FileSnap.deserialize(FileSnap.java:127)
    at org.apache.zookeeper.server.persistence.FileSnap.deserialize(FileSnap.java:87)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:483)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:473)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
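A "Connection refused" on port 2181 usually means no process is accepting TCP connections there, which fits a server that died during startup. The check can be repeated from any machine with a few lines of Python. This is a minimal sketch using only the standard `socket` module; `port_open` is a made-up helper name and the host in the demo is a placeholder to replace with the real ZooKeeper host.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Try a plain TCP connect; return (True, None) or (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and name-resolution failures.
        return False, str(exc)


if __name__ == "__main__":
    # Placeholder host: replace with the address of the failing ZooKeeper server.
    ok, reason = port_open("127.0.0.1", 2181)
    print("reachable" if ok else "not reachable: " + reason)
```

A successful connect only shows that something is listening on the port; it does not prove the server has joined the quorum, so the server log is still the place to look for the actual cause.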
07-31-2019
05:32 PM
Thanks again @Geoffrey Shelton Okot. I have 4 machines in the cluster (1 master, 3 slaves). The ZooKeeper server is installed on the master (works fine) and on one slave (fails). This is the log I get from the slave: zookeeperLOG.txt
07-30-2019
08:05 PM
Good morning guys, thanks in advance for your help! I have a project that fails. I'm trying to restart all the services manually but haven't been able to. I have a few questions and I'd really appreciate it if you could give me some guidance, because at this moment I'm kind of stuck.
1. How do I check which services need to be up and running before restarting the next one? Is there any place where I can see the dependencies?
2. Do I need 2 ZooKeeper servers up and running? The first one runs on localhost but the second one runs on a different machine. If I actually need them both, how can I check what went wrong on the second one?
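On question 2, the arithmetic behind majority quorums may help: an ensemble stays available only while more than half of its configured servers are up. The sketch below just works through that rule. The function names are made up for illustration, and it assumes the default majority quorum with no observers or weighted groups.

```python
def quorum_size(servers):
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers can be down while a majority is still up."""
    return servers - quorum_size(servers)


for n in (1, 2, 3, 5):
    print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

With 2 configured servers the quorum is 2, so both have to be running and no failure is tolerated; 3 servers is the smallest ensemble that survives losing one.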
Labels: Apache Ambari