Created 06-05-2018 04:33 AM
Hi Everyone!
I keep getting this error: ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
My editable config files are in the per-node directories of the Google Drive folder below (also attached). This is a 3-node cluster:
https://drive.google.com/drive/folders/11xM-sz8mUvpaiOOS4aiZ94TQGHHzConF?usp=sharing
I made sure FirewallD was off, all ports used are free, and I followed these guides:
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.1/bk_administration/content/clustering.html
AND
Any help would be greatly appreciated!!
Here are my configs in plain text for reference:
########## 10.0.0.89 Server ##########

#### ZooKeeper.properties File ####
clientPort=24489
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30
server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####
# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.89
nifi.cluster.node.protocol.port=24489
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi

########## 10.0.0.227 Server ##########

#### ZooKeeper.properties File ####
clientPort=24427
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30
server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####
# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.227
nifi.cluster.node.protocol.port=24427
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi

########## 10.0.0.228 Server ##########

#### ZooKeeper.properties File ####
clientPort=24428
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30
server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####
# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false
# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.228
nifi.cluster.node.protocol.port=24428
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi
Created 06-05-2018 06:14 PM
Note: We do not recommend using the embedded ZK in a production environment.
Aside from that, connection issues can be expected during any NiFi shutdown/restart because the embedded ZK is shut down as well. Also, the default ZK connection and session timeouts are very aggressive for anything more than a basic setup in an ideal environment.
-
I recommend changing those to at least 30 secs each.
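As a rough illustration of that change, here is a minimal Python sketch that rewrites the two timeout properties in the text of a nifi.properties file. The property names come from the configs posted above; the helper name `set_zk_timeouts` is made up for this example, and editing the file by hand works just as well.

```python
# Property names are taken from the nifi.properties excerpt in the question.
ZK_TIMEOUT_KEYS = (
    "nifi.zookeeper.connect.timeout",
    "nifi.zookeeper.session.timeout",
)


def set_zk_timeouts(properties_text, value="30 secs"):
    """Return properties_text with both ZK timeout properties set to value.

    Hypothetical helper for illustration; all other lines pass through unchanged.
    """
    updated = []
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in ZK_TIMEOUT_KEYS:
            line = f"{key}={value}"
        updated.append(line)
    return "\n".join(updated)


example = (
    "nifi.zookeeper.connect.timeout=3 secs\n"
    "nifi.zookeeper.session.timeout=3 secs\n"
    "nifi.zookeeper.root.node=/nifi"
)
print(set_zk_timeouts(example))
```

The same values need to be applied on every node, followed by a restart.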
-
I also see that each of your embedded ZK servers is running on a different port (24489, 24427, and 24428). Why? It is unusual, but it should not be an issue.
Also confirm you created the unique "myid" files in the "./state/zookeeper" directory on each ZK server.
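For reference, the "myid" file is a plain text file that holds only the number N from that host's server.N line. Below is a small Python sketch for creating and checking it. The id mapping is copied from the server.N entries posted above; the helper names `write_myid` and `read_myid` are made up for this example, and the demo writes to a temporary directory rather than the real "./state/zookeeper".

```python
import tempfile
from pathlib import Path

# Mapping copied from the server.N entries in the posted zookeeper.properties.
SERVER_IDS = {"10.0.0.89": 1, "10.0.0.227": 2, "10.0.0.228": 3}


def write_myid(data_dir, server_id):
    """Create data_dir/myid containing only the numeric server id (hypothetical helper)."""
    path = Path(data_dir) / "myid"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"{server_id}\n")
    return path


def read_myid(data_dir):
    """Return the id stored in data_dir/myid, or None if the file is missing."""
    path = Path(data_dir) / "myid"
    if not path.is_file():
        return None
    return int(path.read_text().strip())


# Demo in a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    write_myid(tmp, SERVER_IDS["10.0.0.227"])
    print(read_myid(tmp))
```

On a real node the call would look like write_myid("./state/zookeeper", SERVER_IDS["10.0.0.89"]), run from the NiFi install directory, assuming the relative dataDir is resolved from there.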
-
Of course, any changes to any of NiFi's config files except logback.xml will require a restart for those changes to take effect. Once all nodes are back up and connected to the cluster, check to see if you are still seeing connection issues with ZK.
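One way to do that check is sketched below in Python, assuming the connect string and server.N hosts are exactly as posted above. It compares the hosts in nifi.zookeeper.connect.string against the ensemble hosts and offers a plain TCP probe for each endpoint. The function names are made up for this example. Note that the posted connect string lists 10.0.0.28 while the server.3 entry is 10.0.0.228; if that is not intentional, this kind of comparison would flag it.

```python
import socket

# Values copied from the configs posted in the question.
CONNECT_STRING = "10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428"
ENSEMBLE_HOSTS = {"10.0.0.89", "10.0.0.227", "10.0.0.228"}


def parse_connect_string(connect_string):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    endpoints = []
    for entry in connect_string.split(","):
        host, port = entry.strip().rsplit(":", 1)
        endpoints.append((host, int(port)))
    return endpoints


def hosts_missing_from_ensemble(connect_string, ensemble_hosts):
    """Return connect-string hosts that are not among the server.N hosts."""
    return [
        host
        for host, _ in parse_connect_string(connect_string)
        if host not in ensemble_hosts
    ]


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(hosts_missing_from_ensemble(CONNECT_STRING, ENSEMBLE_HOSTS))
# To probe the ports from a cluster node (needs network access):
# for host, port in parse_connect_string(CONNECT_STRING):
#     print(host, port, is_reachable(host, port))
```

A successful TCP connection only shows that something is listening on the port; it does not prove the ZK ensemble has a quorum.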
-
Thank you,
Matt
Created 04-13-2020 09:18 AM
@Aminsh
I am not sure where your response fits into this thread.
Are you asking a new question here?
I recommend you start a new thread if that is the case.
Thanks,
Matt