Support Questions

johnmteabo · ‎06-05-2018

Hi Everyone!

I keep getting the following error: ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss

My editable config files are in the corresponding directories of the Google Drive folder linked below, and are also attached. This is a 3-node cluster:

https://drive.google.com/drive/folders/11xM-sz8mUvpaiOOS4aiZ94TQGHHzConF?usp=sharing

I made sure FirewallD was off, all ports used are free, and I followed these guides:

https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.1/bk_administration/content/clustering.html

AND

https://community.hortonworks.com/articles/135820/configuring-an-external-zookeeper-to-work-with-apa...

Any help would be greatly appreciated!!

Here are my configs in plain text for reference:

#################### 10.0.0.89 Server ####################

#### ZooKeeper.properties File ####

clientPort=24489
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30

server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####
 
# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false


# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.89
nifi.cluster.node.protocol.port=24489
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3


# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi








 
 
#################### 10.0.0.227 Server ####################

#### ZooKeeper.properties File ####

clientPort=24427
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30

server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####


# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false


# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.227
nifi.cluster.node.protocol.port=24427
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3


# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi
 
 
 
 
 
 
#################### 10.0.0.228 Server ####################

#### ZooKeeper.properties File ####

clientPort=24428
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=./state/zookeeper
autopurge.snapRetainCount=30

server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

#### nifi.properties cluster section ####

# cluster common properties (all nodes must have same values) #
nifi.cluster.protocol.heartbeat.interval=5 sec
nifi.cluster.protocol.is.secure=false


# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=10.0.0.228
nifi.cluster.node.protocol.port=24428
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3


# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.28:24428
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi

MattWho · ‎06-05-2018

@John T

Note: We do not recommend using the embedded ZK in a production environment.

Aside from that, connection issues can be expected during any NiFi shutdown/restart because the embedded ZK is shut down as well. Also, the default ZK connection and session timeouts are very aggressive for anything more than a basic setup in an ideal environment.
-
I recommend changing those to at least 30 secs each.
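
As a sketch of that change, the two timeout properties from the posted nifi.properties would look like the lines below on every node. The property names are taken from the configs in the original post; 30 secs is simply the minimum suggested above, not a tuned value.

```properties
# zookeeper properties, used for cluster management #
# Raised from the posted 3 secs; 30 secs is the suggested minimum,
# adjust upward if the environment needs it.
nifi.zookeeper.connect.timeout=30 secs
nifi.zookeeper.session.timeout=30 secs
```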

-

I also see that each of your embedded ZK servers is running on a different port (24489, 24427, and 24428). Why? That is unusual, but should not be an issue.
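
With a different port per server, each entry in the ZooKeeper connect string has to pair a host with that host's own clientPort. A minimal sketch follows, assuming the third node is 10.0.0.228 as in the server.3 line. The posted connect string shows 10.0.0.28 for that entry, which may only be a typo in the post but is worth checking on the actual nodes.

```properties
# nifi.properties, identical on all three nodes.
# Each entry is <host>:<clientPort of the embedded ZK on that host>.
# Assumption: the third host is 10.0.0.228 (per server.3), not 10.0.0.28.
nifi.zookeeper.connect.string=10.0.0.89:24489,10.0.0.227:24427,10.0.0.228:24428
```

It may also be worth confirming that nothing else on a host is configured for the same port as its clientPort. In the posted configs, nifi.cluster.node.protocol.port uses the same number as clientPort on each node, and two listeners generally cannot bind the same port.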

Also confirm you created the unique "myid" files in the "./state/zookeeper" directory on each ZK server.
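
To illustrate what is meant here: each myid file holds only the number N from that host's own server.N line. Based on the server entries in the original post, the mapping would presumably be as follows (the file location assumes the posted dataDir).

```properties
# zookeeper.properties (same three lines on every node, from the original post)
server.1=10.0.0.89:2888:3888
server.2=10.0.0.227:2888:3888
server.3=10.0.0.228:2888:3888

# Expected single-line content of ./state/zookeeper/myid on each host:
#   10.0.0.89   -> 1
#   10.0.0.227  -> 2
#   10.0.0.228  -> 3
```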

-

Of course, any changes to any of NiFi's config files except logback.xml will require a restart for those changes to take effect. Once all nodes are back up and connected to the cluster, check to see if you are still seeing connection issues with ZK.

-

Thank you,

Matt

MattWho · ‎04-13-2020

@Aminsh

I am not sure where your response fits into this thread.

Are you asking a new question here?
I recommend you start a new thread if that is the case.

Thanks,

Matt

Cloudera Community

NiFi Clustering Issue ConnectionLoss Error