
Error Nifi Cluster Setup: Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered

Contributor

I am trying to set up a NiFi cluster.

All of the hosts can ping one another.

I have tried everything I could find on the web, but with no success.

Below is my configuration for each node, followed by the error.

Thanks.

MASTERNODE:

nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=masternode:2181,workernode02:2181,workernode03:2181

nifi.cluster.is.node=true
nifi.cluster.node.address=masternode
nifi.cluster.node.protocol.port=9991
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.firewall.file=
nifi.cluster.protocol.is.secure=false
nifi.cluster.load.balance.host=masternode
nifi.cluster.load.balance.port=6342

nifi.remote.input.host=masternode
nifi.remote.input.secure=false
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec

nifi.web.http.host=masternode
nifi.web.http.port=9443
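
(Side note on my setup: since the embedded ZooKeeper is enabled, nifi.properties also has to point at the zookeeper.properties file. I am assuming the default entry here:)

nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties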

------------------------------------------------------

WORKERNODE 02:

nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=masternode:2181,workernode02:2181,workernode03:2181
nifi.cluster.is.node=true
nifi.cluster.node.address=workernode02
nifi.cluster.node.protocol.port=9991
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=5 sec
nifi.cluster.node.read.timeout=5 sec
nifi.cluster.firewall.file=
nifi.cluster.protocol.is.secure=false
nifi.cluster.load.balance.host=workernode02
nifi.cluster.load.balance.port=6342
nifi.remote.input.host=workernode02
nifi.remote.input.secure=false
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.web.http.host=workernode02
nifi.web.http.port=9443

-----------------------------------------

WORKERNODE 03: same as above, except for the host name.

 

------------------------------------------------------------

state-management.xml:

<cluster-provider>
    <id>zk-provider</id>
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <property name="Connect String">masternode:2181,workernode02:2181,workernode03:2181</property>
    <property name="Root Node">/nifi</property>
    <property name="Session Timeout">10 seconds</property>
    <property name="Access Control">Open</property>
</cluster-provider>
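
(For reference, nifi.properties is wired to this provider with the default entries below; I am assuming the stock file layout:)

nifi.state.management.configuration.file=./conf/state-management.xml
nifi.state.management.provider.cluster=zk-provider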

 

-------------------------------------------------------

zookeeper.properties:

server.1=masternode:2888:3888;2181
server.2=workernode02:2888:3888;2181
server.3=workernode03:2888:3888;2181

 

------------------------------------------

 

ERROR:

 

2022-06-16 11:50:03,954 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2022-06-16 11:50:03,954 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2022-06-16 11:50:05,119 WARN [Heartbeat Monitor Thread-1] o.a.n.c.l.e.CuratorLeaderElectionManager Unable to determine leader for role 'Cluster Coordinator'; returning null
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /nifi/leaders/Cluster Coordinator
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2707)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:242)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:231)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:228)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:219)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:41)
at org.apache.curator.framework.recipes.locks.LockInternals.getSortedChildren(LockInternals.java:154)
at org.apache.curator.framework.recipes.locks.LockInternals.getParticipantNodes(LockInternals.java:134)
at org.apache.curator.framework.recipes.locks.InterProcessMutex.getParticipantNodes(InterProcessMutex.java:170)
at org.apache.curator.framework.recipes.leader.LeaderSelector.getLeader(LeaderSelector.java:337)
at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.getLeader(CuratorLeaderElectionManager.java:281)
at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.verifyLeader(CuratorLeaderElectionManager.java:571)
at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener.isLeader(CuratorLeaderElectionManager.java:525)
at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$LeaderRole.isLeader(CuratorLeaderElectionManager.java:466)
at org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager.isLeader(CuratorLeaderElectionManager.java:262)
at org.apache.nifi.cluster.coordination.node.NodeClusterCoordinator.isActiveClusterCoordinator(NodeClusterCoordinator.java:824)
at org.apache.nifi.cluster.coordination.heartbeat.AbstractHeartbeatMonitor.monitorHeartbeats(AbstractHeartbeatMonitor.java:132)
at org.apache.nifi.cluster.coordination.heartbeat.AbstractHeartbeatMonitor$1.run(AbstractHeartbeatMonitor.java:84)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
2022-06-16 11:50:11,732 INFO [Cleanup Archive for default] o.a.n.c.repository.FileSystemRepository Successfully deleted 0 files (0 bytes) from archive
2022-06-16 11:50:11,733 INFO [Cleanup Archive for default] o.a.n.c.repository.FileSystemRepository Archive cleanup completed for container default; will now allow writing to this container. Bytes used = 10.06 GB, bytes free = 7.63 GB, capacity = 17.7 GB

1 ACCEPTED SOLUTION

Contributor

Thanks to everyone, especially @SAMSAL.

It turned out the firewall was blocking communication between the nodes on these ports.

I tried adding the ports to the firewall, but without much success; when I tried to reach the NiFi GUI I ran into this error: java.net.NoRouteToHostException: No route to host (Host unreachable).

So I disabled the firewall and everything worked perfectly. (A fuller port list that might work without disabling firewalld is sketched after the two methods below.)

 

FIRST METHOD: NO SUCCESS

firewall-cmd --zone=public --permanent --add-port 9991/tcp
firewall-cmd --zone=public --permanent --add-port 2888/tcp
firewall-cmd --zone=public --permanent --add-port 3888/tcp
firewall-cmd --zone=public --permanent --add-port 2181/tcp

firewall-cmd --reload

 

SECOND METHOD: SUCCESS

systemctl stop firewalld
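
For anyone who wants to keep firewalld running instead: based on my nifi.properties above, I believe all of these ports would need to be opened on every node (cluster protocol 9991, load balance 6342, site-to-site 10000, web UI 9443, plus the ZooKeeper ports), but I have not re-tested this myself:

firewall-cmd --zone=public --permanent --add-port=9991/tcp
firewall-cmd --zone=public --permanent --add-port=6342/tcp
firewall-cmd --zone=public --permanent --add-port=10000/tcp
firewall-cmd --zone=public --permanent --add-port=9443/tcp
firewall-cmd --zone=public --permanent --add-port=2181/tcp
firewall-cmd --zone=public --permanent --add-port=2888/tcp
firewall-cmd --zone=public --permanent --add-port=3888/tcp
firewall-cmd --reload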


20 REPLIES

Super Guru

It's hard to tell what is causing the error; the only thing I can suggest is to verify that you followed the setup steps exactly. One thing you did not mention is creating the myid file under the ./state/zookeeper folder on each node, which defines that node's number (as described in the URL I provided before). Did you do that?
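
For example, assuming the default dataDir of ./state/zookeeper under the NiFi installation directory and the server numbers from your zookeeper.properties, something roughly like this on each node:

# on masternode (server.1)
mkdir -p ./state/zookeeper && echo 1 > ./state/zookeeper/myid

# on workernode02 (server.2)
mkdir -p ./state/zookeeper && echo 2 > ./state/zookeeper/myid

# on workernode03 (server.3)
mkdir -p ./state/zookeeper && echo 3 > ./state/zookeeper/myid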

Contributor

I did that correctly.

I have cross-checked it many times.

I made sure the correct digit is inside the myid file on each node.

Super Guru

Just to be on the safe side, can you define the nodes in authorizers.xml as follows (see the snippet after the list)? Keep in mind that anything you define in the NiFi config files is case sensitive.

 

CN=masternode, OU=NIFI

CN=workernode02, OU=NIFI

CN=workernode03, OU=NIFI
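
In the stock authorizers.xml template these usually go into the file-user-group-provider and file-access-policy-provider sections, roughly like this (property names taken from the default template, so treat this as a sketch and adjust to your file):

<property name="Initial User Identity 1">CN=masternode, OU=NIFI</property>
<property name="Initial User Identity 2">CN=workernode02, OU=NIFI</property>
<property name="Initial User Identity 3">CN=workernode03, OU=NIFI</property>

<property name="Node Identity 1">CN=masternode, OU=NIFI</property>
<property name="Node Identity 2">CN=workernode02, OU=NIFI</property>
<property name="Node Identity 3">CN=workernode03, OU=NIFI</property>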

Contributor

Also, I forgot to mention that I am running three (3) separate CentOS 8.5 instances on VMware, each configured with a static IP.

All of them can ping one another, and I have also disabled SELinux.
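
(Roughly like this, in case it matters:)

sudo setenforce 0        # takes effect immediately, lasts until reboot
# plus SELINUX=disabled in /etc/selinux/config for a permanent change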

Super Guru

@MattWho, @araujo, can you guys help?

Contributor

Thank you, @SAMSAL, for calling for help on my behalf.

Super Guru

Sure, I hope they can help. By the way, have you seen my comments above about defining the nodes in authorizers.xml in the following format (case sensitive)?

 

CN=masternode, OU=NIFI

CN=workernode02, OU=NIFI

CN=workernode03, OU=NIFI

 

Contributor

Sir, I have made those changes accordingly.

Still no success.

Thanks.

Super Guru

By the way, what is inside the users.xml file?

Contributor

I have no such file inside my conf folder.

I am running an unsecured instance.

I am on NiFi 1.16.2 with JDK 11.0.15.

 

Thank you.