Created on 01-30-2017 08:58 AM - edited 08-18-2019 05:52 AM
Hi, I have been experiencing some issues with my standalone NiFi. It started with errors relating to leader election, and recently it just throws me out of the system completely. See the error message below:
Created 01-30-2017 01:11 PM
The message you are seeing here indicates that your NiFi instance has been set up as a cluster (possibly a 1 node cluster). NiFi cluster configurations require zookeeper in order to handle cluster coordinator elections and cluster state management. If you truly only want a standalone NiFi installation with no dependency on these things, you need to make sure the following property in your nifi.properties file is set to false:
nifi.cluster.is.node=false
Even as a single node NiFi cluster, if zookeeper (internal or external) was set up, the election should complete eventually and the node will then become accessible.
How long it takes for the cluster coordinator to get elected is controlled by the following lines in your nifi.properties file:
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
The above shows the defaults. If max.candidates is left blank (it is normally set to the number of nodes in your NiFi cluster), the election process will take the full 5 minutes to complete before the UI becomes available. If max.candidates is set, the election completes as soon as that many nodes have checked in or 5 minutes have passed, whichever occurs first.
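To make the rule above concrete, here is a rough Python model of it. This is only a sketch of the behavior as described in this post, not NiFi's actual code; the function name and its parameters are made up for illustration.

```python
def flow_election_done(votes_received, elapsed_seconds,
                       max_candidates=None, max_wait_seconds=300):
    """Model of the election rule described above (hypothetical helper).

    max_candidates=None stands for a blank
    nifi.cluster.flow.election.max.candidates; max_wait_seconds=300
    stands for the default max.wait.time of 5 mins.
    """
    # The wait time always ends the election, whatever the vote count.
    if elapsed_seconds >= max_wait_seconds:
        return True
    # With max.candidates set, enough check-ins end the election early.
    if max_candidates is not None and votes_received >= max_candidates:
        return True
    return False


# Single node, max.candidates=1: done as soon as the node checks in.
print(flow_election_done(votes_received=1, elapsed_seconds=10, max_candidates=1))
# Single node, max.candidates blank: still waiting after 10 seconds.
print(flow_election_done(votes_received=1, elapsed_seconds=10))
```

Under this model, a 1 node cluster with max.candidates=1 should not sit in election for long, so a wait of hours points at something other than these two properties.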
Thanks,
Matt
Created 01-31-2017 09:05 AM
thanks @Matt
The NiFi cluster is a single node and it has the parameters nifi.cluster.flow.election.max.candidates=1 and nifi.cluster.flow.election.max.wait.time=5mins. However, my issue is the time it takes for the election. We have had to wait a whole weekend, and these days it only works after a full restart.
Is there any other way to speed up the election process?
Created 01-31-2017 11:23 AM
Just got the error "An unexpected error has occurred. Action was performed, but no nodes are connected." while working with a flow. The flow simply takes log files and puts them in HDFS. NiFi went dead at the point of putting the files in HDFS. The file sizes are less than 150MB and there are only 24 of them.
I have restarted the cluster and it's back up...
2017-01-31 12:01:42,731 ERROR [Leader Election Notification Thread-4] o.a.c.f.recipes.leader.LeaderSelector The leader threw an exception
java.lang.InterruptedException: null
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) ~[na:1.8.0_77]
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) ~[na:1.8.0_77]
    at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:325) ~[curator-client-2.11.0.jar:na]
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:106) ~[curator-client-2.11.0.jar:na]
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:240) ~[curator-framework-2.11.0.jar:na]
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:225) ~[curator-framework-2.11.0.jar:na]
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35) ~[curator-framework-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:339) ~[curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:123) ~[curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154) ~[curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:425) [curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:441) [curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64) [curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245) [curator-recipes-2.11.0.jar:na]
    at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239) [curator-recipes-2.11.0.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
Created 01-31-2017 01:47 PM
There is obviously something else going on within your system that is affecting leader election. When you start your NiFi, do you see a leader election/cluster coordinator countdown timer running?
Is your NiFi having trouble talking to your zookeeper? The stack trace was interrupted inside Curator's internalBlockUntilConnectedOrTimedOut, so it looks like you are having timeout issues talking to your zookeeper.
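One quick way to check whether zookeeper is reachable from the NiFi host is to send it the "ruok" four-letter command and look for an "imok" reply. The sketch below is just that, a sketch: the helper names are made up, the host and port should come from the nifi.zookeeper.connect.string property in your nifi.properties, and on newer zookeeper releases (3.5 and later) the command has to be allowed through 4lw.commands.whitelist before it will answer.

```python
import socket


def parse_connect_string(connect_string, default_port=2181):
    """Split a 'host1:2181,host2:2181' style string into (host, port) pairs."""
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.partition(":")
        servers.append((host, int(port) if sep else default_port))
    return servers


def zk_is_ok(host, port, timeout=5.0):
    """Return True if the server answers 'ruok' with 'imok' within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    return reply == b"imok"
```

For example, `[zk_is_ok(h, p) for h, p in parse_connect_string("localhost:2181")]` (localhost:2181 is only a placeholder) gives one True/False per server. A False or a slow answer here would be consistent with the timeouts in the stack trace.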
I still don't understand why you are running your NiFi as a 1 node cluster if all you want is a single standalone instance of NiFi. A NiFi configured as a standalone instance does not need zookeeper and also does not perform election of cluster coordinator or primary node. Setting the following property in your nifi.properties and restarting will make your NiFi a truly standalone instance:
nifi.cluster.is.node=false
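If you are not sure which mode an instance is actually configured for, a small script can read nifi.properties and report it. This is a minimal sketch with made-up helper names; it handles only plain key=value lines, and it assumes a missing nifi.cluster.is.node means false.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def describe_mode(props):
    """Return 'cluster' or 'standalone' based on nifi.cluster.is.node."""
    # Assumption: an absent property is treated as false (standalone).
    is_node = props.get("nifi.cluster.is.node", "false").lower() == "true"
    return "cluster" if is_node else "standalone"


sample = """
# cluster node properties
nifi.cluster.is.node=true
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
"""
print(describe_mode(parse_properties(sample)))
```

To run it against a real install, read the file with `open(path).read()` and pass the text to `parse_properties`; the path depends on where your NiFi conf directory lives.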
Matt