nifi 1.11.4 node down

Good morning to the community. I would like to raise a problem I am having with a NiFi cluster that I manage, version 1.11.4. For about a month now, a NiFi node has gone down every night. The failures began while we were developing some contingency processes. Although I know which processes are involved, I need to understand how to fix the problem both on the process side and on the NiFi side. Attached are the NiFi app_user.log log and the ZooKeeper log, in case you can help me.

log user_app.log

Bern_1-1761311368851.png

log zookeeper

Bern_0-1761311303130.png

I look forward to any support. Greetings to all.



1 REPLY

@Bern 

From what has been shared, I can't tell you much more than the fact that you are having ZooKeeper (ZK) issues. Without a ZK quorum, NiFi can't maintain the cluster, and thus the UI is not available. I suggest looking more closely at the NiFi app.log and bootstrap logs for other ERRORs or WARNs that may have been logged before, or at the same time as, the start of your ZK communication issue. You may also want to look at the health of your network.
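
As a rough illustration of that log review, the sketch below collects the ERROR and WARN lines logged in a window leading up to a known incident time. It assumes each log entry starts with a `YYYY-MM-DD HH:MM:SS,mmm LEVEL` prefix, as NiFi's default logback pattern produces; if your logback.xml has been customized, the regular expression needs adjusting. The function name `problems_before` and the 30-minute default window are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumed entry prefix: "2025-10-24 02:13:45,123 ERROR [thread] logger message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} (ERROR|WARN)\s")


def problems_before(lines, incident_time, window_minutes=30):
    """Return ERROR/WARN lines stamped within the window ending at incident_time."""
    start = incident_time - timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # other levels, stack-trace continuation lines, etc.
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if start <= stamp <= incident_time:
            hits.append(line.rstrip("\n"))
    return hits
```

It could be used by opening the app log (the path depends on your install) and passing the file object as `lines`, together with the time of the first ZK error as `incident_time`.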

You mention that NiFi goes down nightly and that this started when you were building some "contingency processes".  I am not clear on what this means relative to your NiFi. Do you mean you were adding additional flows to your NiFi canvas when you started having issues?

What else are you seeing in the NiFi logs prior to the ZK issue starting?
What about CPU load?
What about memory: any Out Of Memory (OOM) errors, continuous garbage collection, or long garbage collection events?
Network issues resolving hostnames of the ZK nodes?
Issues telnetting to the ZK nodes on the configured ZK ports (2888 and 3888)?
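
The last two checks (hostname resolution and port reachability) can also be scripted instead of done by hand with telnet. The following is a minimal sketch using only Python's standard `socket` module; the function names `check_zk_endpoint` and `check_all` are invented for this example, and the hosts and ports you pass in should be the ones from your own ZK configuration.

```python
import socket


def check_zk_endpoint(host, port, timeout=3.0):
    """Resolve `host`, then try a TCP connection to `port`. Returns (ok, detail)."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"hostname did not resolve: {exc}"
    address = infos[0][4][0]  # first resolved IP address
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"resolved to {address}, port {port} reachable"
    except OSError as exc:  # refused, timed out, unreachable, ...
        return False, f"resolved to {address}, connect failed: {exc}"


def check_all(hosts, ports, timeout=3.0):
    """Check every host/port pair and return one report line per pair."""
    report = []
    for host in hosts:
        for port in ports:
            ok, detail = check_zk_endpoint(host, port, timeout)
            status = "OK  " if ok else "FAIL"
            report.append(f"{status} {host}:{port} ({detail})")
    return report
```

For example, `print("\n".join(check_all(["nifi-node1.example.com"], [2888, 3888])))`, where the host name is a placeholder. Keep in mind that in a ZooKeeper ensemble only the current leader normally listens on the quorum port (2888 by default), so a refused connection on that port against a follower is expected. Running the check from each NiFi node, ideally around the time the node usually drops, is more telling than a single run.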

It also appears you are using the embedded ZK, which I would strongly discourage in any production system. Any resource congestion your NiFi experiences is going to impact ZK quorum and cluster stability. I recommend standing up an external ZK on different hosts from where your NiFi is installed. It is also worth noting that Apache NiFi 1.11.4 was released more than 5 years ago. There have been many improvements, bug fixes, and CVEs addressed since then. Upgrading to the latest 1.28 release (staying on the same major 1.x release branch) would be very advisable. There will be no more Apache NiFi 1.x versions released now that the Apache NiFi 2.x major release branch is available.


Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them.

Thank you,
Matt