Support Questions

helk · ‎06-19-2024

Hi everyone,

We have a NiFi cluster with 3 nodes that was functioning fine until we encountered the following error. The cluster uses an embedded ZooKeeper for coordination. The error logs indicate issues with connection loss and leadership. Here are the relevant log entries:

2024-06-19 16:25:05,335 WARN [Process Cluster Protocol Request-25] o.a.n.c.p.impl.SocketProtocolListener Failed processing protocol message from nifi01 due to org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling protocol message in response to message type: HEARTBEAT due to java.net.SocketException: Relais brisé (pipe) (Write failed)
org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling protocol message in response to message type: HEARTBEAT due to java.net.SocketException: Relais brisé (pipe) (Write failed)
	at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.dispatchRequest(SocketProtocolListener.java:186)
	at org.apache.nifi.io.socket.SocketListener$2$1.run(SocketListener.java:131)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

The cluster was operating normally before this issue arose. Now, it appears to be having trouble with leadership roles and communication between nodes.

Questions:

What could be causing this connection problem?
How can we troubleshoot and resolve this problem to restore normal cluster operations?

helk · ‎06-21-2024

In my case the problem was related to storage issues.
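
For anyone hitting the same symptoms, a quick way to rule storage in or out is to check free space on the filesystems that hold the NiFi repositories and the embedded ZooKeeper state. The sketch below is only an illustration, not the exact check used here: the directory paths and the 10% threshold are assumptions, and the helper name low_space_paths is made up for this example. Substitute the directories configured in your own nifi.properties.

```python
import os
import shutil


def low_space_paths(paths, min_free_fraction=0.10):
    """Return (path, free_fraction) for every path whose filesystem
    has a smaller free fraction than min_free_fraction."""
    flagged = []
    for path in paths:
        usage = shutil.disk_usage(path)
        # Guard against pseudo-filesystems that report a total size of 0.
        free_fraction = usage.free / usage.total if usage.total else 0.0
        if free_fraction < min_free_fraction:
            flagged.append((path, free_fraction))
    return flagged


if __name__ == "__main__":
    # Hypothetical locations (assumption): adjust to your installation.
    candidate_dirs = [
        "/opt/nifi/content_repository",
        "/opt/nifi/flowfile_repository",
        "/opt/nifi/provenance_repository",
        "/opt/nifi/state/zookeeper",
    ]
    # Skip directories that do not exist on this node.
    existing = [d for d in candidate_dirs if os.path.exists(d)]
    for path, fraction in low_space_paths(existing):
        print(f"LOW SPACE: {path} has {fraction:.1%} free")
```

Run it on each of the three nodes. A node that reports low space on any of these directories is a good first candidate to investigate.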

Cloudera Community

NiFi Cluster Error: Connection Loss and Leadership Issues