Support Questions

cjervis · 07-20-2020

NiFi 1.9

HDF 3.4.1

6-node cluster: 16 cores, 64 GB memory

5 ZooKeeper nodes: 2 cores, 8 GB memory

I restarted the new cluster and the nodes are taking a very long time to join. Unless I bring the nodes up one at a time, the cluster does not form.

nifi.cluster.node.connection.timeout 120 sec

nifi.cluster.node.max.concurrent.requests 400

nifi.cluster.node.protocol.max.threads 100

nifi.cluster.node.protocol.threads 50

nifi.cluster.node.read.timeout 120s

nifi.zookeeper.connect.timeout 60s

nifi.zookeeper.session.timeout 60s

nifi.cluster.load.balance.comms.timeout 60s

nifi.cluster.node.connection.timeout 120s

nifi.cluster.node.read.timeout 120s

memory 40GB
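The list above repeats two of the timeout properties and writes the same duration two ways ("120 sec" and "120s"). As a minimal sketch for double-checking a pasted list like this, the Python below groups the posted values by property name and flags repeats. The helper names (`to_seconds`, `collect`, `report`) and the unit table are assumptions made for this illustration only; they are not part of NiFi and do not reproduce NiFi's own time-period parsing.

```python
import re
from collections import defaultdict

# Timeout settings copied from the post above, one "name value" pair per line.
POSTED = """\
nifi.cluster.node.connection.timeout 120 sec
nifi.cluster.node.read.timeout 120s
nifi.zookeeper.connect.timeout 60s
nifi.zookeeper.session.timeout 60s
nifi.cluster.load.balance.comms.timeout 60s
nifi.cluster.node.connection.timeout 120s
nifi.cluster.node.read.timeout 120s
"""

# Hypothetical unit table for this sketch only; it is not NiFi's parser.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "sec": 1.0, "secs": 1.0, "min": 60.0, "mins": 60.0}


def to_seconds(value):
    """Convert strings such as '120 sec' or '60s' to seconds, or None if unrecognized."""
    match = re.fullmatch(r"(\d+)\s*([a-z]+)", value.strip().lower())
    if match is None or match.group(2) not in UNIT_SECONDS:
        return None
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def collect(text):
    """Map each property name to the list of values (in seconds) it was given."""
    seen = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, value = line.partition(" ")
        seen[name].append(to_seconds(value))
    return dict(seen)


def report(settings):
    """Return one message per property that appears more than once."""
    messages = []
    for name, values in settings.items():
        if len(values) > 1:
            note = "same value" if len(set(values)) == 1 else "CONFLICTING values"
            messages.append(f"{name}: listed {len(values)} times, {note}")
    return messages


if __name__ == "__main__":
    for message in report(collect(POSTED)):
        print(message)
```

For the values as posted, both repeated properties resolve to the same duration, so the repetition looks like a copy-paste duplicate rather than a conflicting setting.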

@MattWho, could you advise why the nodes are not joining the cluster?

venkii · 07-20-2020

@MattWho

2020-07-20 17:15:58,357 INFO [Process Cluster Protocol Request-2] o.a.n.c.c.node.NodeClusterCoordinator Received Connection Request from qa-nifi-node-blue-02.abc.com:9091; responding with my DataFlow
2020-07-20 17:15:58,388 INFO [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for qa-nifi-node-blue-02.abc.com:9091 -- Received first heartbeat from connecting node. Node connected.
2020-07-20 17:16:07,332 INFO [Process Cluster Protocol Request-2] o.a.n.c.c.node.NodeClusterCoordinator Status of qa-nifi-node-blue-02.abc.com:9091 changed from NodeConnectionStatus[nodeId=qa-nifi-node-blue-02.abc.com:9091, state=CONNECTED, updateId=21] to NodeConnectionStatus[nodeId=qa-nifi-node-blue-02.abc.com:9091, state=CONNECTING, updateId=22]
2020-07-20 17:16:09,000 WARN [Process Cluster Protocol Request-2] o.a.n.c.p.impl.SocketProtocolListener Failed processing protocol message from ip-10-175-123-222.us-west-2.compute.internal due to org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling protocol message in response to message type: CONNECTION_REQUEST due to java.net.SocketException: Broken pipe (Write failed)
org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling protocol message in response to message type: CONNECTION_REQUEST due to java.net.SocketException: Broken pipe (Write failed)
at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.dispatchRequest(SocketProtocolListener.java:184)
at org.apache.nifi.cluster.protocol.jaxb.JaxbProtocolContext$1.marshal(JaxbProtocolContext.java:86)
at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.dispatchRequest(SocketProtocolListener.java:182)
2020-07-20 17:16:09,009 INFO [Process Cluster Protocol Request-23] o.a.n.c.p.impl.SocketProtocolListener Finished processing request b20868cb-d4ba-41c4-90ba-07cddda92131 (type=HEARTBEAT, length=3465 bytes) from qa-nifi-node-blue-02.abc.com:9091 in 95 millis
2020-07-20 17:16:41,298 INFO [Process Cluster Protocol Request-24] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 6766e1e4-5181-48aa-9d05-e9c93617afcf (type=CLUSTER_WORKLOAD_REQUEST, length=85 bytes) from ip-10-175-123-222.us-west-2.compute.internal in 133 millis
2020-07-20 17:16:43,489 INFO [Process Cluster Protocol Request-25] o.a.n.c.p.impl.SocketProtocolLis

I stopped all nodes, started one node (which became CONNECTED, PRIMARY, COORDINATOR), then started the remaining nodes one at a time, and the cluster came up.
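The log excerpt shows the coordinator moving a node from CONNECTED back to CONNECTING a few seconds after its first heartbeat. A minimal sketch for pulling such transitions out of a longer nifi-app.log is given below. The regular expression is written against the "Status of ... changed from ... to ..." line format seen in this post only, so it is an assumption that other NiFi versions log the same shape; the function name `find_transitions` is hypothetical.

```python
import re

# Pattern built from the "Status of <node> changed from ... to ..." line
# in the excerpt above; other NiFi versions may format this differently.
TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*?"
    r"Status of (?P<node>\S+) changed from "
    r"NodeConnectionStatus\[.*?state=(?P<old>\w+), updateId=\d+\] to "
    r"NodeConnectionStatus\[.*?state=(?P<new>\w+), updateId=\d+\]"
)


def find_transitions(lines):
    """Return (timestamp, node, old_state, new_state) for each status change line."""
    found = []
    for line in lines:
        match = TRANSITION.match(line)
        if match:
            found.append(
                (match.group("ts"), match.group("node"), match.group("old"), match.group("new"))
            )
    return found


# The status-change line from the post, plus one unrelated line for contrast.
SAMPLE = [
    "2020-07-20 17:16:07,332 INFO [Process Cluster Protocol Request-2] "
    "o.a.n.c.c.node.NodeClusterCoordinator Status of qa-nifi-node-blue-02.abc.com:9091 "
    "changed from NodeConnectionStatus[nodeId=qa-nifi-node-blue-02.abc.com:9091, "
    "state=CONNECTED, updateId=21] to NodeConnectionStatus[nodeId=qa-nifi-node-blue-02.abc.com:9091, "
    "state=CONNECTING, updateId=22]",
    "2020-07-20 17:15:58,388 INFO [Heartbeat Monitor Thread-1] "
    "o.a.n.c.c.node.NodeClusterCoordinator Event Reported for qa-nifi-node-blue-02.abc.com:9091 "
    "-- Received first heartbeat from connecting node. Node connected.",
]

if __name__ == "__main__":
    for ts, node, old, new in find_transitions(SAMPLE):
        print(f"{ts}  {node}  {old} -> {new}")
```

Running the same function over the full log from each node could show whether every node cycles between CONNECTED and CONNECTING during a simultaneous restart, or only some of them.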

venkii · 07-20-2020

This is a multi-AZ AWS cluster: 3 nodes in zone 1 and 3 in zone 2.
