Member since: 11-04-2025
Posts: 2
Kudos Received: 0
Solutions: 0
11-04-2025 12:35 PM
NiFi is hosted in an AKS cluster with a node count of 3. Occasionally we receive the errors below, and it is always node-0 that gets disconnected. Could you please give some input or help?

Logs:

2025-11-01 01:01:12,444 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated UpdateAttribute[id=9b95428d-2972-38fa-b7cd-20d759b4e749]
2025-11-01 01:01:12,445 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated ExecuteSQLRecord[id=07e8f117-fcdf-39a9-a989-3deb3b467263]
2025-11-01 01:01:12,445 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated UpdateAttribute[id=2be213fa-1965-3303-9e4f-ebbb8e3dec37]
2025-11-01 01:01:12,445 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=cd3be53f-130d-3a1c-8e5a-ed86de708fd8,name=fullload-process-configuration]
2025-11-01 01:01:12,445 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=36476f9f-018d-1000-ffff-ffff9a57bb5b,name=ifsview-fullload-currentstate]
2025-11-01 01:01:12,445 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=059a87b9-2902-326e-9dcf-6f4369cc924e,name=fullload-data-ingestionoperational]
2025-11-01 01:01:12,446 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=ca3fa104-6604-32ce-8933-cdc48eaca2e2,name=fullload-data-ingestionanalytics]
2025-11-01 01:01:12,446 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=78b00966-4bc8-3525-924f-1605e0bb068b,name=fullload-data-extraction]
2025-11-01 01:01:12,446 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=7b0f0ebd-09ad-3fb0-82b7-84b11e68bc23,name=fullload-operational-truncatelandingzonetables]
2025-11-01 01:01:12,447 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=45be6018-1c49-3810-9785-7d082054335a,name=fdh-logging-operational]
2025-11-01 01:01:12,447 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=681a5022-25b1-30fe-a041-d827aa4785ae,name=fullload-process-configuration]
2025-11-01 01:01:12,447 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=bf2be415-018f-1000-0000-00000a313b15,name=fdhoperational-fullload-landingzone]
2025-11-01 01:01:12,448 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated StandardProcessGroup[identifier=d294e6bd-ba08-3a09-a04b-7528ab3745f1,name=deltaprocess-cache-updates]
2025-11-01 01:01:12,448 INFO [Reconnect to Cluster] o.a.n.f.s.StandardVersionedComponentSynchronizer Updated LocalPort[id=bdb77786-6aee-30b2-bfb2-75671f47c5c2, type=OUTPUT_PORT, name=outputportlogging (78ac3d2a-a6d6-3fe2-b9f2-3cdca6956dd7), group=deltaload-data-ingestionanalytics]
2025-11-01 01:01:12,450 INFO [Reconnect to Cluster] o.a.n.c.s.AffectedComponentSet Starting the following components: AffectedComponentSet[inputPorts=[], outputPorts=[], remoteInputPorts=[], remoteOutputPorts=[], processors=[], parameterProviders=[], flowRegistryCliens=[], controllerServices=[], reportingTasks=[]]
2025-11-01 01:01:12,450 INFO [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Disconnecting node due to Failed to properly handle Reconnection request due to org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
2025-11-01 01:01:12,471 INFO [Reconnect to Cluster] o.apache.nifi.controller.FlowController Will no longer send heartbeats
2025-11-01 01:01:12,471 INFO [Reconnect to Cluster] o.apache.nifi.controller.FlowController FlowController will stop sending heartbeats to Cluster Coordinator
2025-11-01 01:01:12,471 INFO [Reconnect to Cluster] o.apache.nifi.controller.FlowController Cluster State changed from Clustered to Not Clustered
2025-11-01 01:01:12,477 INFO [Reconnect to Cluster] o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election Role 'Primary Node' because that role is not registered
2025-11-01 01:01:12,477 INFO [Reconnect to Cluster] o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election Role 'Cluster Coordinator' because that role is not registered
2025-11-01 01:01:12,477 INFO [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to properly handle Reconnection request due to org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
2025-11-01 01:01:12,477 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
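
Most of the entries above are routine "Updated ..." messages, so it can help to filter a dump like this down to the lines about the disconnect itself. Below is a minimal Python sketch, assuming the log text is available as a single string. The function names and the keyword list are illustrative choices based on the messages in this post, not part of NiFi.

```python
import re

# Every NiFi log entry in the dump starts with a timestamp of this shape.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Assumption: these substrings, taken from the messages in this post,
# are the ones worth surfacing. Adjust the tuple as needed.
KEYWORDS = (
    "FlowSynchronizationException",
    "Disconnecting node",
    "Cluster State changed",
)


def split_entries(raw: str) -> list[str]:
    """Split a run-together log dump into one string per log entry."""
    starts = [m.start() for m in TIMESTAMP.finditer(raw)]
    if not starts:
        return []
    bounds = starts + [len(raw)]
    return [raw[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def interesting_entries(raw: str) -> list[str]:
    """Keep only the entries that mention one of the KEYWORDS."""
    return [e for e in split_entries(raw) if any(k in e for k in KEYWORDS)]
```

Calling `interesting_entries(log_text)` on the dump above would keep the StandardFlowService and FlowController state-change entries and drop the "Updated ..." lines.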
Labels: Apache NiFi
11-04-2025 11:43 AM
I have a NiFi cluster hosted on AKS. It has 3 nodes with 20G of memory each, and each node has dedicated PVCs on Standard LRS storage.

Occasionally, about once every one to two weeks, node-0 (and only node-0) disconnects with the error below. We run full-load ingestions frequently, but there is no pattern. We run 3-4 process groups with overlapping (parallel) run times, yet we do not get the error every time: most runs on most days work, and only sometimes do we get the error below.

What we observed: node-0 first gets disconnected, and later, when it tries to reconnect, it fails with org.apache.nifi.controller.serialization.FlowSynchronizationException.

Any help possible?

Error dump:

2025-11-01 06:05:19,513 WARN [Notify Cluster of Node Status Change-2] o.a.n.c.p.i.StandardClusterCoordinationProtocolSender Failed to send Node Status Change message to nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443
org.apache.nifi.cluster.protocol.ProtocolException: Failed to create socket due to: java.net.UnknownHostException: nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local
    at org.apache.nifi.cluster.protocol.impl.StandardClusterCoordinationProtocolSender.createSocket(StandardClusterCoordinationProtocolSender.java:216)
    at org.apache.nifi.cluster.protocol.impl.StandardClusterCoordinationProtocolSender.createSocket(StandardClusterCoordinationProtocolSender.java:204)
    at org.apache.nifi.cluster.protocol.impl.StandardClusterCoordinationProtocolSender.access$000(StandardClusterCoordinationProtocolSender.java:67)
    at org.apache.nifi.cluster.protocol.impl.StandardClusterCoordinationProtocolSender$2.run(StandardClusterCoordinationProtocolSender.java:296)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.net.UnknownHostException: nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local
    at java.base/java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.base/java.net.SocksSocketImpl.connect(Unknown Source)
    at java.base/java.net.Socket.connect(Unknown Source)
    at java.base/sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
    at java.base/sun.security.ssl.SSLSocketImpl.<init>(Unknown Source)
    at java.base/sun.security.ssl.SSLSocketFactoryImpl.createSocket(Unknown Source)
    at org.apache.nifi.io.socket.SocketUtils.createSocket(SocketUtils.java:68)
    at org.apache.nifi.cluster.protocol.impl.StandardClusterCoordinationProtocolSender.createSocket(StandardClusterCoordinationProtocolSender.java:210)
    ... 8 common frames omitted
2025-11-01 06:05:18,438 INFO [Heartbeat Monitor Thread-1] o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 20659 nanos
2025-11-01 06:05:18,463 INFO [NiFi Web Server-20322] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443 -- Requesting that node connect to cluster on behalf of username
2025-11-01 06:05:18,464 INFO [NiFi Web Server-20322] o.a.n.c.c.node.NodeClusterCoordinator Status of nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443 changed from NodeConnectionStatus[nodeId=nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443, state=DISCONNECTED, Disconnect Code=Node's Flow did not Match Cluster Flow, Disconnect Reason=org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption., updateId=44] to NodeConnectionStatus[nodeId=nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443, state=CONNECTING, updateId=5]
2025-11-01 06:05:18,492 WARN [Notify Cluster of Node Status Change-2] o.a.n.c.p.i.StandardClusterCoordinationProtocolSender Failed to send Node Status Change message to nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local:8443
org.apache.nifi.cluster.protocol.ProtocolException: Failed to create socket due to: java.net.UnknownHostException: nifi-0.nifi-headless.nifi-highresource-lts.svc.cluster.local
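
The java.net.UnknownHostException in the dump above means the sending node could not resolve node-0's headless-service hostname at that moment. One way to see whether that happens intermittently is to poll name resolution from another pod. Below is a minimal Python sketch, assuming Python is available wherever it runs. The nifi-0 hostname and port 8443 come from the log, the nifi-1 and nifi-2 names are an assumption based on the same naming pattern, and `check_hosts` is a hypothetical helper, not a NiFi or Kubernetes tool.

```python
import socket
from typing import Callable, Iterable

# nifi-0 is taken from the log above; nifi-1 and nifi-2 are ASSUMED to
# follow the same naming pattern. Replace them with the real pod names.
SUFFIX = "nifi-headless.nifi-highresource-lts.svc.cluster.local"
HOSTS = [f"nifi-{i}.{SUFFIX}" for i in range(3)]


def check_hosts(
    hosts: Iterable[str],
    port: int = 8443,  # cluster port seen in the log
    resolver: Callable = socket.getaddrinfo,
) -> dict[str, bool]:
    """Return {hostname: True if it resolved, False otherwise}."""
    results = {}
    for host in hosts:
        try:
            resolver(host, port)
            results[host] = True
        except socket.gaierror:
            # The Python counterpart of Java's UnknownHostException.
            results[host] = False
    return results


if __name__ == "__main__":
    for host, ok in check_hosts(HOSTS).items():
        print(f"{host}: {'resolved' if ok else 'NOT resolved'}")
```

Running this on a schedule and noting the timestamps of any failures for nifi-0 would show whether the resolution failures line up with the disconnects. The `resolver` parameter exists only so the function can be exercised without a live cluster.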
Labels: Apache NiFi