Member since: 07-30-2019
Posts: 3249
Kudos Received: 1590
Solutions: 953
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 108 | 05-02-2025 07:25 AM |
 | 245 | 04-25-2025 05:54 AM |
 | 228 | 04-25-2025 05:44 AM |
 | 270 | 04-24-2025 07:38 AM |
 | 281 | 04-24-2025 06:36 AM |
05-02-2025
07:25 AM
@shiva239 There are numerous things happening when a node is disconnected. A disconnected node is different from a dropped node: a cluster node must be disconnected before it can be dropped. A node can become disconnected in two ways:

1. Manually disconnected - a user manually disconnects the node via the NiFi Cluster UI. A manually disconnected node will not attempt to auto-rejoin the cluster. A user can manually reconnect the node from another node in the cluster via the same Cluster UI.
2. The node becomes disconnected due to some issue (such as lack of heartbeat).

A node that is disconnected is still part of the cluster until it is dropped. Once dropped, the cluster no longer considers it part of the cluster. This distinction matters when it comes to load-balanced connections that use a load-balance strategy other than Round Robin.

Load-balanced connections use the NiFi Site-To-Site protocol to move FlowFiles between nodes. Only connected nodes are eligible to have FlowFiles sent to them over Site-To-Site; however, even a disconnected node is still able to load-balance FlowFiles to the nodes that remain connected to the cluster. So when your one node disconnected from the cluster, if you went to that node's UI directly, the load-balanced connection would appear to be processing all FlowFiles normally. This is because the two nodes to which it sends some FlowFiles by attribute are still connected, and it is therefore still allowed to send to them. The FlowFiles whose attribute maps to the disconnected node itself never leave that node and get processed locally.

Over on the still-connected nodes the story is different. They can only send to connected nodes, so any FlowFiles destined for the disconnected node will begin to queue. Even if you stopped the dataflows on the disconnected node, the FlowFiles would continue to queue for that node, so stopping the dataflow on a node that disconnects would still present the same issue. A disconnected node is still aware of which nodes are part of the cluster and can still communicate with ZooKeeper to know which node is the elected cluster coordinator. Let's say a second node disconnects: the first disconnected node would stop attempting to send to that now-disconnected node and queue FlowFiles destined for it.

Only the Round Robin strategy will attempt to redistribute FlowFiles to the remaining connected nodes when a node becomes disconnected. The Partition by Attribute and Single Node strategies are used when it is important that "like" FlowFiles end up on the same node for downstream processing. So once a "like" FlowFile (in your case, FlowFiles with the same value in the orderid FlowFile attribute) is marked for node 3, all FlowFiles with that same orderid will queue for node 3 as long as node 3 is still a member of the cluster. A disconnected node is still part of the cluster and will already have some "like" FlowFiles on it, so we would not want NiFi to suddenly start sending "like" data to some other node. If manual user action were taken to drop the disconnected node, then the load-balanced connections would start using a different node for the FlowFiles originally allocated to the dropped node.

NiFi also offers an off-loading feature. This allows a user with proper authorization to off-load a disconnected node (IMPORTANT: only a reachable and running node can be offloaded successfully; attempting to offload a down or unreachable node will not work). Once a node is disconnected, a user can choose to offload it. This is typical if, say, a user wants to decommission a node in the cluster.
Initiating an off-load sends a request to that disconnected node to stop and terminate all running components and then off-load its queued FlowFiles to the other nodes connected to the cluster. If cluster nodes were allowed to continue to load-balance to disconnected node(s), this capability would fail, as you would end up with a constant loop of FlowFiles back to the disconnected node. Once offloading completes, the disconnected node can be dropped, and the FlowFiles that were offloaded get load-balanced to the nodes still members of the cluster.

I think that covers all the basic behind-the-scenes functionality of load-balanced connections with regard to disconnected-node behaviors. In your scenario, the node became disconnected due to some issue while changing the version of a version-controlled process group. I would recommend a new community question if you need help with that issue, as it has no direct relationship with how load-balanced connections function or with the disconnected-node behavior discussed here.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
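For illustration, the offload described above can also be requested through the NiFi REST API, which is what the Cluster UI invokes. This is only a hedged sketch: the host, port, token, and node UUID are placeholders, and the endpoint and payload should be verified against the REST API documentation for your NiFi version.

# List cluster nodes to find the UUID of the disconnected node
curl -k -H "Authorization: Bearer $TOKEN" \
  "https://nifi-host:8443/nifi-api/controller/cluster"

# Request an offload of the (already disconnected) node
curl -k -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"node":{"nodeId":"<node-uuid>","status":"OFFLOADING"}}' \
  "https://nifi-host:8443/nifi-api/controller/cluster/nodes/<node-uuid>"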
04-28-2025
05:41 AM
1 Kudo
@shiva239 "Is this expected behavior for NiFi node to be active although it is not in cluster?"

Yes. A disconnected node that is still running will continue to run its enabled and running NiFi components, processing the existing FlowFiles on that specific node and ingesting new data as well. The node will still be aware that it belongs to a cluster, so components will still utilize ZooKeeper for any cluster-stored state data (read and update). It is simply no longer connected, but all functionality persists. What it cannot do while disconnected is receive any configuration changes that the elected cluster coordinator replicates to all nodes currently connected to the cluster.

From a node in the cluster, you should be able to go to the cluster UI and look at the node that is marked as disconnected to see the recorded reason for disconnection (such as lack of heartbeat). A node that disconnects not as the result of manual user action should automatically attempt to reconnect, as it will still attempt to send heartbeats to the elected cluster coordinator reported by ZooKeeper. When the cluster coordinator receives one of these heartbeats from the disconnected node, it will initiate a node reconnection. During this reconnection, the node's dataflow (flow.json) is compared with the cluster's current dataflow. In order for the node to rejoin, its local flow must match the cluster flow. If it does not, the node will attempt to inherit the cluster flow. If inheritance of the cluster flow is not possible, this will be logged along with the reason (one common reason is that the cluster flow no longer has a connection that the local flow still has which contains FlowFiles; NiFi will not inherit a flow that would result in data loss on the local node).

"Can we alter the behavior through configuration to stop processors temporarily while node is not connected to cluster?"

NiFi has no option to stop processors on a disconnected node. I am not clear on the use case for why you would want to do this. The expectation is that an unexpected disconnection (commonly due to lack of heartbeat) would auto-reconnect once heartbeats to the cluster coordinator resume. Plus, a disconnected node does not mean loss of functionality on that node: it can still execute its dataflow just as it did while connected. While all nodes in the cluster keep their dataflows in sync and use ZooKeeper for any cluster state sharing, they all execute based on their local copy of the flow.json and process their own node-specific set of FlowFiles. This continues even when a node is disconnected, because that node still knows it was part of a cluster.

I find this comment interesting: "Furthermore, the affected node did not attempt to reconnect to the cluster on its own."
- Did you check the reason recorded for why this node disconnected? (Did a user manually disconnect the node, or was it disconnected for another reason?)
- Did you inspect the logs on the disconnected node and the elected cluster coordinator around the time of the disconnection?
- Do you see the disconnected node logging any issue communicating with ZooKeeper?
- Do you see the disconnected node attempting to send heartbeats to the currently elected cluster coordinator?
- Is the current cluster coordinator logging receipt of these heartbeats? (See the sketch of log checks after this post.)

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
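As a starting point for the log questions above, a minimal sketch of checks, assuming the default logs/nifi-app.log location; the exact log message wording varies by NiFi version, so treat the grep patterns as starting points rather than exact matches.

# On the disconnected node: heartbeat attempts and ZooKeeper trouble
grep -i "heartbeat" logs/nifi-app.log
grep -i "zookeeper" logs/nifi-app.log

# On the elected cluster coordinator: the recorded disconnection reason
# and whether heartbeats from the disconnected node are being received
grep -i "disconnect" logs/nifi-app.log
grep -i "heartbeat" logs/nifi-app.log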
04-25-2025
06:47 AM
@Shrink So I see you are trying to start a Process Group (which starts all the NiFi components within that process group). You are not set up with a production-ready certificate, nor with a production-ready authentication and authorization configuration, which makes setting up the necessary authorizations impossible. You would need to switch to using the managed authorizer, which allows you to use the file-user-group-provider. This provider will let you define your NiFi node certificate DN as a user, which you can then authorize as needed to make the rest-api call you want to make.

Have you looked at using the FlowFile Concurrency and Outbound Policy options available within the process group configuration to control the input and output of FlowFiles in and out of each process group? These settings would let you control the movement of FlowFiles from one PG to another and, I believe, achieve what you are trying to do without needing to programmatically start and stop Process Groups via rest-api calls. See the documentation sections:
- Configuring a Process Group
- FlowFile Concurrency
- Outbound Policy

Using rest-api calls first requires you to constantly check that one PG is done processing all FlowFiles before you start the next, which is not an efficient design. You should try to design your dataflows so they are always running.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
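If you do still need to start a Process Group programmatically, the request generally looks like the following sketch. The host, token, and process group id are placeholders, and the payload should be verified against the REST API documentation for your NiFi version.

# Schedule all eligible components within a process group to RUNNING
curl -k -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id":"<pg-id>","state":"RUNNING"}' \
  "https://nifi-host:8443/nifi-api/flow/process-groups/<pg-id>"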
04-25-2025
06:33 AM
@hegdemahendra In Apache NiFi 1.x versions the nifi.sh dump capability still exists, so you can still use that:

./nifi.sh dump /tmp/<dump filename>

NOTE: dump is no longer an option in Apache NiFi 2.x versions.

But since NiFi is a Java application, you can also take a Java thread dump directly with the JDK tools:

jstack <nifi pid> > /tmp/thread-dump.txt

(Note that jmap, e.g. jmap -dump:format=b,file=heap.hprof <nifi pid>, produces a heap dump rather than a thread dump.) Keep in mind that NiFi will have two processes running (a bootstrap process and the main process). Make sure you are using the main NiFi pid and not the pid of the bootstrap process.

Newer versions of 1.x also include a diagnostics option. When run in verbose mode, it will also output the thread dump in the diagnostics output:

./nifi.sh diagnostics --verbose /tmp/diag.txt

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
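One way to tell the two pids apart, assuming the JDK's jps tool is on the PATH:

# The bootstrap process runs org.apache.nifi.bootstrap.RunNiFi;
# the main process runs org.apache.nifi.NiFi - use the latter's pid
jps -l | grep org.apache.nifi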
04-25-2025
06:09 AM
@Bern There is not enough information provided to make any suggestions yet. Apache NiFi 1.16 is an almost 5-year-old version; I'd encourage you to use the latest 1.x release available to make sure you have all the latest bug fixes and security fixes.

The log line shared is incomplete and is also just an "INFO" level log message, which is not going to tell us anything about an issue. I'd suggest looking at the logs on all the nodes for ERROR log messages. Re-check all your nifi.properties file configuration (hopefully you did NOT copy the config files from your 1.11 to 1.16, but rather used your 1.11 configs to set the appropriate configurations in the newer config files of 1.16).

Beyond the above, you'll need to inspect a series of NiFi thread dumps to see if the "main" thread is progressing (changing in each dump) or always has the same dump output.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
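A minimal sketch for capturing such a series, assuming you have the main NiFi pid and the JDK's jstack on the PATH (./nifi.sh dump works equally well):

# Capture three thread dumps 30 seconds apart, then compare the
# "main" thread's stack across the three files
for i in 1 2 3; do
  jstack <nifi pid> > /tmp/nifi-threaddump-$i.txt
  sleep 30
done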
04-25-2025
05:54 AM
@Shrink What do you mean by "by storing flow file temporally to save RAM or backpressure"? FlowFiles held in NiFi connections will consume NiFi heap memory (unless a queue has gotten very large, resulting in some of those queued FlowFiles being swapped to disk). But this behavior is no different whether you use process groups or not.

Process groups do allow you to configure:
- "Process Group FlowFile Concurrency"
- "Process Group Outbound Policy"

I assume you are using the above to control the FlowFiles going in and out of your process groups as they move from one process group to the next? Using these allows you to ensure processing in one PG completes before the outbound FlowFiles are released to the next downstream process group. This also allows you to leave all your processors in a running state for a more efficient/performant dataflow.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
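For reference, the point at which a single connection's queued FlowFiles start swapping out of heap is controlled in nifi.properties; the value shown is the shipped default:

# FlowFiles queued in one connection beyond this count are swapped to disk
nifi.queue.swap.threshold=20000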
04-25-2025
05:44 AM
@Alf015 The "merge-param-context" cli.sh NiFi command does the following when merging one parameter context into another: Adds any parameters that exist in the exported context that don't exist in the
existing context. So your observations align with the documented behavior of this command. It will not modify value of any already existing parameter in the destination parameter context. It sounds like what you want to be using in the "set-param" command instead: Creates or updates a parameter in the given parameter context. Above will allow you to modify an existing parameter within the target parameter context. Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
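A minimal set-param sketch, reusing the connection-properties file and parameter context id from the merge-param-context example further down this page; the flag names are from the toolkit's built-in help, so run ./cli.sh nifi set-param help to confirm them for your toolkit version:

# Update (or create) the parameter "test1" in the destination context
./cli.sh nifi set-param -p /opt/nifi-toolkit/conf/mytarget.properties \
  -pcid b47a13ab-0195-1000-cc06-dc7779729310 \
  -pn test1 -pv newValue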
04-24-2025
01:13 PM
@hegdemahendra My initial guess would be a long-running or hung thread on a processor preventing it from transitioning from the "stopping" state to the required "stopped" state. Including your full Apache NiFi version is always helpful when posting a question; it allows someone helping to more easily check whether there happen to be any known bugs in that release that may be related to the issue. Analyzing NiFi thread dumps might help narrow down what is specifically being waited on.

NOTE: The Apache NiFi 1.x versions are going end of life, and Apache NiFi 2.x no longer supports the variable registry. NiFi Parameters, also available in the newer Apache NiFi 1.x versions, are the replacement.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
04-24-2025
07:38 AM
@Alf015 The merge-param-context command will merge an exported parameter context's parameters into another existing parameter context. A simple exported parameter context with three parameters defined in it looks like this:

{
"name" : "test-param",
"description" : "",
"parameters" : [ {
"parameter" : {
"name" : "test1",
"description" : "",
"sensitive" : false,
"value" : "test1",
"provided" : false,
"inherited" : false
}
}, {
"parameter" : {
"name" : "test2",
"description" : "",
"sensitive" : false,
"value" : "test2",
"provided" : false,
"inherited" : false
}
}, {
"parameter" : {
"name" : "test3",
"description" : "",
"sensitive" : false,
"value" : "test3",
"provided" : false,
"inherited" : false
}
} ],
"inheritedParameterContexts" : [ ]
}

You can then use the merge-param-context command to merge the exported parameter context into another parameter context that already exists in the target NiFi:

./cli.sh nifi merge-param-context -i /tmp/test-param.json -p /opt/nifi-toolkit/conf/mytarget.properties -pcid b47a13ab-0195-1000-cc06-dc7779729310 --verbose

- "test-param.json" contains the JSON shared above.
- "mytarget.properties" contains the connection info for the target NiFi.
- "b47a13ab-0195-1000-cc06-dc7779729310" is the id of the destination parameter context into which I am merging the input parameters.

There is a sample .properties file (../conf/cli.properties.example) in the toolkit conf directory. The contents of mine look like this:

baseUrl=https://nifihostname:8443
keystore=/opt/nifi/conf/keystore.p12
keystoreType=PKCS12
keystorePasswd=<password>
keyPasswd=<password>
truststore=/opt/nifi/conf/truststore.p12
truststoreType=PKCS12
truststorePasswd=<password>
proxiedEntity=<nifiadmin>

You can get the values for the above from the nifi.properties file of the target/destination NiFi. The proxiedEntity is my NiFi user identity that has permissions to edit the destination parameter context I am importing the new parameters into. NOTE: Your NiFi node must be authorized to /proxy user requests via Access Policies in the destination NiFi. If you choose not to use a proxiedEntity, your NiFi node will need to be directly authorized to edit the target parameter context.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
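In case it helps, the exported JSON used above can itself be produced with the toolkit. A hedged sketch, where mysource.properties is a hypothetical connection-properties file for the origin NiFi; confirm the flag names with ./cli.sh nifi export-param-context help:

# Export the source parameter context to a file for later merging
./cli.sh nifi export-param-context -p /opt/nifi-toolkit/conf/mysource.properties \
  -pcid <source-context-id> -o /tmp/test-param.json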
04-24-2025
06:36 AM
@0tto Using child process groups is a matter of your own personal preference. Child Process Groups allow you to create a more manageable NiFi canvas by placing unique dataflows in different Process Groups. When it comes to one continuous dataflow, you may choose to put portions of it into child process groups; for example, you might do this if portions of the dataflow are reusable. You can right-click on a process group and download a flow definition, or you can choose to version-control a process group to NiFi-Registry. These become snippets of your overall end-to-end dataflow. So let's say your "transform" sub-dataflow is reusable with just a few modifications; others could easily reuse it by importing it from NiFi-Registry or deploying a shared flow definition.

Typically, users of NiFi will create a process group per unique end-to-end dataflow, or will create a unique process group per team to separate dataflows and control access per process group, so team 1 can't mess with team 2's process group.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt