Member since: 07-30-2019
Posts: 3432
Kudos Received: 1632
Solutions: 1012
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 116 | 01-27-2026 12:46 PM |
| | 513 | 01-13-2026 11:14 AM |
| | 1136 | 01-09-2026 06:58 AM |
| | 958 | 12-17-2025 05:55 AM |
| | 469 | 12-17-2025 05:34 AM |
10-23-2022
03:55 AM
Did anyone solve this issue? I am currently trying to solve a similar issue via the NiFi API (using Postman).
10-21-2022
01:25 PM
@DGaboleiro I am a bit confused by your dataflow design. In a NiFi multi-node cluster, each node is only aware of, and can only execute upon, the FlowFiles present on that one node. In your dataflow you have the QueryCassandra processor executing on "primary node" only, as you should (having it execute on all nodes would result in both of your nodes performing the same query and returning the same data).

You then split that JSON and use a DistributeLoad processor, which appears to be a means of sending half of the FlowFiles to node 1 and the other half to node 2. This is not the best way to do this. You are running Apache NiFi 1.17, which means load balanced connections are available and can accomplish the same thing without all these additional processors. https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings

After your FlowFiles (this is what is being moved from processor to processor on your canvas) have been distributed, I see that you are using a MergeContent processor. The MergeContent processor can only merge the FlowFiles present on the same node. It will not merge FlowFiles from multiple nodes into a single FlowFile. So if your desire is to have one merge of all FlowFiles, distributing them across multiple nodes will not give you that outcome.

You should never configure any processor that accepts an inbound connection for "primary node" only execution. This is important because which node is elected as primary node can change at any time, and FlowFiles already queued on the former primary node would then no longer be processed by that processor. Execution strategy has nothing to do with the availability of FlowFiles on each node on which to execute.

What is important to understand is that each node in your NiFi cluster has its own copy of the flow, its own Content and FlowFile repositories containing unique data, and each node executes the processors in its flow with no regard for the existence of other nodes. A node simply learns from ZooKeeper whether it has been elected as the cluster coordinator and/or primary node. If it is elected primary node, it will execute "primary node" and "all nodes" components. If it is not the primary node, it will only execute the "all nodes" components.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
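To make the per-node behavior concrete, here is a toy Python model. It is not NiFi code: the node names and both helper functions are invented for illustration. It spreads FlowFiles round-robin across nodes and then merges only what each node holds locally, which is roughly why distributing before MergeContent yields one merged FlowFile per node rather than a single merged FlowFile.

```python
from itertools import cycle


def round_robin(flowfiles, nodes):
    """Assign FlowFiles to nodes in turn (toy stand-in for load balancing)."""
    queues = {node: [] for node in nodes}
    for node, flowfile in zip(cycle(nodes), flowfiles):
        queues[node].append(flowfile)
    return queues


def merge_per_node(queues):
    """Each node merges only the FlowFiles sitting in its own queue."""
    return {node: "".join(items) for node, items in queues.items() if items}
```

Calling `merge_per_node(round_robin(["a", "b", "c", "d"], ["node1", "node2"]))` produces two merged results, one per node. Passing a single node instead produces one merged result, which is the situation you want if a single merge of all FlowFiles is the goal.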
10-21-2022
12:56 PM
@rangareddyy What is important to understand is that the NiFi component processors are not executed by the user authenticated into NiFi (assuming a secured NiFi), but rather by the NiFi service user. So let's say that your NiFi service is owned by a "nifiservice" Linux account. Whatever umask is configured for that user will be applied to directories and files created by that user.

Now if your script is using sudo, it is changing the user that executes your script, resulting in different ownership and permissions than the "nifiservice" user would produce. Subsequent component processors will also execute as the "nifiservice" user and may then not have access to those files and directories. You'll need to take this into account as you build your scripts. Make sure that your scripts adjust permissions on the directory tree and files as needed so that your "nifiservice" user, or all users, can access the files needed downstream in your dataflows.

So in your case it sounds like the script executed by your ExecuteScript processor is creating an sh file that is either not owned by the "nifiservice" user or does not have execute permission set on it. The ExecuteStreamCommand processor will attempt to execute the sh command on disk as the "nifiservice" user only.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
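A minimal Python sketch of the permission problem, assuming a POSIX system. The file name `job.sh`, the umask value, and the helper names are hypothetical; the point is only that a file created under a restrictive umask carries no execute bit until something sets it explicitly.

```python
import os
import stat


def create_script(directory, name, body, umask=0o077):
    """Create a script under the given umask, then restore the old umask."""
    previous = os.umask(umask)
    try:
        path = os.path.join(directory, name)
        # 0o666 is only the requested mode; the umask strips bits from it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
        with os.fdopen(fd, "w") as handle:
            handle.write(body)
    finally:
        os.umask(previous)
    return path


def make_executable(path):
    """Add read and execute for owner, group and others (like chmod a+rx)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | 0o555)


def describe(path):
    """Report who owns the file and whether the current user can run it."""
    info = os.stat(path)
    return {
        "owner_uid": info.st_uid,
        "mode": oct(stat.S_IMODE(info.st_mode)),
        "executable_by_me": os.access(path, os.X_OK),
    }
```

Running `describe()` on the generated file before and after `make_executable()` shows the difference. If the owner reported is not the account that runs NiFi, the downstream processor may still be denied access regardless of what the script itself does.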
10-21-2022
12:28 PM
@Fredi A screenshot of the configuration of your UpdateAttribute processor including main configuration and configuration in the "Advanced" UI would be very helpful in understanding your setup and issue. Thanks, Matt
10-19-2022
08:25 AM
@orekxl @biblio_gr The following community article will help you understand what really happens when a user chooses to click "terminate" on a stopping NiFi processor with active threads: https://community.cloudera.com/t5/Community-Articles/Understanding-NiFi-s-quot-Terminate-quot-option-on-running/ta-p/355433 If you found this assisted you with your query, please take a moment to login and click "ACCEPT as Solution" below this response. Thank you, Matt
10-19-2022
08:20 AM
2 Kudos
The intent of this article is to cover exactly what happens when a user clicks the "terminate" button on a processor component that has an actively running task. Before we can discuss the "terminate" option, we need to understand a few basics about the NiFi application and a bit of history:

1. NiFi is a Java application, and the execution of any component (processors, controller services, reporting tasks, funnels, input/output ports, etc.) happens within that single Java Virtual Machine (JVM) process. NiFi does not create a child process for the execution of each component.
2. Since NiFi operates within a single JVM, it is not possible to "kill" a thread for an individual component without killing the entire JVM.
3. NiFi consists of well over 400 unique components, and many of them are not executing native NiFi code. Many use client libraries not managed or controlled by NiFi. Others can be configured to execute commands external to NiFi (ExecuteStreamCommand, ExecuteProcess, ExecuteScript, etc.). Processors that invoke something external to NiFi's code base will result in a child process being created with its own pid. Keep in mind that processors of this type do not limit what is being invoked externally, so they take a generic approach to handling those child processes. The JVM invokes the external command and waits for it to complete.
4. Historically, NiFi did not offer a terminate option since killing a thread in the NiFi JVM was not possible. So when a component misbehaved (usually due to an issue external to NiFi code, such as the network, a hung client library, or a hung external command), that NiFi processor would get stuck, with the JVM thread waiting on that client library or external process to return. As such, the processor's concurrent task JVM thread is blocked. While you could choose to stop the processor, that would not help users get past the hung or long running thread. The processor transitions to a "stopping" state, where it remains until the library or task it is waiting on completes. Until that happens, users are not able to modify the configuration or restart the component. This meant that for truly hung threads the component would be blocked until the NiFi JVM was restarted.
5. As a result of the inconvenience and impact a hung thread causes, NiFi introduced the "terminate" option on a "stopped" component with an active thread.

What actually happens when a user clicks "terminate":

1. "Terminate" is only possible after a processor has been asked to stop and that stopped processor still has an associated JVM thread running.
2. Since we know that killing a JVM thread is not possible without killing the entire JVM process (NiFi), the "terminate" option takes a different approach. When a processor executes, it typically does so in response to an inbound queued FlowFile as the trigger. That means the inbound FlowFile is tied to the JVM thread that is executing. When the thread completes, that FlowFile (or a modified, cloned, or new FlowFile, depending on processor function) is moved to the appropriate outbound relationship of the processor.
3. So what the "terminate" function really does is release the FlowFile associated with that running JVM thread back to the inbound connection, make a request to the client library or external command to abort/exit, and then isolate that thread so that if it does actually complete after the terminate, all returns are simply sent to null.
4. When "terminate" has been selected, the UI will render the processor's active threads differently to indicate that the processor has JVM threads that have been terminated but are still active. NOTE: The number within the parentheses denotes the current number of terminated threads still active.
5. If the client or external command responds to the request to exit, the active "terminated" thread will disappear. If not, it will continue to exist until the thread finally completes or the entire NiFi JVM is restarted. NOTE: A terminated thread has little impact on resources, since a hung thread isn't consuming CPU. A long running, CPU intensive thread may have an impact.
6. Now that this "terminated" JVM thread has been isolated and any FlowFile(s) tied to that thread have been released to the originating connection, users can modify the processor configuration and start the processor again. When started again, the processor will execute again on the FlowFile(s) that once belonged to the terminated thread. So no data loss is incurred as a result of using "terminate".

The "terminate" capability allows users to move on without needing to restart their NiFi JVM, thus reducing downtime and impact to other dataflows running on the NiFi canvas.

If you have a processor that constantly has hung threads or very long running threads, it is time to start looking at your source FlowFile(s), processor configuration, external command, or the external service the processor may be waiting on for a response as possible sources of the issue.

Reference: Apache NiFi Terminate documentation
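The release-and-isolate idea described above can be sketched as a small Python model. This is not NiFi's actual implementation: `ProcessorSketch`, its queues, and the `work(flowfile, abort_event)` callback signature are all assumptions made for illustration. The sketch never kills the worker thread. Instead, "terminate" signals an abort request, puts the FlowFile back on the inbound queue, and marks the thread so that any late result is discarded.

```python
import queue
import threading


class ProcessorSketch:
    """Toy model of a processor whose running tasks can be 'terminated'."""

    def __init__(self, work):
        self.work = work              # callable(flowfile, abort_event) -> result
        self.inbound = queue.Queue()
        self.outbound = queue.Queue()
        self.terminated = set()       # isolated threads that are still alive
        self._active = {}             # thread -> (flowfile, abort_event)
        self._lock = threading.Lock()

    def trigger(self):
        """Take one FlowFile from the inbound queue and process it on a thread."""
        flowfile = self.inbound.get_nowait()
        abort = threading.Event()
        thread = threading.Thread(
            target=self._run, args=(flowfile, abort), daemon=True
        )
        with self._lock:
            self._active[thread] = (flowfile, abort)
        thread.start()
        return thread

    def _run(self, flowfile, abort):
        result = self.work(flowfile, abort)
        me = threading.current_thread()
        with self._lock:
            if me in self.terminated:
                # The thread finished after terminate: drop its result.
                self.terminated.discard(me)
                return
            del self._active[me]
        self.outbound.put(result)

    def terminate(self):
        """Ask work to abort, release FlowFiles, and isolate the threads."""
        with self._lock:
            for thread, (flowfile, abort) in self._active.items():
                abort.set()                  # a request only; work may ignore it
                self.inbound.put(flowfile)   # FlowFile returns to the inbound queue
                self.terminated.add(thread)
            self._active.clear()
```

In this model `len(processor.terminated)` plays the role of the number shown in parentheses in the UI: it drops back to zero only when the isolated thread eventually returns on its own.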
10-10-2022
07:20 AM
@knighttime Has your issue been resolved? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
10-07-2022
07:51 AM
@MattWho Thank you very much, this worked perfectly for what I needed. Can you tell me if this same flow also works on Windows?
10-04-2022
12:13 AM
ResultInvoke doesn't have any value in this case; it is sent as a header with '${invokehttp.status.code}' as its value in the HTTP request. To assign the status code to ResultInvoke, use an UpdateAttribute processor after InvokeHTTP.
10-03-2022
01:59 AM
Hi, and thanks for your reply. Integrating OIDC with NiFi is not easy. I tried your suggestion but it doesn't work; it seems the header doesn't follow the flow. I have now resolved it by using a certificate when I call the URL. I trusted the Zabbix certificate in NiFi and use this curl command:

curl https://nificluster.info/nifi-api/flow/cluster/summary --insecure -H "Host: nificluster.info" --cert /pathcertificate/certificate_zabbix.pem --key /pathcertificate/certificate_zabbix.key

Now I can check the cluster status "without" logging in.