About MattWho

MattWho · ‎05-20-2024

Hello @hegdemahendra Always very helpful if you include the exact version of Apache NiFI, Cloudera HDF, or Cloudera CFM being used. My guess here would be one or both of the following: You have multiple FlowFiles all pointing at the same content claims queued in connections within your dataflow(s) on the canvas. As long as a FlowFile exists on the canvas it will exist in flowfile_repository. Users should avoid leaving FlowFiles queued in connection on NiFi. Some users tend to allow FlowFile to accumulate at stopped processor components rather then auto-terminate them. Even if a FlowFile does not have any content its FlowFile attributes/metadata still consume disk space. You are extracting content from your FlowFiles into FlowFile attributes resulting in large FlowFile attribute/metadata being stored in the flowfile_repository. Dataflow designers should avoid extracting large amounts flowfile content in to the FlowFile's attributes. Instead try to build dataflows and utilize components that read content from the FlowFile's content instead of from FlowFile attributes. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-20-2024

@galt @RAGHUY Let me add some correction/clarity to the accepted solution. Export and Modify Flow Configuration: Export the NiFi flow configuration, typically in XML format. This can be done via the NiFi UI or by utilizing NiFi's REST API. Then, manually adjust the XML to change the ID of the connection to the desired value. It is not clear here what is being done. The only way to export a flow configuration from NiFi in XML format is via generating a NiFi template (deprecated and removed in Apache NIFi 2.x versions). Even if you were to generate a template and export is via NiFi UI or NiFi's rest-api, modifying it will not change what is on the canvas. If you were to modify the connection component UUID in all places in the template. Upon upload of that template back in to NiFI, you would need to drop the template on the the canvas which would result in every component in that template getting a new UUID. So this does not work. In newer version of NiFi 1.18+ NiFi supports newer flow definitions which are in json format. but same issue persists here when using flow definitions in this manor. In a scenario like the one described in this post where user removed a connection by mistake and then re-created it, the best option is to restore/revert the previous flow. Whenever a change is made to the canvas, NIFi auto archives the current flow.xml.gz (legacy) and flow.json.gz (current) file in to an archive sub-directory and generates a new flow.xml.gz/flow.json.gz file. Best and safest approach approach is to shutdown all nodes in your NiFi cluster. Navigate to the NiFi conf directory and swap current flow.xml.gz/flow.json.gz files with the archived flow.xml.gz/flow.json.gz files still containing the connection with original needed ID. When above is not possible (maybe change went unnoticed for to long and all archive version have new connection UUID), you need to manually modify the flow.xml.gz/flow.json.gz files. Shutdown all your NiFi nodes to avoid any changes being made on canvas while performing following steps. Option 1: Make backup of current flow.xml.gz and flow.json.gz Search each file for original UUID to make sure it does not exist. On one node manually modify the flow.xml.gz and flow.json.gz files by locating the current bad UUID and replacing it with the original needed UUID. Copy the modified flow.xml.gz and flow.json.gz files to all nodes in the cluster replacing original files. this is possible since all nodes run same version of flow. Option 2: same as option 1 same as option 1 same as option 1 Start NiFi only on the node where you modified the flow.xml.gz and flow.json.gz files. On all other nodes still stopped, remove or rename the flow.xml.gz and flow.json.gz files. Start all the remaining nodes. since they do not have a flow.xml.gz or flow.json.gs to load, they will inherit the flow from the cluster as they join the cluster. NOTE: The flow.xml.gz was replaced by the newer flow.json.gz format starting with Apache NiFi 1.16. When NiFi is 1.16 or newer is started with and only has a flow.xml.gz file, it will load from flow.xml.gz and then generate the new flow.json.gz format. Apache NiFi 1.16+ will load only from the flow.json.gz on startup when that file exists, but will still write out both the flow.xml.gz and flow.json.gz formats anytime a change is made to the canvas. With Apache NiFi 2.x+ version the flow.xml.gz format will go away. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-20-2024

@jpconver2 @RAGHUY All 1.x versions of NiFi do not support rolling upgrades. With the major release of the NiFi 2.x versions, NiFi added rolling upgrade support as part of NIFI-12016 - Improve leniency for bundle compatibility to allow for rolling upgrades. The above Apache NiFi jira does a great job of explaining why historically in the NiFi 1.x branch this was not implemented. Above new feature improvement is included in NiFi 2.0.0-M1 (NiFi 2.0.0 milestone 1) or newer. Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-20-2024

@SAMSAL This is not a new problem, but rather something that has existed with NiFi on Windows fro a very long time. You'll need to avoid using space in directory names or warp that directory name in quotes to avoid the issue. NIFI-200 - Bootstrap loader doesn't handle directories with spaces in it on Windows Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-20-2024

@Lorenzo Based on log output shared the Spnego based authentication was successful and you have an authorization problem for your Spnego authenticated user. NiFi Authorization is case sensitive, so the user identity returned via kerberos-provider login provider is likely not the exact same user identity string returned via Spnego based kerberos authentication. "myuserad" is a different user identity then "myaduser" and different user identity then "MyAduser" and different user identity then "[email protected]" and .etc... NiFi provides identity mapping properties which can be used to manipulate the user identity returned by different user authentication methods before the final manipulated user identity is passed over the the NiFi authorizer to check for proper authorization(s). These are added to the nifi.properties file: Identity Mapping Properties NOTE: keep in mind that mapping patterns are checked against the user identity output during authentication in an alpha-numeric order. First pattern (regex) to match has its value and transform applied at which time not additional mapping patterns will get evaluated. So as your pattern regular expressions get more generic the farther down the alpha-numeric list they need to be. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-20-2024

@Racketmojster A NiFi FlowFile consists of FlowFile Content (physical bytes of the source content) and FlowFile Attributes (Attributes about the content and FlowFile). You did not share how you are retrieving the parent folder: My nifi setup is to retrieve the date folders inside the parent directory of Logfolder I assume maybe you tried to use getFile or FetchFile? Also not very clear on what you are trying to do from this statement: date folders as a whole is passed to the python script A detailed use case might help here. NiFi works with content and directories are not content. They are tracked in NiFi more as attributes/metadata related to the content ingested. If you look at the FlowFile attributes on your FlowFile created for logs.txt, you should see an "absolute.path" attribute that would have the full path to the logs.txt file. If that is all you need and you have no need to actually get the content of the logs.txt file itself in NiFi, you could use juts the ListFile processor to get only the metadata/attributes about the found file and pass the value from "absolute.path" to your executeStreamCommand script. If each of your date folders contains multiple files, you would need to design a dataflow that accounts for that and eliminates duplicates so you only execute your ExecuteStreamCommand script. For example use the ReplaceText processor (Always Replace strategy) to write the "absolute.path" value or date portion of the full path to the content of the FlowFile, then use DetectDuplicate processor to purge all duplicates before your ExecuteStreamCommand processor. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-06-2024

@jonay__reyes or @DeepakDonde The issue you are encountering if caused by a code change so the the invokeHTTP would encode URLs automatically. This issue is triggered if your URL is already encoded. The URL encoding change will convert all '%' to '%25'. Workarounds/solutions: Remove your url encoding and allow processor to do that encoding. If your URL is not URL encoded already and happens to contain '%' in the URL you can do the following: If using Apache NiFi 1.25 verions: Upgrade to NIFi 1.26 which contains fix NIFI-12842. (now released) Try adding Apache NiFi 1.26 version of the NiFi standard nar to your 1.25 install. Downgrade to Apache NiFi 1.24 If using Apache NiFi 2.0.0-M2 Wait for upcoming release of NiFi 2.0.0-M3 and upgrade, which will contain fix. Downgrade to Apache NiFi 2.0.0-M1 Try adding Apache NiFi 2.0.0-M1 Standard nar to your 2.0.0-M2 install and switch to using older 2.0.0-M1 invokeHTTP processor. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-06-2024

@jonay__reyes Apache NiFi 1.x and NiFi 2.x are major release versions, I would not expect that you could mix component versioned nars successfully between those versions. My suggestion in the above post was you could try: 1. If using Apache NiFi 1.25, you could add the 1.24 nar. 2. If using Apache NiFi 2.0.0-M2, you could add the Apache NiFi 2.0.0.-M1 nar. The NiFi standard nar contains so many core libraries and components, I can't guarantee it will load successfully, but have done is successfully in the older releases of Apache NIFi 1.x versions. Never tried with Apache NiFi 2.x major milestone (M1, M2, ...) release versions. With above being said, Nothing you shared above appears to be an ERROR. So I am no clear in exactly what you mean when you say "doesn't seem to work". Did not startup? Could the UI be accessed? When adding a new component do you see multiple versions for the components included in the standard nar? When you load in to duplicate nars of different versions, NiFi does not change anything about the components loaded on the canvas. What you should see when dragging and adding new components to the canvas is multiple version options for the same component class to choose from. If you are still using the 2.0.0-M2 version of invokeHTTP, it is still going to have issues. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-03-2024

@manishg The elected cluster coordinator (elected by Zookeeper) is responsible for receiving and processing heartbeats from other nodes in the cluster. It handles the connecting, reconnecting, and manual disconnecting of NiFi nodes. The Cluster coordinator is also responsible for replicating user request to all nodes in the cluster and get confirmation from those nodes that the request was completed successfully. Assume a 3 node cluster with following: node1 - elected cluster coordinator node2 - elected primary node node3 Role of Cluster Coordinator: A user can access the NiFi cluster via any of the 3 node's URL. So lets say a user logs in node3's UI. When that user interacts with node 3 UI that request is proxied to the currently elected cluster coordinator node that in turn replicates the request all 3 nodes (example: add a processor, configure a processor, empty a queue, etc...). If one of the nodes were to fail to complete the request, that node would get disconnected. I may attempt to auto-reconnect later (In newer version of NiFi a connecting node can inherit the clusters flow and replace it local flow only if doing so would not result in dataloss. Role of the Primary Node: The elected primary node is responsible for scheduling the execution of any NiF component processor on the canvas that is configured for primary node only. This is configured in a processor's configuration "scheduling" tab: Primary node scheduled processors with display a "P" in the upper left corner as seen above. NOTE: ONLY processors with no inbound connections should ever be set to "primary node" execution. Doing so on processor with inbound connection can lead to FlowFiles becoming stuck in those connection when the elected primary node changes. Not all protocols are "cluster" friendly, so primary node execution helps dataflow designers work around that limitation while still benefiting from having a multi-node cluster. NiFi has numerous "List<XYZ>" and "Fetch<XYZ>" type processor typically used to handle non cluster friendly protocols. I'll use ListFile and FetchFile as an example. Let say our 3 node cluster as the sane network directory mounted to every node. If I was to add the ListFile processor and leave it configured with "all nodes" execution and configure it to list files on that shared mount. All three nodes in the NiFi cluster would produce FlowFiles for all the files listed (so you have files in triplicate). Now if i were to configure my ListFile with "primary node" execution, the listFile would only get scheduled to execute on the currently elected primary node (these processor also record cluster state in ZK so that if a elected primary node changes it doe snot result in a re-listing of the same files again). To prevent overloading the primary node, the list based processors do not retrieve the source content. It only creates a 0 byte FlowFile with attributes/metadata about the source file. So the List based processor would then be connected downstream to its corresponding FetchFile processor. The FetchFile for example would the use the metadata from the 0 byte FlowFile to fetch the content and add it to the FlowFile. On the connection between ListFile and FetchFile you would configure cluster load balancing. Here you can see I selected basic round robin. You'll notice a connection with load balancing configured will also render a bit different: What happen on this connection is that all the 0 bytes FlowFiles will be redistributed in round robin style to all connected nodes. Then on each node the FetchFile will get each nodes subset of FlowFiles content. This reduce need to transmit content of network between nodes and reduces disk IO on primary node since it is not fetching all the content. If you search the Apache NiFi documentation you will see many list and fetch combination type processors. But any source processor (one with no inbound connection) could be configured for primary node only. But only schedule a source processor as primary node execution if required. Doing so on processors like ConsumeKafka for example that uses cluster friendly protocols would just impact performance. Hope this answers your question only what the difference is between Cluster Coordinator and Primary Node roles in a NiFi cluster. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

MattWho · ‎05-03-2024

@manishg What reason is NiFi giving for the node disconnection? Go to NIFi global menu in upper right corner of UI of node that is still connected to cluster and selected "cluster": From the new UI you will see a list of your nodes. To the left of nodes you will see a small "view details" icon. Click on that for one of the nodes that experienced a disconnection (It might currently be connected). This will open a new UI that will contain node events. Node disconnections and reconnection with reason will be provided here. Probably the most common unexpected disconnect reason is "lack of heartbeat". Within the nifi.properties file you can configure the node heartbeat interval (nifi.cluster.protocol.heartbeat.interval) in the cluster section. The default is 5 secs and same value must be set on all nodes in your cluster. This setting control how often a node will attempt to send a heartbeat to the currently elected cluster coordinator node. The elected cluster coordinator expects to receive at least oe successful heartbeat from a node within 8 times the configured heartbeat interval. so with default 5 sec interval the cluster coordinator needs to receive a heartbeat at least once every 40 seconds or the cluster coordinator will disconnect the node. If reason is lack of heartbeat the node events will also tell you the time of that event. NiFi will auto-reconnect a node back in to the cluster if a successful heartbeat is received after a disconnection due to lack of heartbeat occurred. Things like long JVM Garbage Collection events can result in disconnects. Sometimes resolving the issue is as simple as increasing the heartbeat interval allowing more time before a node gets disconnected (for example setting to 15 secs on all nodes). Long garbage collection pauses can happen with large heap settings, large flows processing lots of FlowFiles. From the same Cluster UI, you can select the "JVM" tab to see basic details about the JVM on each node and see JVM GC details. This help you identify if GC is disproportionate amongst your nodes. CPU saturation on a node can also affect the heartbeat interval. From the same cluster UI you can select the "System" tab which will tell you the number of cores you have per node and the core load average per node. High core load average can result in longer times for any given thread to complete. Often dataflow design can contribute to core load and GC resulting in node disconnections due to lack of heartbeat. Example, one node in your cluster is executing on way more FlowFiles then any of the other nodes. This means that your design is not handling FlowFile distribution well causing one node to do much more work. Also keep in mind that the nifi-app.log will log node events as well and it may help to inspect those logs to see if any other notable logged events happened around that same time. Was the node that got disconnect the currently elected primary node (you could tell by logs in another node reporting it as being elected as primary node just after the previous elected primary node was disconnected. If that pattern is consistent, then your dataflow may heavily use "primary node" only scheduled processors and you are not handling FlowFile load balancing programmatically in your dataflow design(s). Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt

Online	Offline
Last Visited	‎07-31-2026 03:09 AM

Member Since	‎07-30-2019 10:41 AM
Last Visited	‎07-31-2026 03:09 AM
Posts	3,473
Kudos received	1638

Cloudera Community

Re: ListenNetFlow processor does not decode Cisco ...

Re: Can we detect who did a particular operation i...

Re: How to invoke a url in nifi which is protected...

Re: Retry impacts scheduler

Re: 503 error while copying/versioning big process...

Re: Why only flowfile repository disk is getting f...

Re: Manually change connection id apache NiFi

Re: Nifi Rolling Update Strategy in Kubernetes (to...

Re: java.lang.ClassNotFoundException: org.apache.n...

Re: NiFi Rest API via Kerberos token

Re: Apache Nifi passing a folder as an flowfile to...

Re: Invoke Http with url containing %2F

Re: Invoke Http with url containing %2F

Re: Role of primary node

Re: Unstable cluster