About MattWho

MattWho · ‎09-26-2024

@Vikas-Nifi Your dataflow is working as designed. Your ListFile is producing three FlowFiles (one for each file listed). Each of those FlowFiles then triggers the execution of your FetchFile, which you have configured to fetch the content of only one of those files. If you only want to fetch "test_1.txt", you need to either configure ListFile to list only the file "test_1.txt", or add a RouteOnAttribute processor between your ListFile and FetchFile so that you route only the listed FlowFile matching ${filename:equals('test_1.txt')} to FetchFile and auto-terminate the other listed FlowFiles. The first option, listing only the file whose content you want to fetch, is the better option unless there is more to your use case than you have shared. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
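If you go with the first option, ListFile's "File Filter" property takes a regular expression, so it can help to check a candidate value outside NiFi first. Below is a minimal Python sketch of that check. It assumes the filter has to match the whole filename, and "test_2.txt" and "test_3.txt" are hypothetical names for the other two listed files.

```python
import re

# Candidate value for ListFile's "File Filter" property (a regular expression).
# The dot is escaped so it matches a literal "." rather than any character.
FILE_FILTER = r"test_1\.txt"


def would_be_listed(filename: str, file_filter: str = FILE_FILTER) -> bool:
    """Return True if the whole filename matches the filter expression."""
    return re.fullmatch(file_filter, filename) is not None


# "test_2.txt" and "test_3.txt" are hypothetical names for the other two files.
candidates = ["test_1.txt", "test_2.txt", "test_3.txt"]
listed = [name for name in candidates if would_be_listed(name)]
print(listed)
```

With a filter like this, only one FlowFile is produced, so FetchFile is triggered only once.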

MattWho · ‎09-25-2024

@imvn A NiFi FlowFile consists of two parts:

FlowFile content - The content of a FlowFile is written to a content claim within the NiFi content repository. A content claim is immutable.
FlowFile attributes/metadata - Attributes/metadata are written to the FlowFile repository and persist in NiFi heap memory unless forced to swap due to connection thresholds. This metadata includes, among other things, information about where to find the content.

How NiFi processors handle inbound and outbound FlowFiles is largely processor specific. For processors that write output to the content of a FlowFile, this may be handled in two different ways depending on the processor. Some processors have an "original" relationship: the original FlowFile, still referencing the original inbound content claim, is routed there, while a new FlowFile with the same attributes, pointing to the new content output in a different claim, is routed to some other relationship like "success". Other processors do not have an "original" relationship and instead decrement the claimant count on the original content claim and update the existing FlowFile metadata to point to the content created in the new content claim. The ExecuteSQL processor follows the latter process.

So, if I understand correctly, you have a dataflow built like this:

ExecuteSQL (writes content to FlowFile) --> some processor/processors (extract bits from content to use for the delete) --> ExecuteSQL (performs the delete, but the response is written as new content for the FlowFile) --> PutDatabaseRecord (has issues since the original needed FlowFile content is no longer associated with the FlowFile).

Option 1: Can you re-order your processors so you have ExecuteSQL --> PutDatabaseRecord --> extract content --> ExecuteSQL (delete)? This makes sure the original content is persisted long enough to complete the write to the target DB.

Option 2: ExecuteSQL --> ExtractContent --> route the "success" relationship twice (once to ExecuteSQL to perform the delete and a second time to PutDatabaseRecord to write to the DB). Similar to below example: You'll notice that the "matched" relationship has been routed twice. When the same relationship is routed twice, NiFi clones the FlowFile (the original goes to one connection and the clone goes to the other). Both FlowFiles reference the same content claim (which, remember, is immutable). When ExecuteSQL (delete) then executes on one of them, it does not impact the content of the other one that is going to PutDatabaseRecord.

If I am not clear on your use case, let me know. I was a bit confused by the "delete data from destination" part. Destination = the DB destination configured in PutDatabaseRecord? It is not clear why you would be deleting something there that has not yet been written.

So if there is a dependency that the ExecuteSQL (delete) completes before the PutDatabaseRecord executes, there is a third option that utilizes the "FlowFile concurrency" and "outbound policy" settings on a Process Group (PG). The dataflow would look something like this: Inside the Process Group configured with "FlowFile concurrency = Single FlowFile Per Node" and "outbound policy = Batch Output", you would have this flow: So your dataflow only allows one FlowFile to enter the PG at a time. Within the PG, the FlowFile is cloned, with one FlowFile routing to the output port and the other to the ExecuteSQL (delete). The FlowFile queued to exit the PG will not be allowed to exit until the FlowFile being processed by the ExecuteSQL (delete) is auto-terminated or routed to some other output port. This makes sure that the PutDatabaseRecord processor does not process the FlowFile with the original content claim until your ExecuteSQL (delete) has executed.

Thank you, Matt
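To make the cloning behavior in option 2 more concrete, here is a toy Python model. It is not NiFi's implementation, only a sketch of the idea that a clone shares the same immutable content claim and that a processor like ExecuteSQL repoints its FlowFile at a new claim instead of modifying the old one. All class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentClaim:
    """Hypothetical stand-in for an immutable claim in the content repository."""
    data: bytes


@dataclass
class FlowFile:
    """Hypothetical stand-in for FlowFile metadata: attributes plus a claim pointer."""
    attributes: dict
    claim: ContentClaim


def clone(flowfile: FlowFile) -> FlowFile:
    # Routing one relationship to two connections: attributes are copied,
    # but both FlowFiles keep pointing at the same content claim.
    return FlowFile(dict(flowfile.attributes), flowfile.claim)


def execute_sql_delete(flowfile: FlowFile) -> FlowFile:
    # Toy version of a processor without an "original" relationship:
    # the response goes into a new claim and the FlowFile is repointed to it.
    flowfile.claim = ContentClaim(b"delete response")
    return flowfile


to_put_database_record = FlowFile(
    {"filename": "rows"}, ContentClaim(b"rows fetched by the first ExecuteSQL")
)
to_delete = clone(to_put_database_record)
execute_sql_delete(to_delete)

print(to_put_database_record.claim.data)  # still the original rows
print(to_delete.claim.data)               # the delete response
```

The FlowFile headed for PutDatabaseRecord still sees the original rows because the delete never touched the shared claim.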

MattWho · ‎09-25-2024

@Twelve @aLiang The crypto.randomUUID() issue when running NiFi over HTTP or on localhost has been resolved via https://issues.apache.org/jira/browse/NIFI-13680. The fix will be part of the next release after NiFi-2.0.0-M4. Thanks, Matt

MattWho · ‎09-24-2024

@Ashi Potential option: What record reader and record writer are you using in your UpdateRecord processor? What schema are you using for your records? In order to add a new field, that new field needs to be defined in the record's schema. In your case, the schema must contain the field "devicename". Prior to UpdateRecord, you could perhaps use an ExtractText processor to extract the "rc01;rik2jc" value from the meHostName field to a FlowFile attribute (named "flowfile.attribute" in this example). You will then be able to use UpdateRecord to apply a value to that new record field in the record writer:

Property: /devicename
Value: ${flowfile.attribute:substringAfter(';')}

Thank you, Matt
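For anyone unsure what substringAfter(';') would produce here, the following Python sketch mimics it. The helper is my own approximation of the Expression Language function as I understand its documentation (text after the first occurrence of the delimiter, or the unchanged value if the delimiter is absent), so treat the edge-case behavior as an assumption.

```python
def substring_after(subject: str, delimiter: str) -> str:
    """Approximate NiFi EL substringAfter: text after the first delimiter.

    Assumption: if the delimiter is not present, the subject is returned unchanged.
    """
    _head, found, tail = subject.partition(delimiter)
    return tail if found else subject


# Value extracted from meHostName into the FlowFile attribute by ExtractText.
attribute_value = "rc01;rik2jc"

# What UpdateRecord would then write into the new /devicename field.
record = {"devicename": substring_after(attribute_value, ";")}
print(record)
```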

MattWho · ‎09-18-2024

@Crags You cannot have both of your NiFi-Registries linked to the same Git repository. NiFi-Registry only pushes to the Git repository. The only time NiFi-Registry ever reads from the Git repository is on startup. So if you used two NiFi-Registries and were committing changes from both, you could cause issues with what is getting committed to your Git repo.

What is more common is to have a single NiFi-Registry that is utilized by multiple NiFi deployments. QA NiFi builds some flow, and when that flow is ready for production, it is committed to the NiFi-Registry. That flow can then be imported from that single NiFi-Registry to the canvas of your Prod NiFi. Now both NiFi instances are tracking the same flow in the same registry. You then start making local changes to that same version-controlled Process Group (PG) in your QA NiFi. The PG will indicate you have local changes. You then have a couple of choices for how you want to use your shared NiFi-Registry:

1. Wait until you have completed making all your changes and testing in QA before committing the next version to the shared registry. At that time, your Prod NiFi PG will indicate a newer version is now available in the shared NiFi-Registry. You can then update your Prod to that new version.
2. Incrementally commit updated versions of the PG to the shared registry. Your Prod will show a new version is available, so you will want to create a process for deciding which versions are Prod ready, to control when a new version is actually changed in your Prod.

About the UUID linkage... Your NiFi can have one or more defined registry clients, and each of those registry clients gets an assigned UUID on the NiFi instance (it will not be the same UUID on every NiFi that sets up the same registry client). NiFi stores everything on the canvas locally in the flow.json.gz file so it can be reloaded into NiFi heap on startup. When you start version control on a PG, the flow (which gets a UUID) is added to a NiFi-Registry bucket (which has a UUID). Locally on the NiFi, within the flow.json.gz, there is now a reference to a specific NiFi-Registry client (by its UUID), a specific bucket (by its UUID), and a specific flow (by its UUID).

Now consider the scenario of a shared NiFi-Registry: the registry client on the other NiFi will have a different UUID even though it connects to the same shared NiFi-Registry. So using the registry client, you import that flow from NiFi-Registry to the NiFi canvas. Every component created from the imported flow will get assigned UUIDs (these will not match the UUIDs assigned on the other NiFi). Those differences in UUIDs are not tracked as changes. This is why, if you stop version control, you can't start version control again and connect it back to an existing flow stored in NiFi-Registry. You also can't delete the registry client and re-create it, as it too would get a different UUID (NiFi blocks removing a registry client if any PGs are currently using it for version control for this reason).

---------

Another option is to have a separate NiFi-Registry for each environment. When you are ready to move a flow from NiFi-Registry 1 (QA) to NiFi-Registry 2 (Prod), go into your QA NiFi-Registry, locate the flow, select "export version" from the "actions" menu, and select the version you want to export. You can then go to your Prod NiFi-Registry and "import new flow". Once imported, you can go to your Prod NiFi and load that flow onto the canvas. Later, when you are ready in QA with a new version to push to Prod, you can again export the Prod-ready version. On the Prod NiFi-Registry, you can select the existing flow and from the "actions" menu select "import new version". This will allow you to add this flow as the next version in Prod. After doing so, the version-controlled PG(s) on your Prod NiFi tracking against that flow will report a new version is available. This second option allows you to have better control over what changes make it to your Prod deployment. You could also script REST API calls to automate these steps if you wanted.

------

Thank you, Matt
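As a rough idea of what scripting those export/import steps could look like, here is a Python sketch using only the standard library. The endpoint paths are my assumption of the NiFi-Registry REST API's export and import version endpoints, so verify them against the REST API documentation for your release. The sketch also assumes unsecured registries (no TLS client certificate or token handling), and all base URLs and IDs are placeholders you would supply.

```python
import json
import urllib.request


def export_url(base: str, bucket_id: str, flow_id: str, version: int) -> str:
    # Assumed path of the "export version" endpoint; check your registry's API docs.
    return f"{base.rstrip('/')}/buckets/{bucket_id}/flows/{flow_id}/versions/{version}/export"


def import_url(base: str, bucket_id: str, flow_id: str) -> str:
    # Assumed path of the "import new version" endpoint; check your registry's API docs.
    return f"{base.rstrip('/')}/buckets/{bucket_id}/flows/{flow_id}/versions/import"


def promote_version(qa_base, qa_bucket, qa_flow, version, prod_base, prod_bucket, prod_flow):
    """Export one flow version from the QA registry and import it into the Prod registry.

    The Prod flow must already exist (created once via "import new flow").
    """
    with urllib.request.urlopen(export_url(qa_base, qa_bucket, qa_flow, version)) as resp:
        snapshot = resp.read()

    request = urllib.request.Request(
        import_url(prod_base, prod_bucket, prod_flow),
        data=snapshot,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())


# Example with placeholder values (not executed here):
# promote_version("http://qa-registry:18080/nifi-registry-api", "<qa-bucket-uuid>",
#                 "<qa-flow-uuid>", 3,
#                 "http://prod-registry:18080/nifi-registry-api", "<prod-bucket-uuid>",
#                 "<prod-flow-uuid>")
```

Note that the bucket and flow UUIDs differ between the two registries, which is why both sets are passed in separately.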

MattWho · ‎09-18-2024

@abhinav_joshi You should have been able to right-click on the "Ghost" processor and select the "change version" option. This would have presented you with all the available versions in your NiFi installation. Simply selecting the one you want to use would resolve your issue. While this works great when you only have a few ghost processors created from your dataflow, it can be annoying to follow these steps for many components.

The question here is why your deployment of NiFi has multiple versions of the same NiFi nar installed. NiFi would not ship this way, so that means additional nar(s) of different versions were added to your NiFi lib directory or to the NiFi extensions directory. You should remove these duplicate nars to avoid running into this issue again. When only one version exists, a dataflow imported/loaded with older versions will auto-switch to the version used in the NiFi in which the dataflow was loaded (this may mean an older or newer version of the nar classes).

Thank you, Matt
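To spot duplicate nars quickly, a small script can group the nar files by name and report any that appear with more than one version. This is a minimal Python sketch. It assumes nar files are named like "<something>-nar-<version>.nar", which holds for the nars I have seen shipped with NiFi but may not hold for custom ones, and the directory paths in the usage comment are placeholders.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention: "<name ending in -nar>-<version>.nar"
NAR_PATTERN = re.compile(r"^(?P<name>.+-nar)-(?P<version>.+)\.nar$")


def find_duplicate_nars(*directories):
    """Return {nar name: sorted versions} for nars present in more than one version."""
    versions = defaultdict(set)
    for directory in directories:
        for path in Path(directory).glob("*.nar"):
            match = NAR_PATTERN.match(path.name)
            if match:
                versions[match["name"]].add(match["version"])
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Example with placeholder paths (not executed here):
# print(find_duplicate_nars("/opt/nifi/lib", "/opt/nifi/extensions"))
```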

MattWho · ‎09-17-2024

@rizalt There is very little detail in your post. NiFi will run as whatever user is used to start it unless the "run.as" property is set in the NiFi bootstrap.conf file. If the user trying to execute the "./nifi.sh start" command is not the root user and you set the "run.as" property to "root", that user would need sudo permissions in Linux to start NiFi as the root user. The "run.as" property is ignored on Windows, where the service will always be owned by the user that starts it.

NOTE: Starting the service as a different user than it was previously started as will not trigger a change in file ownership in the NiFi directories. You would need to update file ownership manually before starting as a different user (this includes all of NiFi's repositories). While the "root" user has access to all files regardless of owner, issues will exist if a non-root user launches the app and files are owned by another user, including root.

Thank you, Matt
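Before switching the user that starts NiFi, it may help to find which files are not owned by that user. Below is a minimal Python sketch for Linux that walks a directory tree and reports paths owned by a different UID. The path and username in the usage comment are placeholders, and the sketch only reports mismatches; it does not change ownership.

```python
import os


def paths_not_owned_by(root: str, uid: int) -> list:
    """Return the paths at or beneath root whose owner UID differs from uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the link itself instead of following symlinks.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched


# Example with placeholder values (not executed here):
# import pwd
# nifi_uid = pwd.getpwnam("nifi").pw_uid
# for path in paths_not_owned_by("/opt/nifi", nifi_uid):
#     print(path)
```

Remember to run it against every repository location as well, since those are often configured outside the NiFi install directory.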

MattWho · ‎09-17-2024

@Chetan_mn I loaded up the latest NiFi-2.0.0-M4 (milestone 4 release) and loaded the flow definition I used in my NiFi 1.23 version. All seems to work fine: sending headers with mixed case, I see the correct attributes created with those mixed-case headers on the FlowFile generated by the HandleHTTPRequest processor. InvokeHTTP: You'll see two custom headers (displayName and outerID) added above as dynamic properties. HandleHTTPRequest processor: When I "List Queue" on the connection containing the "Success" relationship from the HandleHTTPRequest processor and "view details" on the queued FlowFile, the FlowFile attributes look correct.

Are you saying you see something different? Try using NiFi 2.0.0-M4 (latest) to see if your experience is the same. At what point in your dataflow are you validating the FlowFile attributes? Is your custom script maybe handling them wrong? I am not seeing an issue in the HandleHTTPRequest processor with regard to HTTP header handling.

Thank you, Matt

MattWho · ‎09-12-2024

@Chetan_mn While I do not currently have an install of the Technical Preview NiFi 2.0 milestone 2 release, I used a NiFi 1.18 to build a simple dataflow using HandleHTTPRequest. I then set up an InvokeHTTP processor to send a message to that API endpoint using the PATCH HTTP method. I also included a couple of custom headers:

displayName=Display1
outerID=123456aBcD

When I inspected the FlowFile received from HandleHTTPRequest, the FlowFile attributes created from the headers looked correct. I suggest you try using an InvokeHTTP processor to test your HandleHTTPRequest processor in Apache NiFi 2.0.0-M2 to make sure your issue is not the result of some external manipulation of the headers before they are received by the HandleHTTPRequest processor. The headers are just created as FlowFile attribute property names. I am curious how the all-lowercase versions of these property names are impacting your dataflow. Are the values of your headers being modified?

Thank you, Matt
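If you want a second test client besides InvokeHTTP, a short Python sketch like the one below can send the same PATCH request with the two custom headers. It uses http.client from the standard library, which writes header names as given. Note that Python's urllib.request would capitalize header names, which could hide exactly the kind of issue being investigated. The host, port, and path in the usage comment are placeholders for wherever your HandleHTTPRequest is listening.

```python
import http.client


def send_patch(host: str, port: int, path: str, headers: dict, body: bytes = b"") -> int:
    """Send a PATCH request with the given headers and return the HTTP status code."""
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # http.client sends each header name exactly as spelled in the dict.
        connection.request("PATCH", path, body=body, headers=headers)
        response = connection.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        connection.close()


# Example with placeholder host/port/path (not executed here):
# status = send_patch("nifi-host", 8081, "/",
#                     {"displayName": "Display1", "outerID": "123456aBcD"}, b"{}")
# print(status)
```

If the attributes look right with both this client and InvokeHTTP, the header manipulation is most likely happening somewhere between your real client and NiFi.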

MattWho · ‎09-12-2024

@Techie123 When you say "run it manually", does that mean you simply start the processor and allow it to run continuously, or are you right-clicking on the processor and selecting "run once"? How do you have the "scheduling" configured for the processor? I assume you are trying to use cron? Thank you, Matt

Last Visited	‎05-18-2026 10:55 AM

Member Since	‎07-30-2019 10:41 AM
Posts	3,470
Kudos received	1637

Cloudera Community

Re: How to invoke a url in nifi which is protected...

Re: Retry impacts scheduler

Re: 503 error while copying/versioning big process...

Re: FetchSMB not fetching all files

Re: Nifi: How to revoke the import and export Temp...

Re: NiFi ListFile/ListSFTP + FetchFile/FetchSFTP i...

Re: Apache nifi simple flow

Re: Issue with NiFi 2.0.0-M4: crypto.randomUUID is...

Re: Fetch a value from diffrent field in NIFI

Re: Migrating NiFi Registry from one server to ano...

Re: Update Attribute processor Not working after u...

Re: NIFI How to run nifi as user root

Re: Header request attributes not being passed wit...

Re: Header request attributes not being passed wit...

Re: ConsumeImap Processor is not working properly ...