Created 05-21-2021 01:25 PM
Hello,
I'd like to implement a lock mechanism in a data flow that prevents a flow file from progressing. Similar in a way to the wait processor, holding a flowfile in a queue while another flowfile with a corresponding signal attribute passes through a 'gate' further down the flow, similar to a notify.
In my specific case I load a number of files received together in to target tables.
I have a single flow which does the following:
A number of these files, share the same staging and target tables and on occasions I have a race condition where the the staging table is truncated by a second flowfile before the stored procedure has had time to run for the first.
There are never more than two flowfiles destined for the same target table currently, so I'm able to mitigate the issue by routing one set to a retry processor and penalising the flowfiles for an arbitrary amount of time to give the first set time to complete their load. This isn't particularly elegant and it doesn't feel like it would scale very well. Also, my situation is likely to change, and I could have more than two common files in the future.
To my mind, if I could have a wait processor check a cache when a flowfile passes through it, and if the cache is empty it adds the target table name value to the cache as a signal. After the third step above is complete for this flowfile, it passes through a notify type processor which removes the signal attribute from the cache releasing any flowfiles with a matching attribute from the queue on the earlier wait.
I don't think I can get the wait and notify to work in this manner, but I could be wrong? If so, is there another way to achieve this type of functionality>
I'm aware of the Processor Group FlowFile Concurrency and Outbound Policy settings, but these would be too restrictive only processing a single flow file at a time. I would like to only process as many flowfiles concurrently as the DB will take, only holdiing the flowfiles which might cause the race condition.
I hope that's clear. Thanks in advance.
Created 06-02-2021 09:30 AM
@_mark_
As NiFi is an open source product, I recommend joining the community (if you have not already) and opening an Apache NiFi Jira [1] with you proposed enhancements/new features for Apache NiFi to get feedback from the community at large.
If you feel you are not there yet in proposing a new feature/enhancement, try engaging via the users mailing list [2]
[1] https://issues.apache.org/jira/browse/NIFI
[2] https://nifi.apache.org/mailing_lists.html
Thanks,
Matt
Created 05-30-2021 09:56 AM
I hope a bump to the top of the recent post list isn't breaking any rules, but I'm hoping someone might be able to offer an opinion, so a bump it is!
Created 06-02-2021 09:30 AM
@_mark_
As NiFi is an open source product, I recommend joining the community (if you have not already) and opening an Apache NiFi Jira [1] with you proposed enhancements/new features for Apache NiFi to get feedback from the community at large.
If you feel you are not there yet in proposing a new feature/enhancement, try engaging via the users mailing list [2]
[1] https://issues.apache.org/jira/browse/NIFI
[2] https://nifi.apache.org/mailing_lists.html
Thanks,
Matt
Created 06-03-2021 06:32 AM
Hi @MattWho ,
Thanks for the response.
I'll happily create a feature request, though I wasn't sure if I was missing something obvious that would meet my objective/requirements.
Thank you for clarifying, I'll go take a look at the links you provided (admittedly I had missed the mailing lists, oops).
Kind regards
Mark