Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Attribute based flow lock - Similar to Wait Notify, but not quite.

avatar
Explorer

Hello,

 

I'd like to implement a lock mechanism in a data flow that prevents a flow file from progressing.  Similar in a way to  the wait processor, holding a flowfile in a queue while another flowfile with a corresponding signal attribute passes through a 'gate' further down the flow, similar to a notify. 

In my specific case I load a number of files received together in to target tables.

 

I have a single flow which does the following:

 

  • PutSql - issues a truncate table command that empties the target staging table.  The table name is taken from an attribute.
  • PutDatabaseRecord - Inserts the records into the staging table based on the same attribute.
  • PutSql - executes a stored procedure passing the relevant tables name as parameter from the attribute.  This SP merges data from the staging table into a 'main' table.

A number of these files, share the same staging and target tables and on occasions I have a race condition where the the staging table is truncated by a second flowfile before the stored procedure has had time to run for the first.

 

There are never more than two flowfiles destined for the same target table currently, so I'm able to mitigate the issue by routing one set to a retry processor and penalising the flowfiles for an arbitrary amount of time to give the first set time to complete their load.  This isn't particularly elegant and it doesn't feel like it would scale very well.  Also, my situation is likely to change, and I could have more than two common files in the future.

 

To my mind, if I could have a wait processor check a cache when a flowfile passes through it, and if the cache is empty it adds the target table name value to the cache as a signal.  After the third step above is complete for this flowfile, it passes through a notify type processor which removes the signal attribute from the cache releasing any flowfiles with a matching attribute from the queue on the earlier wait.

 

I don't think I can get the wait and notify to work in this manner, but I could be wrong?  If so, is there another way to achieve this type of functionality>

 

I'm aware of the Processor Group FlowFile Concurrency and Outbound Policy settings, but these would be too restrictive only processing a single flow file at a time.  I would like to only process as many flowfiles concurrently as the DB will take, only holdiing the flowfiles which might cause the race condition. 

 

I hope that's clear.  Thanks in advance.

1 ACCEPTED SOLUTION

avatar
Super Mentor

@_mark_ 

As NiFi is an open source product, I recommend joining the community (if you have not already) and opening an Apache NiFi Jira [1] with you proposed enhancements/new features for Apache NiFi to get feedback from the community at large.

If you feel you are not there yet in proposing a new feature/enhancement, try engaging via the users mailing list [2]

[1] https://issues.apache.org/jira/browse/NIFI
[2] https://nifi.apache.org/mailing_lists.html

 

Thanks,
Matt

View solution in original post

3 REPLIES 3

avatar
Explorer

I hope a bump to the top of the recent post list isn't breaking any rules, but I'm hoping someone might be able to offer an opinion, so a bump it is!

avatar
Super Mentor

@_mark_ 

As NiFi is an open source product, I recommend joining the community (if you have not already) and opening an Apache NiFi Jira [1] with you proposed enhancements/new features for Apache NiFi to get feedback from the community at large.

If you feel you are not there yet in proposing a new feature/enhancement, try engaging via the users mailing list [2]

[1] https://issues.apache.org/jira/browse/NIFI
[2] https://nifi.apache.org/mailing_lists.html

 

Thanks,
Matt

avatar
Explorer

Hi @MattWho ,

 

Thanks for the response.

 

I'll happily create a feature request, though I wasn't sure if I was missing something obvious that would meet my objective/requirements.

 

Thank you for clarifying, I'll go take a look at the links you provided (admittedly I had missed the mailing lists, oops).

 

Kind regards

 

Mark