Apache NiFi: How to ensure record order between processors

Explorer

Dear Community!

I have a simple flow in Apache NiFi 1.20:

1. QueryDatabaseTableRecord -> 2. SplitAvro -> 3. ConvertAvroToJson -> 4. LogAttribute -> 5. PutDatabaseRecord

Flow description:

1. Read records from a table, ordered by the "updated" column (a datetime with milliseconds).

2. Put each record into its own FlowFile.

3. Convert the record to JSON.

4. Log the JSON to the log file.

5. Insert the record into a different database.

Each processor has only 1 concurrent task configured on its settings page!

Everything works fine, except that sometimes the records arrive out of order with respect to the "updated" column.

Could you please help me find the cause? How can I ensure that the records stay in the right order?

1 ACCEPTED SOLUTION

Community Manager

Hi @scheeri, I'm not an expert, but maybe this well-written reply by @MattWho will get you closer: https://community.cloudera.com/t5/Support-Questions/Ensuring-of-order-of-flow-files-in-Nifi/m-p/3143...


Cy Jervis, Manager, Community Program

3 REPLIES

Explorer

Thank you for your answer!

I have not tested it yet, but the problem was probably with prioritization, because it has to be set explicitly by hand. That is strange, because I think a queue should be FIFO by default.

Super Mentor

@scheeri

FIFO is the default priority of a connection queue; however, if you have more than one concurrent task on a processor, multiple FlowFiles from the source connection can be processed concurrently. This means that one of those concurrent executions may complete before the other, so the FlowFiles are no longer in the same order in follow-on connections.
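
To make the effect concrete, here is a minimal Python sketch of the idea. It is not NiFi code, just a toy model: the function name `simulate`, the `durations` list, and the `concurrent_tasks` parameter are all made up for illustration, and the processing times are invented numbers. FlowFiles are handed out strictly in FIFO order, but they reach the next connection in the order they finish.

```python
import heapq


def simulate(durations, concurrent_tasks):
    """Return the order in which FlowFiles leave a (hypothetical) processor.

    durations[i] is the assumed processing time of the i-th FlowFile in the
    FIFO source queue; concurrent_tasks is the number of parallel workers.
    """
    # Each heap entry is (time_the_worker_becomes_free, worker_id).
    workers = [(0.0, w) for w in range(concurrent_tasks)]
    heapq.heapify(workers)

    completions = []
    for index, duration in enumerate(durations):
        # The next FlowFile in FIFO order goes to the earliest-free worker.
        free_at, worker_id = heapq.heappop(workers)
        done_at = free_at + duration
        completions.append((done_at, index))
        heapq.heappush(workers, (done_at, worker_id))

    # FlowFiles arrive in the downstream connection in completion order.
    return [index for _, index in sorted(completions)]


# The first FlowFile is slow, the rest are fast (invented numbers).
durations = [5.0, 1.0, 1.0, 1.0]
print(simulate(durations, concurrent_tasks=1))  # [0, 1, 2, 3]
print(simulate(durations, concurrent_tasks=2))  # [1, 2, 3, 0]
```

With one concurrent task the output order matches the input order; with two, the slow first FlowFile is overtaken by the faster ones behind it, even though the queue itself was FIFO.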

 

Thank you,

Matt