Merge data based on the record count and field values using NiFi

Explorer

I am fetching employee data from a table, and each employee has a maximum of 7 records. I am splitting the data based on the ID and week number columns. I need to merge the data so that each flow file contains 100 records and all the records of a given employee end up in the same flow file.

For example:

My flow file already has 98 records and the next employee has 7 records, so that employee's data should not be part of that flow file. It should go into the next flow file, and so on. The order of the FFs while merging is also important.

How can I do that? I am not familiar with all of NiFi's processors.

1 ACCEPTED SOLUTION

Super Mentor

@Techie123 

Can you provide more detail around your requirement that "the FFs order is also important"?

My initial thought here would be a two-phase merge. In the first merge, you use a correlation FlowFile attribute that you create on each FlowFile from the employee ID extracted from the record, setting the min number of entries to 7 and the max to 10. Then you take these per-employee merged FlowFiles and merge them together into larger FlowFiles using MergeRecord. The question is whether 100 records per FlowFile is a hard limit, because MergeRecord does not enforce one.
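To illustrate what the first phase is meant to achieve, here is a minimal Python sketch. It is not NiFi code or configuration; the field name `employee_id` and the helper `group_by_employee` are hypothetical stand-ins for the correlation attribute and the first merge step.

```python
from collections import OrderedDict

def group_by_employee(records, key="employee_id"):
    """Collect each employee's records into one group, in first-seen order.

    'employee_id' is an assumed field name standing in for the
    correlation attribute described above.
    """
    groups = OrderedDict()
    for record in records:
        groups.setdefault(record[key], []).append(record)
    return list(groups.values())

records = [
    {"employee_id": "E1", "week": 1},
    {"employee_id": "E2", "week": 1},
    {"employee_id": "E1", "week": 2},
]
groups = group_by_employee(records)
# groups[0] holds both E1 records; groups[1] holds the single E2 record.
```

Each group then travels as one unit into the second merge, so an employee's records are never split across output FlowFiles.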

The MergeRecord processor's max number of records is a soft limit. Let's assume we set it to 100. Say one of your merged employee FlowFiles arrives at MergeRecord with 7 records for that employee ID, while the bin already has 98 records in it. Since the bin min has not been met yet, this merged FlowFile still gets added, which results in a merged FlowFile with 105 records. If you must keep it at or below 100 records per FlowFile, set the max records to 94. If the bin holds fewer than 94 records after a set of merged employee records is added, another set is added, and since you stated that each set can contain up to 7 records, the bin tops out at 93 + 7 = 100 records in that single merged FlowFile.
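The arithmetic can be checked with a small Python simulation. This is only a simplified model of the soft-limit behavior as described above, not NiFi's actual binning logic; `pack_with_soft_limit` is a hypothetical helper, and each number in `group_sizes` is the record count of one merged employee FlowFile.

```python
def pack_with_soft_limit(group_sizes, soft_max):
    """Model of a soft limit: a whole group is always added to the open bin,
    and the bin closes only once its record count reaches soft_max."""
    bins, current = [], 0
    for size in group_sizes:
        current += size           # groups are never split
        if current >= soft_max:   # limit reached or passed: close the bin
            bins.append(current)
            current = 0
    if current:
        bins.append(current)      # leftover records form a final, smaller bin
    return bins

fifteen_employees = [7] * 15
over = pack_with_soft_limit(fifteen_employees, 100)   # [105]
under = pack_with_soft_limit(fifteen_employees, 94)   # [98, 7]
```

With a limit of 100, the bin sits at 98 after fourteen employees and the fifteenth pushes it to 105. With a limit of 94, the bin closes at 98 and the fifteenth employee starts the next one. In this model, any mix of groups of at most 7 records stays at or below 100 per bin when the limit is 94.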


If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,

Matt


3 REPLIES

Explorer

Hi Team, can anyone help?

Explorer

Hi @MattWho, I did a lot of investigation on this but was not sure MergeRecord could do that. Thanks a lot for your help.