Created 12-22-2024 10:37 PM
Hi Team,
I am using NiFi 2.0 M4 on a single node and am trying to merge FlowFiles using the Defragment merge strategy. Instead of merging all of the fragments, it writes the first fragment to "merged" and routes the remaining fragments to "failure". Please suggest if I am missing something.
NIFI Configuration:
@MattWho, please help with a suggestion.
Thanks,
Created 12-23-2024 06:25 AM
@Krish98
Not enough details here to determine what is going on.
Do all of the FlowFiles being merged contain the FlowFile attributes required for this to succeed?
| Name | Description |
| --- | --- |
| fragment.identifier | Applicable only if the <Merge Strategy> property is set to Defragment. All FlowFiles with the same value for this attribute will be bundled together. |
| fragment.index | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates the order in which the fragments should be assembled. This attribute must be present on all FlowFiles when using the Defragment Merge Strategy and must be a unique (i.e., unique across all FlowFiles that have the same value for the "fragment.identifier" attribute) integer between 0 and the value of the fragment.count attribute. If two or more FlowFiles have the same value for the "fragment.identifier" attribute and the same value for the "fragment.index" attribute, the first FlowFile processed will be accepted and subsequent FlowFiles will not be accepted into the Bin. |
| fragment.count | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute indicates how many FlowFiles should be expected in the given bundle. At least one FlowFile must have this attribute in the bundle. If multiple FlowFiles contain the "fragment.count" attribute in a given bundle, all must have the same value. |
| segment.original.filename | Applicable only if the <Merge Strategy> property is set to Defragment. This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute will be used for the filename of the completed merged FlowFile. |
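As a way to sanity-check the attributes outside of NiFi, here is a minimal Python sketch of the rules in the table above. It is not NiFi code and not how MergeContent is implemented. The function name `check_fragments` and the input shape (one dict of attribute name to string value per FlowFile, for example copied from the queue listing) are assumptions made for illustration. The "expected N fragments" check is my own reading of what fragment.count means.

```python
from collections import defaultdict


def check_fragments(flowfiles):
    """Return {fragment.identifier: [problem, ...]} for bundles that break a rule.

    `flowfiles` is a list of dicts mapping attribute names to string values
    (hypothetical input shape, not a NiFi API).
    """
    # Group FlowFiles by fragment.identifier, as the Defragment strategy bundles them.
    bundles = defaultdict(list)
    for attrs in flowfiles:
        bundles[attrs.get("fragment.identifier")].append(attrs)

    problems = {}
    for ident, members in bundles.items():
        issues = []
        if ident is None:
            issues.append("fragment.identifier is missing")

        # fragment.count: at least one FlowFile must carry it, and all copies must agree.
        counts = {a["fragment.count"] for a in members if "fragment.count" in a}
        count = None
        if not counts:
            issues.append("no FlowFile carries fragment.count")
        elif len(counts) > 1:
            issues.append("conflicting fragment.count values: %s" % sorted(counts))
        else:
            value = next(iter(counts))
            if value.isdigit():
                count = int(value)
            else:
                issues.append("fragment.count is not an integer: %r" % value)

        # fragment.index: present on every FlowFile, integer, unique within the bundle.
        seen = set()
        for a in members:
            idx = a.get("fragment.index")
            if idx is None or not idx.isdigit():
                issues.append("missing or non-integer fragment.index: %r" % idx)
            elif int(idx) in seen:
                issues.append("duplicate fragment.index: %s" % idx)
            else:
                seen.add(int(idx))
                if count is not None and int(idx) > count:
                    issues.append("fragment.index %s exceeds fragment.count %d" % (idx, count))

        # Assumption: a complete bundle has exactly fragment.count members.
        if count is not None and len(members) != count:
            issues.append("expected %d fragments, saw %d" % (count, len(members)))

        # segment.original.filename: present on every FlowFile, same value throughout.
        names = {a.get("segment.original.filename") for a in members}
        if None in names:
            issues.append("segment.original.filename missing on at least one FlowFile")
        if len(names - {None}) > 1:
            issues.append("conflicting segment.original.filename values")

        if issues:
            problems[ident] = issues
    return problems
```

An empty result means every bundle in the sample passed these checks. Any entry points at the attribute worth inspecting first.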
Are the FlowFile attribute values correct?
How are the individual FlowFile fragments being produced?
I find it odd that you say one FlowFile is routed to "merged". This implies that a merge was successful. Are you sure the "merged" FlowFile only contains the content of one FlowFile from the set of fragments?
Any chance you routed both the "original" and "failure" relationships to the same connection? Can you share the full MergeContent processor configuration (all tabs)?
When a FlowFile is routed to "failure", I would expect to see logging in the nifi-app.log related to the reason for the failure. What do you see in the nifi-app.log?
How many unique bundles are you trying to merge concurrently? I see you have set "Max Num Bins" to 200. So you expect to have 200 unique fragment.identifier bundles to merge at one time?
How many FlowFiles typically make up one bundle?
I also see you have "Max Bin Age" set to 300 sec (5 mins). Are all FlowFiles with the same fragment.identifier making it to the MergeContent processor within 5 minutes of one another?
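If you can collect an arrival time for each fragment yourself (for example from provenance data), a rough check like the one below can show which bundles are spread over more than the Max Bin Age. This is only a sketch: the function name, the `(fragment.identifier, arrival time in seconds)` input shape, and the use of first-to-last arrival as a stand-in for bin age are all assumptions on my part, not something NiFi exposes in this form.

```python
def bundles_exceeding_bin_age(arrivals, max_bin_age_sec=300):
    """Return {fragment.identifier: spread_in_seconds} for bundles whose
    first and last fragment arrive further apart than max_bin_age_sec.

    `arrivals` is an iterable of (identifier, arrival_time_in_seconds) pairs
    (hypothetical input that you gather yourself).
    """
    first, last = {}, {}
    for ident, ts in arrivals:
        # Track the earliest and latest arrival seen for each bundle.
        first[ident] = min(ts, first.get(ident, ts))
        last[ident] = max(ts, last.get(ident, ts))
    return {
        ident: last[ident] - first[ident]
        for ident in first
        if last[ident] - first[ident] > max_bin_age_sec
    }
```

Any bundle reported here would be a candidate for a longer Max Bin Age or for a look at what delays its fragments upstream.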
Keep in mind that MergeContent has the potential to consume a lot of NiFi heap memory depending on how it is configured. FlowFile attributes/metadata are held in heap memory, and FlowFiles that are allocated to MergeContent bins have their attributes held in heap memory. So depending on how many FlowFiles make up a typical bundle, the number of bundles being handled concurrently (200 max), and the number and size of the individual FlowFile attributes, binned FlowFiles can consume a lot of heap memory. Do you encounter any out of memory errors in your nifi-app.log (they might not be thrown by MergeContent)? You must be mindful of this when designing a dataflow that uses MergeContent processors.
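To get a feel for the scale, here is a back-of-the-envelope sketch that multiplies the three factors mentioned above. It is a rough estimate only: the function name and the `bytes_per_char=2` default are assumptions, and it counts nothing but the characters of the attribute names and values, so real JVM heap usage will differ (object overhead and other FlowFile metadata are ignored).

```python
def estimate_attribute_bytes(num_bins, flowfiles_per_bin, attributes, bytes_per_char=2):
    """Rough size of the attribute text held for all binned FlowFiles.

    `attributes` is a sample dict of attribute name -> string value for one
    typical FlowFile (hypothetical input). `bytes_per_char` is an assumption.
    """
    # Characters in every attribute name and value of one FlowFile.
    chars_per_flowfile = sum(len(k) + len(v) for k, v in attributes.items())
    return num_bins * flowfiles_per_bin * chars_per_flowfile * bytes_per_char
```

Calling it with 200 bins, your typical bundle size, and a sample of one FlowFile's attributes gives a number to compare against the configured heap.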
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt