Created 01-23-2017 08:33 PM
Hello, I am using NiFi 1.0.0 and am trying to merge records from an ExecuteSQL processor using MergeContent.
I wanted to try the Defragment merge strategy and have the following setup in an upstream UpdateAttribute processor for each flow file:
1. fragment.identifier - mmddyy of the flow file
2. fragment.index - nextInt()
3. fragment.count - executesql.row.count
4. segment.original.filename - filename
When I run the workflow, I get this error:
Cannot Defragment FlowFiles with Fragment Identifier because the expected number of fragments is <sql record count> but found only 1 fragments.
It seems like MergeContent is trying to merge too soon. I'd appreciate any advice.
My workflow is
ExecuteSQL -> SplitAvro -> UpdateAttribute (adds the fragment.* attributes; I could not see these on the SplitAvro output even though the docs indicate they should be present) -> ConvertAvroToJson -> EvaluateJsonPath (to extract only some SQL columns) -> ReplaceText (for conversion to comma-delimited) -> MergeContent -> PutFile
NOTE: I got inconsistent file lengths when trying out various MergeContent Bin-Packing configurations, so I turned to Defragment.
thanks
Created 01-23-2017 09:51 PM
The Defragment mode of MergeContent is meant to work with upstream processors that have "fragmented" a flow file and produced the standard fragment attributes (fragment.identifier, fragment.index, fragment.count). In your example, SplitAvro is one of those processors that takes a flow file and fragments its content, but it didn't originally produce the fragment attributes. It was updated in Apache NiFi 1.1.0 (https://issues.apache.org/jira/browse/NIFI-2805) to add the fragment attributes, so if you upgrade you should see them.
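To make the role of the three fragment attributes concrete, here is a minimal Python sketch of the general idea behind defragmenting. It is not NiFi's actual implementation: the `defragment` function, the dictionary layout of a flow file, and the sample identifiers are all hypothetical and exist only for illustration. It groups flow files by fragment.identifier, refuses to merge a group unless it holds exactly fragment.count members, and orders the content by fragment.index.

```python
from collections import defaultdict


def defragment(flow_files):
    """Hypothetical model: reassemble flow files that share a fragment.identifier."""
    bins = defaultdict(list)
    for ff in flow_files:
        bins[ff["fragment.identifier"]].append(ff)

    merged, errors = {}, {}
    for ident, fragments in bins.items():
        expected = int(fragments[0]["fragment.count"])
        if len(fragments) != expected:
            # Incomplete group: report it instead of merging.
            errors[ident] = (
                f"expected {expected} fragments but found only {len(fragments)}"
            )
            continue
        # Complete group: restore the original order using fragment.index.
        ordered = sorted(fragments, key=lambda ff: int(ff["fragment.index"]))
        merged[ident] = "\n".join(ff["content"] for ff in ordered)
    return merged, errors


# Three fragments of one group arriving out of order (made-up sample data).
complete = [
    {"fragment.identifier": "012317", "fragment.index": str(i),
     "fragment.count": "3", "content": f"row{i}"}
    for i in (2, 0, 1)
]
# A group that claims three fragments but only one has arrived.
partial = [
    {"fragment.identifier": "012417", "fragment.index": "0",
     "fragment.count": "3", "content": "row0"}
]

merged, errors = defragment(complete + partial)
```

In this model, a group that is evaluated before all of its fragments are present lands in `errors`, which is the same shape of failure as the "expected number of fragments ... but found only 1" message in the question.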
Created 01-25-2017 10:51 PM
I think you need to remove the value for "Maximum Number of Entries". In your screenshot it is set to 1000, which means MergeContent would attempt to merge at 1000 entries, before seeing all 325070 fragments. Just leave it blank.
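The concern in this reply can be sketched with a toy model. This only illustrates the hypothesis stated above, not NiFi's real binning code, and `bin_is_complete` is a made-up helper: a bin that closes once it reaches its entry cap can never hold a complete set when the cap is smaller than fragment.count.

```python
def bin_is_complete(fragment_count, max_entries):
    """Hypothetical model: a bin stops accepting flow files at max_entries."""
    entries_in_bin = min(fragment_count, max_entries)
    return entries_in_bin, entries_in_bin == fragment_count


# Values taken from the thread: 325070 fragments, cap of 1000 in the screenshot.
capped = bin_is_complete(325070, 1000)
# The same fragments with a cap large enough to hold all of them.
roomy = bin_is_complete(325070, 325070)
```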
Created 01-26-2017 03:50 AM
I am unable to leave the maximum blank, since MergeContent complains that it is not a valid integer. I also tried setting the minimum and maximum to the same value (325070), as well as to 1 and 325070, but I still get the fragment error.
Created 01-26-2017 01:56 PM
Can you post a template of your flow (the XML file from exporting a template)?
I don't think I could help much more without seeing your exact flow.
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates
Created 01-26-2017 03:28 PM
GM - The template is attached. Appreciate your help! hcc-mergecontent-issue-support.xml
Created 01-26-2017 05:24 PM
I'm wondering if some records are not making it through the processors between SplitAvro and MergeContent; most likely that would happen at EvaluateJsonPath. Can you try running this updated template and see if any flow files go to any of the LogAttribute processors? hcc-mergecontent-issue-support-updated.xml
Created 01-27-2017 01:46 AM
I don't see anything flowing into LogAttribute - snapshot of flow attached. screen-shot-2017-01-26-at-84457-pm.png
Created 01-27-2017 01:48 AM
I reran with the LogAttribute processor, but did not see any flow files going into it. screen-shot-2017-01-26-at-84457-pm.png
Created 02-13-2022 11:53 AM