Created 07-12-2018 12:24 AM
Hi,
I have a number of flow files arriving at a MergeContent processor, which merges them into a zip file. However, some of the flow files have the same name, which causes a duplicate entry error. I want to add a counter value to their filename attribute so that they can be merged (e.g. A.txt (1), A.txt (2), etc.).
I used a DetectDuplicate processor to separate the flow files with duplicate filenames, but I'm not sure how to add a counter variable to their filename attribute. Can anyone give me an idea of how to solve this?
Thanks.
Created on 07-12-2018 09:11 PM - edited 08-18-2019 01:49 AM
I just figured out a solution using a Wait/Notify processor pair. Each Notify processor allows only one flow file with a duplicated filename through. An UpdateAttribute processor updates a count variable so that the Notify processor can send it back to the Wait processor.
Created on 07-12-2018 04:55 AM - edited 08-18-2019 01:49 AM
Feed the duplicate relationship from the DetectDuplicate processor into an UpdateAttribute processor and use the subjectless nextInt() function.
Add a new property:
filename
${filename}(${nextInt()})
This expression appends the value returned by nextInt() to the filename.
For more details, see this link on nextInt() function usage.
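To make the effect of that expression concrete, here is a rough Python simulation. It is only a sketch: `rename_with_global_counter` is a hypothetical helper, and it assumes nextInt() behaves as a single counter shared by every flow file and starting at 0.

```python
import itertools

# Hypothetical stand-in for nextInt(): one counter shared by all flow files.
# Assumption: the counter starts at 0 and increases by one per call.
_counter = itertools.count(0)


def rename_with_global_counter(filename: str) -> str:
    """Mimic the property value ${filename}(${nextInt()})."""
    return f"{filename}({next(_counter)})"


renamed = [rename_with_global_counter(n) for n in ["A.txt", "A.txt", "B.txt"]]
print(renamed)  # ['A.txt(0)', 'A.txt(1)', 'B.txt(2)']
```

Under these assumptions the names become unique, but the numbers are not tracked per filename: B.txt gets whatever value the shared counter has reached.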
(Or)
Store state in the UpdateAttribute processor.
Add a new property:
theCount
${getStateValue("theCount"):plus(1)}
Use another UpdateAttribute processor to append the theCount attribute to the filename.
Refer to this link regarding getStateValue function usage.
Add a new property:
filename
${filename}(${theCount})
With this approach you can reset the state value to 0 once it reaches your threshold (for example, if the value reaches 100, set it back to 0). Refer to this link regarding resetting the value.
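The stateful counter with a reset can be sketched in Python as follows. This is a simplified model, not NiFi code: the `StatefulCounter` class and its `threshold` parameter are hypothetical, and the stored value is assumed to start at 0.

```python
class StatefulCounter:
    """Toy model of a stored 'theCount' value that resets at a threshold."""

    def __init__(self, threshold: int, initial: int = 0):
        self.threshold = threshold  # hypothetical reset point
        self.value = initial        # assumed initial state value

    def next_value(self) -> int:
        # Same idea as ${getStateValue("theCount"):plus(1)}
        self.value += 1
        current = self.value
        # Reset the stored value to 0 once the threshold is reached.
        if self.value >= self.threshold:
            self.value = 0
        return current


counter = StatefulCounter(threshold=3)
names = [f"A.txt({counter.next_value()})" for _ in range(5)]
print(names)  # ['A.txt(1)', 'A.txt(2)', 'A.txt(3)', 'A.txt(1)', 'A.txt(2)']
```

In this model the numbering starts over after a reset, so the threshold would need to be larger than the number of same-named files that can end up in one zip.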
Created 07-12-2018 07:25 PM
Hi @Shu,
Thanks for your suggested solution, but it doesn't work in my setup. I might have 100 flow files coming out of the duplicate relationship of the DetectDuplicate processor. 50 of them will have the filename A.txt while the rest will be B.txt. The expected output would be A (1).txt, ..., A (50).txt and B (1).txt, ..., B (50).txt. Since the number of flow files is not fixed, I can't really reset the state value. They all have the same ${segment.original.filename} value, by the way. If there are another 10 flow files with A.txt coming out of the DetectDuplicate processor with a different ${segment.original.filename} value, then these flow files should be named from 1 to 20.
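The naming described here amounts to keeping a separate counter per filename. A minimal Python sketch of that logic, outside NiFi and with a hypothetical `number_duplicates` helper, might look like this:

```python
import os
from collections import defaultdict


def number_duplicates(filenames):
    """Rename each file to 'stem (n).ext', counting separately per filename."""
    counts = defaultdict(int)  # one counter per original filename
    renamed = []
    for name in filenames:
        counts[name] += 1
        stem, ext = os.path.splitext(name)
        renamed.append(f"{stem} ({counts[name]}){ext}")
    return renamed


result = number_duplicates(["A.txt", "B.txt", "A.txt", "A.txt", "B.txt"])
print(result)
# ['A (1).txt', 'B (1).txt', 'A (2).txt', 'A (3).txt', 'B (2).txt']
```

No reset is needed in this sketch because each filename keeps its own count, however many flow files arrive.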