Created 08-29-2016 06:36 PM
Hi, I need to merge content based on CSV file headers. Say I have 10 files in a folder and 5 of them share the same header, Name,Age,Gender. I want to merge those 5 together and send the rest to failures. How can I do that?
Created 08-29-2016 06:47 PM
The MergeContent processor is not designed to look at the content of the NiFi FlowFiles it is merging. What you will want to do first is use a RouteOnContent processor to route only those FlowFiles whose content contains the headers you want to merge. The 'unmatched' FlowFiles could then be routed elsewhere or auto-terminated. Thanks,
Matt
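To make the route-then-merge idea concrete, here is a minimal sketch in plain Python rather than NiFi. It only illustrates the logic: split files by whether their first line equals the expected header, then merge the matching ones. The function names `split_by_header` and `merge_csv` are hypothetical, and keeping the header only once in the merged output is a choice of this sketch, not something stated above about MergeContent.

```python
EXPECTED_HEADER = "Name,Age,Gender"  # header from the question


def split_by_header(files, expected_header=EXPECTED_HEADER):
    """Partition {filename: text} into (matched, unmatched) by first line."""
    matched, unmatched = {}, {}
    for name, text in files.items():
        lines = text.splitlines()
        first_line = lines[0].strip() if lines else ""
        # Only the first line decides the route; later rows are ignored.
        target = matched if first_line == expected_header else unmatched
        target[name] = text
    return matched, unmatched


def merge_csv(matched, expected_header=EXPECTED_HEADER):
    """Concatenate matched files in name order, writing the header once."""
    rows = []
    for name in sorted(matched):
        rows.extend(matched[name].splitlines()[1:])  # skip each file's header
    return "\n".join([expected_header] + rows) + "\n"


if __name__ == "__main__":
    files = {
        "a.csv": "Name,Age,Gender\nAnn,30,F\n",
        "b.csv": "Id,Value\n1,2\n",
        "c.csv": "Name,Age,Gender\nBob,41,M\n",
    }
    matched, unmatched = split_by_header(files)
    print(merge_csv(matched), end="")
    print("unmatched:", sorted(unmatched))
```

In a NiFi flow, the first function plays the role of RouteOnContent and the second the role of MergeContent; the 'unmatched' dictionary corresponds to the FlowFiles routed elsewhere.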
Created on 08-29-2016 08:22 PM - edited 08-19-2019 03:11 AM
@mclark,
OK, but RouteOnContent checks for the string in the whole file, whereas I want to compare only the first line.
If I have my RouteOnContent configured like below, it would route files to "Header" even if it is the data, not the header line, that satisfies the regex.
Created 08-29-2016 09:01 PM
Your regex above says the CSV file content must start with Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood.
So, it should not route to "Header" unless the CSV starts with that. What is found later in the CSV file should not matter. I tried this and it seems to work as expected. If I removed the '^', then all files matched.
Your processor is also loading 1 MB worth of the CSV content for evaluation; however, the string you are searching for is far fewer bytes. If you only want to match against the first line, reduce the size of the buffer from '1 MB' to maybe '60 b'. If I changed the buffer to '60 b' and removed the '^' from the regex above, only the files with the matching header were routed to "header". Thanks,
Matt
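The interaction between the '^' anchor and the buffer size can be sketched outside NiFi. The snippet below is plain Python, not NiFi code, and assumes the processor searches for the pattern anywhere inside the first `buffer_size` bytes of the content. The function name `routes_to_header` and the sample file contents are hypothetical.

```python
import re

HEADER = "Tagname,Timestamp,Value,Quality,QualityDetail,PercentGood"
ANCHORED = "^" + re.escape(HEADER)   # must appear at the very start
UNANCHORED = re.escape(HEADER)       # may appear anywhere in the buffer
ONE_MB = 1024 * 1024


def routes_to_header(content: bytes, pattern: str, buffer_size: int) -> bool:
    """Return True if the pattern is found within the buffered prefix."""
    # Only the first buffer_size bytes are ever examined.
    head = content[:buffer_size].decode("utf-8", errors="replace")
    return re.search(pattern, head) is not None


if __name__ == "__main__":
    good = (HEADER + "\n1,2,3,4,5,6\n").encode()
    # A different header, with the target string appearing later as data.
    bad = ("Other,Header\nx,y\n" + HEADER + "\n").encode()

    for label, pattern, size in [
        ("anchored, 1 MB", ANCHORED, ONE_MB),
        ("unanchored, 1 MB", UNANCHORED, ONE_MB),
        ("unanchored, 60 B", UNANCHORED, 60),
    ]:
        print(label, routes_to_header(good, pattern, size),
              routes_to_header(bad, pattern, size))
```

The header string is 57 bytes long, so a 60-byte buffer still holds all of it when it is the first line, but cuts it off when it starts further into the file. Either the anchor or the small buffer is enough to reject the second sample file.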