Created 09-11-2016 12:54 PM
I'm using the MergeContent processor and have been successfully using the "Correlation Attribute Name" property to bin like files together, based on a single flowfile attribute referenced with Expression Language. I would now like to bin and merge files on two attributes. Is this possible? Any help with the proper syntax would be appreciated.
Created 09-11-2016 04:02 PM
I don't think MergeContent supports multiple attribute names for the correlation attribute, but you could put an UpdateAttribute processor right before it and make a new attribute like:
combined = ${attribute1}_${attribute2}
and then use "combined" as the Correlation Attribute Name in MergeContent.
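To illustrate why a single combined attribute is enough, here is a rough Python sketch of the binning idea. It is not NiFi code: the function names `combined_key` and `bin_by_key` are made up for illustration, flowfiles are modelled as plain attribute dictionaries, and a missing attribute is assumed to evaluate to an empty string.

```python
from collections import defaultdict

def combined_key(attrs, names=("attribute1", "attribute2"), sep="_"):
    # Mimics the UpdateAttribute expression ${attribute1}_${attribute2}.
    # Assumption: a missing attribute contributes an empty string.
    return sep.join(attrs.get(name, "") for name in names)

def bin_by_key(flowfiles):
    # Groups flowfiles whose combined key matches, the way a single
    # correlation attribute would bin them before merging.
    bins = defaultdict(list)
    for attrs in flowfiles:
        bins[combined_key(attrs)].append(attrs)
    return dict(bins)

flowfiles = [
    {"attribute1": "dataSource1", "attribute2": "20160911"},
    {"attribute1": "dataSource1", "attribute2": "20160912"},
    {"attribute1": "dataSource1", "attribute2": "20160911"},
    {"attribute1": "dataSource2", "attribute2": "20160911"},
]

bins = bin_by_key(flowfiles)
for key in sorted(bins):
    print(key, len(bins[key]))
```

Two flowfiles land in the same bin only when both attribute values match, which is the behavior you would otherwise want from two correlation attributes.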
Created 09-11-2016 07:34 PM
Thanks @Bryan Bende, I think this is a great solution. I'm currently using two attributes to help create directory structures dynamically in HDFS:
/tmp/data_staging/${SourceAttribute}/${data.date}
I was planning to get around the lack of multiple correlation attribute support by using a RouteOnAttribute processor to send flowfiles to different MergeContent processors, and then back to a single PutHDFS processor that uses the original attributes. Your suggestion gives me a much cleaner and more flexible solution. After combining the two attributes with an UpdateAttribute processor, I'll send the data to a MergeContent processor, where I'll bin the files on the new combined attribute, and then on to my PutHDFS processor. The combined attribute will persist after the merge (e.g., dataSource1_20160911), so I can then use something like the following to continue creating directories dynamically:
/tmp/data_staging/${convergedSourceDateAttribute:substringBefore('_')}/${convergedSourceDateAttribute:substringAfter('_')}
Does that seem reasonable?
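As a sanity check on the split, here is a small Python sketch of how I understand substringBefore and substringAfter to behave: both split at the first occurrence of the delimiter and return the whole value if the delimiter is absent. The helper names and `staging_path` are made up for illustration, and the semantics are my assumption, so please verify against the Expression Language guide.

```python
def substring_before(subject, marker):
    # Part of subject before the first occurrence of marker;
    # the whole subject if marker is not found (assumed behavior).
    idx = subject.find(marker)
    return subject if idx < 0 else subject[:idx]

def substring_after(subject, marker):
    # Part of subject after the first occurrence of marker;
    # the whole subject if marker is not found (assumed behavior).
    idx = subject.find(marker)
    return subject if idx < 0 else subject[idx + len(marker):]

def staging_path(combined):
    # Rebuilds the directory from the combined attribute value.
    return "/tmp/data_staging/{}/{}".format(
        substring_before(combined, "_"),
        substring_after(combined, "_"),
    )

print(staging_path("dataSource1_20160911"))
print(staging_path("data_source1_20160911"))
```

The first call gives /tmp/data_staging/dataSource1/20160911 as intended. The second gives /tmp/data_staging/data/source1_20160911, so this only works if the source name never contains the delimiter. If that is a risk, a delimiter that cannot appear in either value, or splitting on the last occurrence instead (substringBeforeLast / substringAfterLast), may be safer.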