Using the MergeContent processor in NiFi, can I use more than one correlation attribute name?

Contributor

I'm using the MergeContent processor and have successfully been using the "Correlation Attribute Name" property to bin like files together, based on a single flowfile attribute with expression language. I would like to start using two attributes to bin and merge files. Is this possible? Any help with the proper syntax would be appreciated.

1 ACCEPTED SOLUTION

Master Guru

I don't think MergeContent supports multiple attribute names for the correlation attribute, but you could put an UpdateAttribute processor right before it and make a new attribute like:

combined = ${attribute1}_${attribute2}

and then set "Correlation Attribute Name" to "combined" in MergeContent.
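To illustrate why a single combined attribute is enough, here is a minimal Python sketch of the idea, not NiFi code. It models flowfiles as plain dictionaries and groups them by the combined value, roughly the way MergeContent would bin on one correlation attribute. The names `combined_key` and `bin_by_key` and the sample data are hypothetical, and the sketch assumes a missing attribute evaluates to an empty string.

```python
from collections import defaultdict


def combined_key(attributes, names=("attribute1", "attribute2"), sep="_"):
    """Mimic UpdateAttribute: combined = ${attribute1}_${attribute2}.

    Assumption: a missing attribute contributes an empty string.
    """
    return sep.join(attributes.get(name, "") for name in names)


def bin_by_key(flowfiles):
    """Group flowfile contents by the combined key (hypothetical stand-in for binning)."""
    bins = defaultdict(list)
    for flowfile in flowfiles:
        bins[combined_key(flowfile["attributes"])].append(flowfile["content"])
    return dict(bins)


# Hypothetical sample data: three flowfiles, two distinct attribute pairs.
flowfiles = [
    {"attributes": {"attribute1": "dataSource1", "attribute2": "20160911"}, "content": "a"},
    {"attributes": {"attribute1": "dataSource2", "attribute2": "20160911"}, "content": "b"},
    {"attributes": {"attribute1": "dataSource1", "attribute2": "20160911"}, "content": "c"},
]

bins = bin_by_key(flowfiles)
for key, contents in sorted(bins.items()):
    print(key, contents)
```

Flowfiles land in the same bin only when both attributes match, which is the behavior being asked for.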

Contributor

Thanks @Bryan Bende, I think this is a great solution. I'm currently using two attributes to help create directory structures dynamically in HDFS:

/tmp/data_staging/${SourceAttribute}/${data.date}

I was planning to work around the lack of multiple correlation attribute support by using a RouteOnAttribute processor to send flowfiles to separate MergeContent processors, then routing them back to a single PutHDFS processor that uses the original attributes. Your suggestion gives me a much cleaner and more flexible solution. After combining the two attributes with an UpdateAttribute processor, I'll send the data to a MergeContent processor, where I'll bin the files on the new combined attribute, and then on to my PutHDFS processor. The combined attribute will persist after the merge (e.g., dataSource1_20160911), so I can use something like the following to keep creating directories dynamically:

/tmp/data_staging/${convergedSourceDateAttribute:substringBefore('_')}/${convergedSourceDateAttribute:substringAfter('_')}

Does that seem reasonable?
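One way to sanity-check the path expression above is a small Python sketch of the split. It assumes `substringBefore` and `substringAfter` cut at the first occurrence of the delimiter and return the whole value when the delimiter is absent; the helper names `substring_before`, `substring_after`, and `staging_path` are hypothetical stand-ins, not NiFi APIs.

```python
def substring_before(subject, token):
    """Text before the first occurrence of token; whole subject if token is absent."""
    index = subject.find(token)
    return subject if index < 0 else subject[:index]


def substring_after(subject, token):
    """Text after the first occurrence of token; whole subject if token is absent."""
    index = subject.find(token)
    return subject if index < 0 else subject[index + len(token):]


def staging_path(converged, sep="_"):
    """Rebuild the HDFS directory from the combined attribute value."""
    return "/tmp/data_staging/{}/{}".format(
        substring_before(converged, sep),
        substring_after(converged, sep),
    )


print(staging_path("dataSource1_20160911"))
# Caveat: a source name that itself contains the delimiter splits at the first one.
print(staging_path("data_source_1_20160911"))
```

Under these assumptions the split round-trips only when the source value never contains the delimiter, so a separator that cannot appear in either attribute is the safer choice.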