Using the MergeContent processor in NiFi, can I use more than one correlation attribute name?

I'm using the MergeContent processor and have successfully used the "Correlation Attribute Name" property to bin together like files, based on a single flowfile attribute referenced with Expression Language. I would now like to bin and merge files on two attributes. Is this possible? Any help with the proper syntax would be appreciated.

1 ACCEPTED SOLUTION

I don't think MergeContent supports multiple attribute names for the correlation attribute, but you could put an UpdateAttribute processor right before it and make a new attribute like:

combined = ${attribute1}_${attribute2}

and then set "Correlation Attribute Name" to "combined" in MergeContent.
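
To make the binning idea concrete, here is a rough Python sketch of what the two steps amount to. It is only a model, not NiFi code: flowfiles are plain dictionaries of attributes, and the function names `combined_key` and `bin_flowfiles` plus the sample values are made up for illustration.

```python
from collections import defaultdict


def combined_key(attributes, names=("attribute1", "attribute2"), sep="_"):
    """Model of the UpdateAttribute step: combined = ${attribute1}_${attribute2}."""
    return sep.join(attributes[name] for name in names)


def bin_flowfiles(flowfiles):
    """Model of MergeContent binning on the single "combined" attribute."""
    bins = defaultdict(list)
    for flowfile in flowfiles:
        tagged = dict(flowfile)  # copy so the input is left untouched
        tagged["combined"] = combined_key(tagged)
        bins[tagged["combined"]].append(tagged)
    return dict(bins)


# Hypothetical sample flowfiles (attribute values are invented).
flowfiles = [
    {"attribute1": "dataSource1", "attribute2": "20160911"},
    {"attribute1": "dataSource1", "attribute2": "20160912"},
    {"attribute1": "dataSource2", "attribute2": "20160911"},
    {"attribute1": "dataSource1", "attribute2": "20160911"},
]

bins = bin_flowfiles(flowfiles)
for key in sorted(bins):
    print(key, len(bins[key]))
```

Flowfiles end up in the same bin only when both attributes match, which is the effect of correlating on two attributes at once.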

2 REPLIES

Thanks @Bryan Bende, I think this is a great solution. I'm currently using two attributes to help create directory structures dynamically in HDFS:

/tmp/data_staging/${SourceAttribute}/${data.date}

I was planning to work around the lack of multiple correlation attribute support by using a RouteOnAttribute processor to send flowfiles to different MergeContent processors, then back to a single PutHDFS processor that uses the original attributes. Your suggestion gives me a much cleaner and more flexible solution. After combining the two attributes with an UpdateAttribute processor, I'll send the data to a MergeContent processor, where I'll bin the files on the new combined attribute, and then on to my PutHDFS processor. The combined attribute will persist after the merge (e.g., dataSource1_20160911), so I can use something like the following to keep creating directories dynamically:

/tmp/data_staging/${convergedSourceDateAttribute:substringBefore('_')}/${convergedSourceDateAttribute:substringAfter('_')}

Does that seem reasonable?
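
For anyone checking the split logic, here is a small Python sketch of how I understand `substringBefore` and `substringAfter` to behave: both split on the first occurrence of the search string. The helper names are my own, and returning the whole string when the delimiter is absent is my reading of the Expression Language guide rather than something I have tested in NiFi.

```python
def substring_before(subject, search):
    """Part of subject before the first occurrence of search."""
    head, sep, _ = subject.partition(search)
    # Assumption: the whole subject is returned when search is absent.
    return head if sep else subject


def substring_after(subject, search):
    """Part of subject after the first occurrence of search."""
    _, sep, tail = subject.partition(search)
    # Assumption: the whole subject is returned when search is absent.
    return tail if sep else subject


def staging_path(combined, root="/tmp/data_staging"):
    """Rebuild the directory from the combined attribute value."""
    source = substring_before(combined, "_")
    date = substring_after(combined, "_")
    return f"{root}/{source}/{date}"


print(staging_path("dataSource1_20160911"))
# A source name that itself contains "_" splits at the first underscore.
print(staging_path("data_source_20160911"))
```

Under those assumptions the approach works as long as the source name never contains the delimiter. If it might, a delimiter that cannot appear in either value would be a safer choice.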