
How to perform equijoin on files while merging them?

Hi Team,

I am able to merge any number of files using the MergeContent processor, but the files are simply appended one after the other.

Is it feasible to join or merge two files on the basis of one common column? For example, one file has a uuid column plus 5 other columns, and the other file has the same uuid column plus 6 different columns. Can I join the two files and get a 12-column dataset in the output file, with uuid as the common comparison column?

We can treat this case as an equijoin condition.

Thanks in advance!

-Garima.


Re: How to perform equijoin on files while merging them?

I have not tried this yet, but look into the Correlation Attribute Name property of the MergeContent processor. Its documentation says:

If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue. Supports Expression Language: true
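
For reference, the equijoin described in the question can be sketched outside NiFi in plain Python. This is only an illustration of the desired output, not a NiFi configuration: the function name `equijoin`, the sample column names other than uuid, and the sample rows are all hypothetical. Whether the property above produces this result on its own is not confirmed here, since it describes how FlowFiles are grouped rather than how their columns are combined.

```python
import csv
import io


def equijoin(left_text, right_text, key="uuid"):
    """Inner-join two CSV texts on a shared key column (hypothetical helper)."""
    left_rows = list(csv.DictReader(io.StringIO(left_text)))
    right_rows = list(csv.DictReader(io.StringIO(right_text)))

    # Index the right-hand rows by key so each lookup is a dictionary access.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        # Rows whose key has no match on the other side are dropped (inner join).
        for right in right_by_key.get(left[key], []):
            merged = dict(left)
            # Copy every right-hand column except the key, which is already present.
            merged.update({k: v for k, v in right.items() if k != key})
            joined.append(merged)
    return joined


# Hypothetical sample data: both files share only the uuid column.
left_csv = "uuid,name,city\n1,Ann,Pune\n2,Bob,Delhi\n"
right_csv = "uuid,score,grade\n1,90,A\n3,70,C\n"

rows = equijoin(left_csv, right_csv)
for row in rows:
    print(row)
```

Only uuid 1 appears in both inputs, so the joined output has a single row carrying the columns of both files with uuid listed once.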
