NiFi: Merge Header and Data coming from different FlowFiles

@Shu @Matt Burgess

Hi,

We have two files coming from different locations.

1) The first contains only the header (column names).

2) The second contains the data, in the same column sequence.

Aim:

We need to merge both into one output file, with the header on top (in the first row) and the data from the second row onwards.

Looking forward.

Cheers

Accepted Solutions
Re: NiFi: Merge Header and Data coming from different FlowFiles

Super Guru

@Mustafa Ali Qizilbash

You can achieve this by using the MergeContent processor.

MergeContent configuration:

91415-mergecontent.png

With Minimum Number of Entries set to 2, the processor waits until it has received 2 entries.

Flow that I tried:

91412-flow.png

But if two flowfiles arrive from Location1 itself, MergeContent will merge those two flowfiles into one.

This flow works only when exactly one flowfile arrives from each source. If no flowfile arrives from Location2, the processor waits indefinitely for another flowfile.

To avoid this, set a Max Bin Age that is reasonable for your use case; once the bin reaches that age, the processor forcibly routes whatever it holds to the merged relationship.
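To make the caveats above concrete, here is a toy Python model of a single bin. It is not NiFi code, only a sketch of the two release conditions described above (enough entries, or the bin is too old). The `FlowFile` and `Bin` classes, the `source` label, and the newline used to join contents are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    source: str        # hypothetical label for where the file came from
    content: str
    arrived_at: float  # seconds on an arbitrary clock


@dataclass
class Bin:
    min_entries: int
    max_bin_age: float  # seconds
    entries: list = field(default_factory=list)

    def offer(self, flowfile):
        self.entries.append(flowfile)

    def ready(self, now):
        """Release the bin when it is full enough or has grown too old."""
        if not self.entries:
            return False
        if len(self.entries) >= self.min_entries:
            return True
        oldest = min(f.arrived_at for f in self.entries)
        return now - oldest >= self.max_bin_age

    def merge(self):
        # Entries are joined in arrival order; nothing here checks the source,
        # so two files from the same location are merged just as readily.
        merged = "\n".join(f.content for f in self.entries)
        self.entries = []
        return merged
```

In this model, a bin holding a single header file stays unreleased until the age limit passes, and a second file from the same location fills the bin just as well as a data file would, which is the behaviour to watch out for.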

Please refer to this link for configuring MergeContent processor.

(or)

If your header is always the same:

1. With the new record-oriented processors, you can ignore the header coming from Location1 and configure the ConvertRecord processor to add the header to the incoming data.

2. Using the ReplaceText processor, we can add the header to the Location2 file.

Refer to this link for more details regarding this method.
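As a rough illustration of the second option, the effect of prepending a fixed header can be sketched in plain Python. This is not NiFi configuration; the function name and the column names are hypothetical, and it only shows what the resulting file content should look like.

```python
# Hypothetical column names; replace with the real, fixed header.
FIXED_HEADER = "id,name,city"


def prepend_header(header: str, data: str) -> str:
    """Return the data content with a single header line placed on top."""
    # Normalise the header so it ends with exactly one newline.
    return header.rstrip("\r\n") + "\n" + data


if __name__ == "__main__":
    print(prepend_header(FIXED_HEADER, "1,Ann,Oslo\n2,Bob,Rome\n"))
```

In NiFi itself, the ReplaceText processor has a Prepend replacement strategy that should give the same result; check the processor documentation for the exact settings in your version.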

Re: NiFi: Merge Header and Data coming from different FlowFiles

Super Guru
@Mustafa Ali Qizilbash

For this case, we can use the EnforceOrder processor to produce the file you need.

Flow:

91417-flow.png

In this flow, the EnforceOrder processor makes sure the header flowfile comes first and the data flowfile second, and the MergeContent processor then merges them into one.

Change the Wait Timeout property value in the EnforceOrder processor as per your requirement.

I have attached the template XML to this thread; you can keep it as a reference for your flow.

enforce-order.xml
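The idea behind this flow can be sketched in plain Python: each flowfile carries an integer order attribute (1 for the header, 2 for the data), the files are put in that order, and only then are they merged. This is an illustration of the ordering logic, not NiFi code; the attribute name `order`, the dictionary layout, and the newline separator are assumptions.

```python
def merge_in_order(flowfiles, order_attribute="order"):
    """Sort flowfiles by an integer attribute, then join their contents."""
    ordered = sorted(
        flowfiles,
        key=lambda ff: int(ff["attributes"][order_attribute]),
    )
    # Strip trailing newlines so the join does not produce blank rows.
    return "\n".join(ff["content"].rstrip("\r\n") for ff in ordered)


if __name__ == "__main__":
    files = [
        {"attributes": {"order": "2"}, "content": "1,Ann\n2,Bob\n"},  # data arrives first
        {"attributes": {"order": "1"}, "content": "id,name\n"},       # header arrives second
    ]
    print(merge_in_order(files))
```

Even though the data file arrives first in this example, the header ends up on the first row because the merge happens only after sorting.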

Re: NiFi: Merge Header and Data coming from different FlowFiles

Super Guru

@Mustafa Ali Qizilbash

I think the issue is with the Order Attribute property: it does not accept Expression Language, so enter the attribute name on its own, without Expression Language syntax.

Make sure your EnforceOrder configuration looks like this:

91424-eo.png

Change the Wait Timeout property value in the EnforceOrder processor as per your requirement.
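A small Python analogy may help show why the plain attribute name matters. If a property value is used literally as a lookup key, with no expression evaluation, then a value written in expression syntax simply names an attribute that does not exist. The function and the attribute name `order` below are hypothetical and only model that idea; they do not reproduce NiFi internals.

```python
def lookup_order(attributes, order_attribute):
    """Use order_attribute as a literal attribute name (no expression evaluation)."""
    value = attributes.get(order_attribute)
    return None if value is None else int(value)


if __name__ == "__main__":
    attributes = {"order": "1"}  # hypothetical attribute set upstream
    print(lookup_order(attributes, "order"))     # plain name: attribute is found
    print(lookup_order(attributes, "${order}"))  # expression syntax: no such attribute
```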

Re: NiFi: Merge Header and Data coming from different FlowFiles

@Shu

Thanks, the merge worked, but the header came out as the last row.

How can I prioritize the header FlowFile so that it comes on top as the data header (the column names)?

Re: NiFi: Merge Header and Data coming from different FlowFiles

@Shu

I followed and implemented this, but I am getting the error at the same place you show in your snapshot. Kindly advise how to fix it.

91419-enforceorderfailed.png

UpdateAttribute_Header

91421-updateattribute-header.png

UpdateAttribute_Data

91423-updateattribute-data.png

Re: NiFi: Merge Header and Data coming from different FlowFiles

Many thanks, it worked.
