Apache Nifi - Transform Fixed Width File into Delimited File?
- Labels: Apache NiFi
Created ‎10-25-2021 05:53 AM
Hello!
I'm new to NiFi and struggling with what seems like a simple task.
I have a file with Header, Body and Trailer in fixed width layout.
The header record always starts with an A;
The Trailer record always starts with a Z;
Body records are identified by B, C, D and so on... (except A and Z)
So I first used RouteText to separate the Header, Body and Trailer, because my goal was to replace the fixed width layout with a delimiter (;) so I can make sense of the info in the file (header, body and trailer will each have a different number of columns when separated by a delimiter).
Then I used ReplaceText with a regex to create columns delimited by semicolons instead of fixed width.
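For readers following along, the conversion described above can be sketched outside NiFi in plain Python. This is only an illustration under stated assumptions: the column widths, the `LAYOUTS` table, and the sample lines are hypothetical, since the post does not give the real layout. Only the rules "A is the header, Z is the trailer, any other letter is a body record" come from the question.

```python
# Hypothetical layouts: leading record letter -> column widths.
# The real widths are not given in the post; these are placeholders.
LAYOUTS = {
    "A": [1, 8, 6],  # header
    "Z": [1, 5],     # trailer
}
BODY_LAYOUT = [1, 4, 10]  # any leading letter other than A or Z


def split_fixed_width(line, widths):
    """Cut a line into fields at the given widths and trim the padding."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return fields


def to_delimited(line, delimiter=";"):
    """Pick the layout from the first character and join fields with the delimiter."""
    widths = LAYOUTS.get(line[:1], BODY_LAYOUT)
    return delimiter.join(split_fixed_width(line, widths))


# Made-up sample records, padded to the hypothetical widths above.
sample = [
    "A20211025FILE01",
    "B0001John      ",
    "C0002Maria     ",
    "Z00002",
]
converted = [to_delimited(line) for line in sample]
for row in converted:
    print(row)
```

In the flow itself the same idea is expressed as one capture group per column in the ReplaceText regex, with a separate pattern for each of the three record types.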
Now I need to regroup the rows and create a single file again, with the header, body and trailer, but this time all separated by semicolons. This is what I'd like to achieve:
My template looks like this:
Is that possible? I tried using MergeRecord for that but I really don't know how to configure its properties, and it's not merging anything.
Created ‎11-02-2021 08:23 AM
@AnnaBea
Let me make sure I am clear on your ask here:
1. You have successfully split your source file into 3 parts (header line, body line(s), and footer line).
2. You have successfully modified all three split files as needed.
3. You are having issues re-assembling the three split files back into one file, in the order header, body, footer, using the MergeRecord processor?
With this particular dataflow design, MergeRecord is probably not the processor you want. You probably want the MergeContent processor instead, with a "Merge Strategy" of "Defragment". Getting these three source FlowFiles merged in a specific order, however, requires some additional work in your upstream flow. To use "Defragment", all three source FlowFiles need to have these FlowFile attributes:
fragment.identifier | All split FlowFiles produced from the same parent FlowFile will have the same randomly generated UUID added for this attribute |
fragment.index | A one-up number that indicates the ordering of the split FlowFiles that were created from a single parent FlowFile |
fragment.count | The number of split FlowFiles generated from the parent FlowFile |
1. Add one UpdateAttribute processor before your RouteText and configure it to create the "fragment.identifier" attribute with a value of "${UUID()}" and another attribute, "fragment.count", with a value of "3". Each FlowFile produced by RouteText should then have these two attributes set on it.
2. Then add one UpdateAttribute processor to each of the 3 flow paths to set the "fragment.index" attribute to a unique value per dataflow path: value=1 for header, value=2 for body, and value=3 for footer.
3. Now MergeContent will have what it needs to bin these three files by the UUID and merge them in the proper order.
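To make the role of the three attributes concrete, here is a rough plain-Python simulation of what the steps above rely on. It is not NiFi code and not how MergeContent is implemented. The dictionary shape, the `defragment` function, the `parent-1` identifier, and the sample contents are all hypothetical. Joining fragments with a newline is also an assumption; in NiFi, how merged fragments are separated depends on how MergeContent is configured.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Bin fragments by fragment.identifier and merge each complete bin
    in fragment.index order. Incomplete bins are skipped in this sketch."""
    bins = defaultdict(list)
    for ff in flowfiles:
        bins[ff["attributes"]["fragment.identifier"]].append(ff)

    merged = {}
    for identifier, fragments in bins.items():
        expected = int(fragments[0]["attributes"]["fragment.count"])
        if len(fragments) != expected:
            continue  # not all fragments have arrived yet
        ordered = sorted(
            fragments, key=lambda ff: int(ff["attributes"]["fragment.index"])
        )
        # Assumption: one newline between fragments.
        merged[identifier] = "\n".join(ff["content"] for ff in ordered)
    return merged


def make_fragment(index, content, identifier="parent-1", count=3):
    """Build a hypothetical FlowFile-like dict carrying the three attributes."""
    return {
        "attributes": {
            "fragment.identifier": identifier,
            "fragment.index": str(index),
            "fragment.count": str(count),
        },
        "content": content,
    }


# Fragments deliberately arrive out of order: footer, header, body.
arrived = [
    make_fragment(3, "Z;00002"),
    make_fragment(1, "A;20211025;FILE01"),
    make_fragment(2, "B;0001;John\nC;0002;Maria"),
]
result = defragment(arrived)
print(result["parent-1"])
```

Because the ordering comes only from "fragment.index", it does not matter which of the three paths finishes first.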
There are often many ways to solve the same use case using NiFi components. Some design choices are better than others and use fewer resources to accomplish the end goal.
While the above is one solution, I am sure there are others. Cloudera's professional services is a great resource that can help with use case designs.
If you found that this response helped with your query, please take a moment to log in and click on "Accept as Solution" below this post.
Thank you,
Matt
