Best way to parse a fixed-width file using NiFi. Kindly help! @shu @matt burgess
Labels: Apache NiFi
Created ‎07-03-2018 10:56 AM
Created on ‎07-03-2018 11:27 AM - edited ‎08-18-2019 02:41 AM
To parse a fixed-width file you can use the ReplaceText processor: write a search regex that captures each field in its own capture group, then put a delimiter between the groups in the replacement value. If you know how many characters wide each field is, capture each field into one capture group and replace the content with the delimited version. If the fields in your file are separated by a space instead, use the regex (.*)\s(.*) and replace the space with your delimiter.
Change the ReplaceText Evaluation Mode to Line-by-Line. The processor then reads the fixed-width file one line at a time and replaces the contents of the flowfile with delimited lines.
Once the fields are delimited, you can use the ConvertRecord processor to read the data and write it in your required format.
If the file is a couple of gigabytes, it's better to split it into smaller chunks first and then feed the split files to the ReplaceText processor.
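The chunking idea can be sketched in plain Python as follows. This only illustrates splitting a stream of lines into fixed-size batches; it is not how NiFi splits flowfiles, and the function name and the lines_per_chunk parameter are assumptions for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunk_lines(lines: Iterable[str], lines_per_chunk: int) -> Iterator[List[str]]:
    """Yield successive lists of at most lines_per_chunk lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    iterator = iter(lines)
    while True:
        # islice pulls the next batch without loading the whole input.
        chunk = list(islice(iterator, lines_per_chunk))
        if not chunk:
            return
        yield chunk
```

Because the input is consumed lazily, an open file object can be passed in directly, so only one chunk is held in memory at a time.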
In addition, ConvertRecord can be configured with scripted reader and writer controller services (ScriptedReader and ScriptedRecordSetWriter). The reader parses the incoming flowfile using the script you provide, and the writer writes the flowfile contents as defined by the script configured in the scripted writer controller service.
Some references on parsing fixed-width files are here and here.
References on splitting a big CSV file into smaller chunks are here and here.
Created ‎07-04-2018 07:07 AM
Thanks @shu for the timely help. Can you please point me to some docs on implementing NiFi at the enterprise level, especially clustered NiFi setup, performance parameters, and repository setup on Oracle BDA?
Created ‎07-04-2018 08:41 PM
Please refer to this and this link, which describe how to install NiFi as a service, and this link for setting up high-performance NiFi.
