Best way to parse a fixed-width file using NiFi? Kindly help! @shu @matt burgess

3 REPLIES

Master Guru
@Vengai Magan

To parse a fixed-width file you can use the ReplaceText processor: write a regex that captures each field in its own capture group, then put a delimiter between the groups in the replacement value.

80383-fixed-width.png

As shown above, if you know how wide each field is, capture each field in its own capture group and replace the content with the groups separated by a delimiter. If your file is space-delimited instead, use the regex (.*)\s(.*) and replace with a delimiter in the same way.

Change the ReplaceText Evaluation Mode to Line-by-Line. The processor then reads the fixed-width file one line at a time and replaces the flowfile content with the delimited version of each line.
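To make the capture-group idea concrete, here is a minimal Python sketch of the same transformation outside NiFi. The field layout (5-character id, 10-character name, 8-character date) and the function names are assumptions made up for illustration, not taken from the original file. In ReplaceText itself the replacement would be written with back-references such as $1,$2,$3.

```python
import re

# Hypothetical layout, assumed for illustration only:
# 5-char id, 10-char name, 8-char date.
FIXED_WIDTH = re.compile(r"^(.{5})(.{10})(.{8})$")

def to_delimited(line: str, delimiter: str = ",") -> str:
    """Rewrite one fixed-width line as delimited text."""
    match = FIXED_WIDTH.match(line)
    if match is None:
        raise ValueError(f"line does not match the layout: {line!r}")
    # Stripping the padding is a choice made in this sketch; a plain
    # regex replacement would keep the padding inside each field.
    return delimiter.join(field.strip() for field in match.groups())

def convert(text: str, delimiter: str = ",") -> str:
    """Apply the rewrite to every line independently (line by line)."""
    return "\n".join(to_delimited(line, delimiter) for line in text.splitlines())
```

This sketch does not escape delimiters that occur inside a field, so pick a delimiter that cannot appear in the data.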

Once the fields are delimited, you can use the ConvertRecord processor to read the data and write it in your required format.

If the file is a few gigabytes in size, it is better to split it into smaller chunks first and then feed the split files to the ReplaceText processor.
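As a rough illustration of the chunking idea (not of any specific NiFi processor), the Python sketch below groups lines into chunks of a fixed size without loading the whole input into memory. The function name and the chunk size are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List

def split_lines(lines: Iterable[str], lines_per_chunk: int) -> Iterator[List[str]]:
    """Yield successive lists of at most lines_per_chunk lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    iterator = iter(lines)
    while True:
        # islice pulls only the next chunk, so memory use stays bounded.
        chunk = list(islice(iterator, lines_per_chunk))
        if not chunk:
            return
        yield chunk
```

Passing an open file handle as lines streams the file chunk by chunk, and each chunk can then be converted on its own.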

In addition, ConvertRecord can be configured with scripted reader/writer controller services: the scripted reader reads the incoming flowfile using the script you supply, and the flowfile contents are written out according to the ScriptedWriter controller service you have configured.

Some references on parsing fixed-width files are here and here.

References on splitting a big CSV file into smaller chunks are here and here.


Thanks @shu for the timely help. Can you please point me to some docs on implementing NiFi at enterprise level, especially clustered NiFi setup, performance parameters, and repository setup on Oracle BDA?

Master Guru
@Vengai Magan

Please refer to this and this link, which describe how to install NiFi as a service, and to this link on setting up a high-performance NiFi.