NiFi: Convert a proprietary ASCII based format to CSV
- Labels: Apache NiFi
Created ‎04-15-2020 05:32 AM
My data arrives in text (ASCII) files, with each line having a fixed number of fields and each field having a fixed length. I want to convert these files to CSV, like this:
Input line:
abbccc
Output line:
a,bb,ccc
In the example above, "a", "bb", and "ccc" are fields of fixed length, so I always know exactly how to split the input line.
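To make the splitting rule concrete, here is a minimal Python sketch of the per-line conversion. The widths list and the function name are hypothetical, chosen only to match the "abbccc" example. It is plain Python, not a ScriptedReader script, and it assumes the field values never contain commas or quotes (otherwise the standard `csv` module would be the safer way to write the output).

```python
from itertools import accumulate

# Hypothetical field widths matching the "abbccc" example: 1, 2 and 3 characters.
FIELD_WIDTHS = [1, 2, 3]


def fixed_width_to_csv(line, widths=FIELD_WIDTHS):
    """Split one fixed-width line into fields and join them with commas."""
    line = line.rstrip("\r\n")
    if len(line) != sum(widths):
        raise ValueError(
            "expected %d characters, got %d" % (sum(widths), len(line))
        )
    # Cumulative sums of the widths give the end offset of each field;
    # each field starts where the previous one ended.
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return ",".join(line[s:e] for s, e in zip(starts, ends))


if __name__ == "__main__":
    print(fixed_width_to_csv("abbccc"))
```

Raising on a length mismatch is a design choice for this sketch; a real flow might route such lines to a failure relationship instead.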
I looked into the ConvertRecord processor and the ScriptedReader controller service (which can be used as a record reader), but I could not find any example of a Python script for ScriptedReader. I found the ExecuteScript Cookbook by @mburgess, but those recipes are much more general, so I cannot use them in ScriptedReader (which requires the script to create very specific objects for record processing).
Can anyone give a basic example of a Python script that can be used in ScriptedReader to process records?
Alternatively, is there another way to accomplish the task (another processor)? Of course, I could use the ExecuteScript processor and script the processing of complete FlowFiles in it, but my FlowFiles contain millions of records, and I think that would be much less efficient than ScriptedReader.
Created ‎04-15-2020 08:16 AM
You can try the ReplaceText NiFi processor with the approach described here. That is a clean way of doing what you want without much scripting.
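The linked approach is not reproduced in this thread, so what follows is only an assumption about how ReplaceText could be used here: one regex capture group per fixed-width field, with the groups rejoined by commas. The Python sketch below illustrates the regex idea for the "abbccc" example; the pattern and the function name are hypothetical. In NiFi the pattern would go into the Search Value property and something like `$1,$2,$3` into Replacement Value, with Evaluation Mode set to Line-by-Line. NiFi uses Java regex syntax, so back-references are written `$1` rather than Python's `\1`.

```python
import re

# One capture group per fixed-width field (widths 1, 2 and 3, as in the example).
PATTERN = re.compile(r"^(.{1})(.{2})(.{3})$")
# Python back-reference syntax; the Java/NiFi equivalent would be "$1,$2,$3".
REPLACEMENT = r"\1,\2,\3"


def convert_line(line):
    """Insert commas between the fixed-width fields of a single line.

    Lines that do not match the pattern are returned unchanged.
    """
    return PATTERN.sub(REPLACEMENT, line)


if __name__ == "__main__":
    print(convert_line("abbccc"))
```

Because the pattern is anchored at both ends, a line of the wrong length passes through unmodified rather than being split incorrectly, which makes malformed input easier to spot downstream.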
