<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFi: Convert a proprietary ASCII based format to CSV in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/NiFi-Convert-a-proprietary-ASCII-based-format-to-CSV/m-p/294055#M217024</link>
    <description>&lt;P&gt;You can try the NiFi ReplaceText processor with the approach &lt;A href="https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/td-p/102597" target="_self"&gt;described here&lt;/A&gt;. That will be a clean way of doing what you want without much scripting.&lt;/P&gt;</description>
    <pubDate>Wed, 15 Apr 2020 15:16:34 GMT</pubDate>
    <dc:creator>aakulov</dc:creator>
    <dc:date>2020-04-15T15:16:34Z</dc:date>
    <item>
      <title>NiFi: Convert a proprietary ASCII based format to CSV</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-Convert-a-proprietary-ASCII-based-format-to-CSV/m-p/294040#M217009</link>
      <description>&lt;P&gt;My data arrives in text (ASCII) files, with each line having a fixed number of fields and each field having a fixed length. I want to convert these files to CSV, like this:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Input line:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;abbccc&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Output line:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;a,bb,ccc&lt;/P&gt;&lt;P&gt;In the example above, "a", "bb", and "ccc" are fields of fixed length, so I always know exactly how to split the input line.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I looked into the ConvertRecord processor and the ScriptedReader controller service (which can be used as a record reader), but I could not find any example of a Python script for ScriptedReader. I found this &lt;A href="https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922" target="_self"&gt;ExecuteScript Cookbook&lt;/A&gt; by &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38301"&gt;@mburgess&lt;/a&gt;, but those recipes are much more general, so I cannot use them in ScriptedReader (which requires the script to create very specific objects for record processing).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone give a basic example of a Python script that can be used in ScriptedReader to process records?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Alternatively, is there another way to accomplish the task, such as another processor? Of course, I can use the ExecuteScript processor and script the processing of complete FlowFiles in it, but my FlowFiles contain millions of records and I think this will be much less efficient than ScriptedReader.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Apr 2020 12:32:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-Convert-a-proprietary-ASCII-based-format-to-CSV/m-p/294040#M217009</guid>
      <dc:creator>SergiyK</dc:creator>
      <dc:date>2020-04-15T12:32:34Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: Convert a proprietary ASCII based format to CSV</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-Convert-a-proprietary-ASCII-based-format-to-CSV/m-p/294055#M217024</link>
      <description>&lt;P&gt;You can try the NiFi ReplaceText processor with the approach &lt;A href="https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/td-p/102597" target="_self"&gt;described here&lt;/A&gt;. That will be a clean way of doing what you want without much scripting.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Apr 2020 15:16:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-Convert-a-proprietary-ASCII-based-format-to-CSV/m-p/294055#M217024</guid>
      <dc:creator>aakulov</dc:creator>
      <dc:date>2020-04-15T15:16:34Z</dc:date>
    </item>
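    <!--
    Editorial sketch, not part of the original thread. The reply above points to
    ReplaceText; assuming the linked approach is a regular expression with one
    capture group per fixed-width field, the same idea can be tried out in plain
    Python before configuring NiFi. The widths 1, 2 and 3 come from the example
    in the question; FIELD_WIDTHS, build_pattern and to_csv_line are hypothetical
    names. In ReplaceText the equivalent would presumably be a Search Value of
    ^(.{1})(.{2})(.{3})$ and a Replacement Value of $1,$2,$3, evaluated line by
    line; treat those settings as an assumption to verify against the linked post.

```python
import re

# Hypothetical field widths taken from the example in the question (a, bb, ccc).
FIELD_WIDTHS = [1, 2, 3]


def build_pattern(widths):
    """Build a regex with one capture group per fixed-width field."""
    return re.compile("^" + "".join("(.{%d})" % w for w in widths) + "$")


def to_csv_line(line, pattern):
    """Join the captured fields with commas; leave non-matching lines unchanged."""
    match = pattern.match(line)
    if match is None:
        return line
    return ",".join(match.groups())


pattern = build_pattern(FIELD_WIDTHS)
print(to_csv_line("abbccc", pattern))  # prints a,bb,ccc
```

    This sketch does not quote fields that themselves contain commas or quotes;
    real data with such values would need proper CSV escaping.
    -->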
  </channel>
</rss>

