<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to parse w/ fixed width instead of char delimited contents? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/359569#M238154</link>
    <description>&lt;P&gt;This is working fine. Can we provide the&amp;nbsp;&lt;STRONG&gt;Search Value &lt;/STRONG&gt;and&amp;nbsp;&lt;STRONG&gt;&lt;SPAN&gt;Replacement Value &lt;/SPAN&gt;&lt;/STRONG&gt;as a&lt;STRONG&gt;&lt;SPAN&gt; variable or flowfile attribute? &lt;/SPAN&gt;&lt;/STRONG&gt;I want to use the same ReplaceText processor to convert different input files with different numbers of columns. Basically, I want to parameterize the&amp;nbsp;Search Value and&amp;nbsp;Replacement Value in the ReplaceText processor.&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/36110"&gt;@mpayne&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/45731"&gt;@ltsimps1&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/101473"&gt;@kpulagam&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35852"&gt;@jpercivall&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/17198"&gt;@other&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 15 Dec 2022 13:24:37 GMT</pubDate>
    <dc:creator>Pawa</dc:creator>
    <dc:date>2022-12-15T13:24:37Z</dc:date>
    <item>
      <title>How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102597#M65532</link>
      <description>&lt;P&gt;I am trying to parse data from file contents that are generated by fixed width instead of by a delimiter.  As a simplified example, the value for data attribute 1 is in position 1-2, for attribute 2 is in position 3-6, and attribute 3 is in position 7-8 in each line.  Then, the file contents should be transformed as below. &lt;/P&gt;&lt;P&gt;&lt;U&gt;Before &lt;/U&gt;&lt;/P&gt;&lt;P&gt;AABBBBCC &lt;/P&gt;&lt;P&gt;DDEEEEFF &lt;/P&gt;&lt;P&gt;&lt;U&gt;After &lt;/U&gt;&lt;/P&gt;&lt;P&gt;AA;BBBB;CC &lt;/P&gt;&lt;P&gt;DD;EEEE;FF &lt;/P&gt;&lt;P&gt;I assume there may be a way to capture substrings per line? Please assist.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 00:49:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102597#M65532</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T00:49:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102598#M65533</link>
      <description>&lt;P&gt;Kausha,&lt;/P&gt;&lt;P&gt;You can use ReplaceText to do this. In your example above, you can use a Replacement Strategy of "Regex Replace". &lt;/P&gt;&lt;P&gt;Set Evaluation Mode to "Line-by-Line"&lt;/P&gt;&lt;P&gt;The Search Value would then be:&lt;/P&gt;&lt;P&gt;(.{2})(.{4})(.{2})&lt;/P&gt;&lt;P&gt;And the Replacement Value would be:&lt;/P&gt;&lt;P&gt;$1;$2;$3&lt;/P&gt;&lt;P&gt;
Does that help?
&lt;/P&gt;</description>
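      For readers who want to check the pattern outside NiFi, here is a minimal Python sketch of what ReplaceText does with the Search Value and Replacement Value given above. The function name delimit_fixed_width is hypothetical, and the per-line loop only approximates the "Line-by-Line" Evaluation Mode; it is not NiFi's implementation.

```python
import re

# Same Search Value as in the post: three capture groups of width 2, 4 and 2.
FIXED_WIDTH = re.compile(r"(.{2})(.{4})(.{2})")


def delimit_fixed_width(text, replacement=r"\1;\2;\3"):
    """Apply the regex to each line separately, like Line-by-Line mode.

    Python writes back-references as \\1 where NiFi (Java regex) uses $1.
    The function name and default replacement are illustrative assumptions.
    """
    return "\n".join(
        FIXED_WIDTH.sub(replacement, line) for line in text.splitlines()
    )


result = delimit_fixed_width("AABBBBCC\nDDEEEEFF")
print(result)
```

      Changing the widths inside the braces, or adding more capture groups, adapts the same idea to other record layouts.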
      <pubDate>Fri, 15 Jan 2016 00:54:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102598#M65533</guid>
      <dc:creator>mpayne</dc:creator>
      <dc:date>2016-01-15T00:54:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102599#M65534</link>
      <description>&lt;P&gt;This is a bit tricky and will require a bit of regex magic. You will want to capture each length of data attribute (2, 4 and 2 respectively) into capture groups then use those capture groups to replace the content.&lt;/P&gt;&lt;P&gt;You'll use the ReplaceText processor with a search value of "(.{2})(.{4})(.{2})" and a replacement value of "$1;$2;$3" and configure it to evaluate line by line. This will go through the contents of the flowfile line by line and replace the contents like you want.&lt;/P&gt;&lt;P&gt;Comment below if you run into any problems!&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 00:57:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102599#M65534</guid>
      <dc:creator>jpercivall</dc:creator>
      <dc:date>2016-01-15T00:57:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102600#M65535</link>
      <description>&lt;P&gt;Assuming you are okay with using Hive for this, you would just create a table with one column (column name something like row) and then load the whole file into that table. Then run a query to split the columns and insert them into another table. &lt;/P&gt;&lt;P&gt;Here are more details and a code snippet: &lt;A href="https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=21299205" target="_blank"&gt;https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=21299205&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:11:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102600#M65535</guid>
      <dc:creator>bsaini</dc:creator>
      <dc:date>2016-01-15T01:11:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102601#M65536</link>
      <description>&lt;P&gt;Yes. This works well, but is there a way to store the values as attributes?  Ultimately, I want to use the AttributesToJSON processor.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:13:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102601#M65536</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T01:13:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102602#M65537</link>
      <description>&lt;P&gt;I have posted to  &lt;A rel="user" href="https://community.cloudera.com/users/367/mpayne.html" nodeid="367"&gt;@mpayne&lt;/A&gt; also: "Yes. This works well, but is there a way to store the values as attributes?" Ultimately, I want to use the AttributesToJSON processor.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:14:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102602#M65537</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T01:14:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102603#M65538</link>
      <description>&lt;P&gt;I agree with &lt;A rel="user" href="https://community.cloudera.com/users/39/jpercivall.html" nodeid="39"&gt;@jpercivall&lt;/A&gt; and &lt;A rel="user" href="https://community.cloudera.com/users/367/mpayne.html" nodeid="367"&gt;@mpayne&lt;/A&gt; that ReplaceText is the best way to go. I created a quick workflow that you can reference. It assumes the input AABBBBCC, as you suggested. You can change the GetFile path, the PutFile path, and the regex in ReplaceText to test with your real data. &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/1374-fixedwidthexample.xml"&gt;fixedwidthexample.xml&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:14:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102603#M65538</guid>
      <dc:creator>jdyer</dc:creator>
      <dc:date>2016-01-15T01:14:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102604#M65539</link>
      <description>&lt;P&gt;Didn't realize the question was about NiFi. My bad. &lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:16:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102604#M65539</guid>
      <dc:creator>bsaini</dc:creator>
      <dc:date>2016-01-15T01:16:21Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102605#M65540</link>
      <description>&lt;P&gt;Ah so you also want to extract the text to be attributes of the flowfile. Is the structure of the contents only ever two lines and do you want to create JSON using both of those lines or split them into separate flowfiles? &lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:19:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102605#M65540</guid>
      <dc:creator>jpercivall</dc:creator>
      <dc:date>2016-01-15T01:19:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102606#M65541</link>
      <description>&lt;P&gt;The number of rows for the flow files will vary.  Each line of data will represent a record/item.  Also, I want the data output in its original file, not in separate flowfiles.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 01:59:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102606#M65541</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T01:59:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102607#M65542</link>
      <description>&lt;P&gt;You can also use the ExtractText processor and provide it with a regex to pull the values into attributes. For example, you could have:&lt;/P&gt;&lt;P&gt;field1: (.{2}).{6}&lt;/P&gt;&lt;P&gt;field2: .{2}(.{4}).{2}&lt;/P&gt;&lt;P&gt;field3: .{6}(.{2})&lt;/P&gt;&lt;P&gt;This assumes, though, that each FlowFile has only a single line. You could perhaps use SplitText to split each FlowFile into one FlowFile per line. I think we may need more context about what you're trying to accomplish to provide a more detailed answer.&lt;/P&gt;</description>
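      As a rough illustration of the ExtractText idea, the Python sketch below applies the three field patterns from the post to each line and collects the captured values into flat records, which is the shape of the JSON array the original poster asks for later in the thread. The names FIELD_PATTERNS, extract_fields and lines_to_json are hypothetical, and matching from the start of each line is an assumption of this sketch rather than documented ExtractText behavior.

```python
import json
import re

# The three patterns from the post; each captures exactly one field.
FIELD_PATTERNS = {
    "field1": re.compile(r"(.{2}).{6}"),
    "field2": re.compile(r".{2}(.{4}).{2}"),
    "field3": re.compile(r".{6}(.{2})"),
}


def extract_fields(line):
    """Return a flat dict of field name to captured value for one line."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.match(line)  # assumption: match from line start
        if match:
            fields[name] = match.group(1)
    return fields


def lines_to_json(text):
    """Treat every non-empty line as one record and emit a JSON array."""
    records = [extract_fields(line) for line in text.splitlines() if line]
    return json.dumps(records)


output = lines_to_json("AABBBBCC\nDDEEEEFF")
print(output)
```

      Each record stays flat, with no nested fields, which matches the flat attribute-to-JSON mapping discussed in the thread.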
      <pubDate>Fri, 15 Jan 2016 02:04:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102607#M65542</guid>
      <dc:creator>mpayne</dc:creator>
      <dc:date>2016-01-15T02:04:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102608#M65543</link>
      <description>&lt;P&gt;What does your end-goal JSON schema look like? What identifiers will each of the values use (for every line)? As a note, AttributesToJSON only creates flat JSON objects (no nested fields).&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 02:09:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102608#M65543</guid>
      <dc:creator>jpercivall</dc:creator>
      <dc:date>2016-01-15T02:09:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102609#M65544</link>
      <description>&lt;P&gt;I have assumed the following flow: GetFile --&amp;gt; ExtractText --&amp;gt; SplitText --&amp;gt; UpdateAttribute --&amp;gt; AttributesToJSON --&amp;gt; PutFile &lt;/P&gt;&lt;P&gt;I receive an error in PutFile.  Below are my modified configurations &lt;/P&gt;&lt;P&gt;ExtractText - Enable Multiline Mode = True &lt;/P&gt;&lt;P&gt;SplitText - Line Split Count = 1; Header Line Count = 1 &lt;/P&gt;&lt;P&gt;Update Attribute - Properties as suggested
Att1 = (.{2}).{6}; Att2 = .{2}(.{4}).{2}; Att3 = .{6}(.{2}) &lt;/P&gt;&lt;P&gt;AttributesToJSON
Attributes List = Att1, Att2, Att3&lt;/P&gt;&lt;P&gt;What am I missing here?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 02:36:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102609#M65544</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T02:36:52Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102610#M65545</link>
      <description>&lt;P&gt;What error do you see in PutFile?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 04:45:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102610#M65545</guid>
      <dc:creator>mpayne</dc:creator>
      <dc:date>2016-01-15T04:45:48Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102611#M65546</link>
      <description>&lt;P&gt;The output, based on this example, would be a JSON array:&lt;/P&gt;&lt;P&gt;[{"field1":"AA","field2":"BBBB","field3":"CC"},{"field1":"DD","field2":"EEEE","field3":"FF"}]&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 10:07:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102611#M65546</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T10:07:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102612#M65547</link>
      <description>&lt;P&gt;I am able to run the flow when I set the ExtractText --&amp;gt; SplitText connection for both matched and unmatched, but the output is incorrect: {"Att3":" .{6}(.{2})","Att2":".{2}(.{4}).{2}","Att1":"(.{2}).{6}"}.&lt;/P&gt;&lt;P&gt;Would it be more efficient to use the ReplaceTextWithMapping processor?  I am unable to find a template with this processor and a relevant mapping file.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 10:29:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102612#M65547</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T10:29:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102613#M65548</link>
      <description>&lt;P&gt;I have attempted, unsuccessfully, to use the ReplaceText processor.  The method works if I have a small/set number of lines in my file.  &lt;/P&gt;&lt;P&gt;Do you have any guidance on the ReplaceTextWithMapping processor and how the mapping file should be formatted?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 10:35:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102613#M65548</guid>
      <dc:creator>ltsimps1</dc:creator>
      <dc:date>2016-01-15T10:35:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102614#M65549</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am not able to replicate the same example. I am getting $1;$2;$3 as the output every time. I am new to NiFi and am not able to find what I am missing. I think I have not set the properties correctly. &lt;/P&gt;</description>
      <pubDate>Tue, 09 May 2017 03:08:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/102614#M65549</guid>
      <dc:creator>other</dc:creator>
      <dc:date>2017-05-09T03:08:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/356414#M237290</link>
      <description>&lt;P&gt;Is it possible to output some part of the input text?&amp;nbsp;&lt;/P&gt;&lt;P&gt;For Example:&lt;/P&gt;&lt;P&gt;Input: AABBBBCC&lt;/P&gt;&lt;P&gt;Output: AA&lt;/P&gt;</description>
      <pubDate>Fri, 28 Oct 2022 21:58:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/356414#M237290</guid>
      <dc:creator>kpulagam</dc:creator>
      <dc:date>2022-10-28T21:58:40Z</dc:date>
    </item>
    <item>
      <title>Re: How to parse w/ fixed width instead of char delimited contents?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/359569#M238154</link>
      <description>&lt;P&gt;This is working fine. Can we provide the&amp;nbsp;&lt;STRONG&gt;Search Value &lt;/STRONG&gt;and&amp;nbsp;&lt;STRONG&gt;&lt;SPAN&gt;Replacement Value &lt;/SPAN&gt;&lt;/STRONG&gt;as a&lt;STRONG&gt;&lt;SPAN&gt; variable or flowfile attribute? &lt;/SPAN&gt;&lt;/STRONG&gt;I want to use the same ReplaceText processor to convert different input files with different numbers of columns. Basically, I want to parameterize the&amp;nbsp;Search Value and&amp;nbsp;Replacement Value in the ReplaceText processor.&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/36110"&gt;@mpayne&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/45731"&gt;@ltsimps1&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/101473"&gt;@kpulagam&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35852"&gt;@jpercivall&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/17198"&gt;@other&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Dec 2022 13:24:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-parse-w-fixed-width-instead-of-char-delimited/m-p/359569#M238154</guid>
      <dc:creator>Pawa</dc:creator>
      <dc:date>2022-12-15T13:24:37Z</dc:date>
    </item>
  </channel>
</rss>

