<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Nifi Extraction in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179170#M141416</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397" target="_blank"&gt;@regie canada&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Does each request in the file always start with&lt;/P&gt;&lt;PRE&gt;####################################################################### START of Request #######################################################################&lt;/PRE&gt;&lt;P&gt;and end with:&lt;/P&gt;&lt;PRE&gt;####################################################################### END of Request #######################################################################&lt;/PRE&gt;&lt;P&gt;If so, you could use the SplitContent processor to split your incoming FlowFile into multiple FlowFiles (each with a single request).  Then you could parse each of those requests for the lines/values you want.&lt;/P&gt;&lt;P&gt;The SplitContent processor would be configured as follows in this scenario:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="15676-screen-shot-2017-05-23-at-73841-am.png" style="width: 1313px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19409iE568C1691DAF3402/image-size/medium?v=v2&amp;amp;px=400" role="button" title="15676-screen-shot-2017-05-23-at-73841-am.png" alt="15676-screen-shot-2017-05-23-at-73841-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Do the 4 lines you want to extract the values from have a specific format?&lt;/P&gt;&lt;P&gt;For example, do they actually start with "Line", or is that property name always dynamic in nature?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 09:24:05 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2019-08-18T09:24:05Z</dc:date>
    <item>
      <title>Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179166#M141412</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;I would like to ask whether this is possible in NiFi alone, without using any script execution.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/15588-test.txt"&gt;test.txt&lt;/A&gt; &amp;lt;--- this is the file I need to extract values from. Please see the attached file.&lt;/P&gt;&lt;P&gt; I just need to get the &lt;STRONG&gt;VALUE of line1 to line4 and save it to HBase.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;The problem is that one file contains multiple requests. I need to get line1 to line4 for every request.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;PS. The number of lines differs from request to request. My file is just an example of what the file looks like.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 19 May 2017 13:27:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179166#M141412</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-05-19T13:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179167#M141413</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397" target="_blank"&gt;@regie canada&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can use the &lt;STRONG&gt;ExtractText&lt;/STRONG&gt; processor with a regex to pull the first 4 lines.  Your regex would be:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;(.*)\n(.*)\n(.*)\n(.*)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="15596-screen-shot-2017-05-19-at-25904-pm.png" style="width: 1754px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19410iA3A10FCA41DEB31C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="15596-screen-shot-2017-05-19-at-25904-pm.png" alt="15596-screen-shot-2017-05-19-at-25904-pm.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;After that, you can use the &lt;STRONG&gt;SplitText&lt;/STRONG&gt; processor if you want each line to be an individual FlowFile, or you can use the &lt;STRONG&gt;UpdateAttribute&lt;/STRONG&gt; processor to apply any kind of transformation to the 4 lines.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 09:24:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179167#M141413</guid>
      <dc:creator>egarelnabi</dc:creator>
      <dc:date>2019-08-18T09:24:13Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179168#M141414</link>
      <description>&lt;P&gt;Hi sir, thanks for the reply. I need the first 4 lines after every ####################################################################### START of Request ####################################################################### marker, sir. Every file contains multiple requests.&lt;/P&gt;</description>
      <pubDate>Tue, 23 May 2017 10:20:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179168#M141414</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-05-23T10:20:35Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179169#M141415</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/445/egarelnabi.html" nodeid="445"&gt;@Eyad Garelnabi&lt;/A&gt; &lt;/P&gt;</description>
      <pubDate>Tue, 23 May 2017 14:46:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179169#M141415</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-05-23T14:46:12Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179170#M141416</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397" target="_blank"&gt;@regie canada&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Does each request in the file always start with&lt;/P&gt;&lt;PRE&gt;####################################################################### START of Request #######################################################################&lt;/PRE&gt;&lt;P&gt;and end with:&lt;/P&gt;&lt;PRE&gt;####################################################################### END of Request #######################################################################&lt;/PRE&gt;&lt;P&gt;If so, you could use the SplitContent processor to split your incoming FlowFile into multiple FlowFiles (each with a single request).  Then you could parse each of those requests for the lines/values you want.&lt;/P&gt;&lt;P&gt;The SplitContent processor would be configured as follows in this scenario:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="15676-screen-shot-2017-05-23-at-73841-am.png" style="width: 1313px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19409iE568C1691DAF3402/image-size/medium?v=v2&amp;amp;px=400" role="button" title="15676-screen-shot-2017-05-23-at-73841-am.png" alt="15676-screen-shot-2017-05-23-at-73841-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Do the 4 lines you want to extract the values from have a specific format?&lt;/P&gt;&lt;P&gt;For example, do they actually start with "Line", or is that property name always dynamic in nature?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 09:24:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179170#M141416</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2019-08-18T09:24:05Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179171#M141417</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/103653/nifi-extraction.html#"&gt;@regie canada&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As Matt suggested below, use the &lt;STRONG&gt;SplitContent &lt;/STRONG&gt;processor to split the file into multiple, smaller FlowFiles.  The "Byte Sequence" property for the split would be:&lt;/P&gt;&lt;P&gt;####################################################################### START of Request #######################################################################&lt;/P&gt;&lt;P&gt;After that, use the &lt;STRONG&gt;ExtractText &lt;/STRONG&gt;processor, as described in my response above, to get the first 4 lines of each FlowFile generated by the &lt;STRONG&gt;SplitContent &lt;/STRONG&gt;processor.&lt;/P&gt;</description>
      <pubDate>Wed, 24 May 2017 00:20:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179171#M141417</guid>
      <dc:creator>egarelnabi</dc:creator>
      <dc:date>2017-05-24T00:20:10Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179172#M141418</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397"&gt;@regie canada&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I agree. My answer was only intended to show how to split your multi-record file into single records, which can then be processed along the lines of the approach suggested by &lt;A rel="user" href="https://community.cloudera.com/users/445/egarelnabi.html" nodeid="445"&gt;@Eyad Garelnabi&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Wed, 24 May 2017 00:25:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179172#M141418</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2017-05-24T00:25:00Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179173#M141419</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@Matt Clarke&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/445/egarelnabi.html" nodeid="445"&gt;@Eyad Garelnabi&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Thank you so much!! &lt;/P&gt;</description>
      <pubDate>Wed, 24 May 2017 12:13:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179173#M141419</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-05-24T12:13:53Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179174#M141420</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@Matt Clarke&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Thanks sir.  As an additional question, is there a processor that will convert this 
Line1 : value
Line2 : value
Line3 : value
Line4 : value&lt;/P&gt;&lt;P&gt;to JSON format?&lt;/P&gt;&lt;P&gt;Thanks again.&lt;/P&gt;</description>
      <pubDate>Wed, 24 May 2017 14:13:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179174#M141420</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-05-24T14:13:29Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi Extraction</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179175#M141421</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397"&gt;@regie canada&lt;/A&gt;&lt;P&gt;The ExtractText processor creates FlowFile attributes from the extracted text.  NiFi has an AttributesToJSON processor you can use to generate JSON from these created attributes.&lt;/P&gt;&lt;P&gt;For new questions, please open a new question.  It makes it easier for community users to search for answers.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Wed, 24 May 2017 19:06:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-Extraction/m-p/179175#M141421</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2017-05-24T19:06:03Z</dc:date>
    </item>
  </channel>
</rss>

