<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Parse file in Nifi ? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188650#M150743</link>
    <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="48405-screenshot-from-2018-01-12-100503.png" style="width: 1366px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18627i7B4AA2B1170881C1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="48405-screenshot-from-2018-01-12-100503.png" alt="48405-screenshot-from-2018-01-12-100503.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;, my input looks like this. I now want to parse this data as described above.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 07:51:11 GMT</pubDate>
    <dc:creator>sshringi003</dc:creator>
    <dc:date>2019-08-18T07:51:11Z</dc:date>
    <item>
      <title>Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188649#M150742</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am getting an error while parsing a CSV file with an embedded JSON column in NiFi. My file has columns like this:&lt;/P&gt;&lt;P&gt;Name : Surendra &lt;/P&gt;&lt;P&gt;Age : 24&lt;/P&gt;&lt;P&gt;Address : {"city":"Chennai","state":"TN","zipcode":"600345"}&lt;/P&gt;&lt;P&gt;The output should look like this:&lt;/P&gt;&lt;P&gt;Name : Surendra &lt;/P&gt;&lt;P&gt;Age : 24&lt;/P&gt;&lt;P&gt;Address_city : Chennai&lt;/P&gt;&lt;P&gt;Address_state : TN&lt;/P&gt;&lt;P&gt;Address_zipcode : 600345&lt;/P&gt;&lt;P&gt;Can anyone please help me with this?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2018 10:31:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188649#M150742</guid>
      <dc:creator>sshringi003</dc:creator>
      <dc:date>2018-01-12T10:31:56Z</dc:date>
    </item>
    <item>
      <title>Re: Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188650#M150743</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="48405-screenshot-from-2018-01-12-100503.png" style="width: 1366px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18627i7B4AA2B1170881C1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="48405-screenshot-from-2018-01-12-100503.png" alt="48405-screenshot-from-2018-01-12-100503.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;, my input looks like this. I now want to parse this data as described above.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 07:51:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188650#M150743</guid>
      <dc:creator>sshringi003</dc:creator>
      <dc:date>2019-08-18T07:51:11Z</dc:date>
    </item>
    <item>
      <title>Re: Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188651#M150744</link>
      <description>&lt;P&gt;I want to fetch this data from MySQL, so I created a table named input in MySQL.&lt;/P&gt;&lt;P&gt;My flow looks like this: ExecuteSQL -&amp;gt;&amp;gt; SplitAvro -&amp;gt;&amp;gt; ConvertAvroToJson -&amp;gt;&amp;gt; EvaluateJsonPath -&amp;gt;&amp;gt; UpdateAttribute&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2018 12:46:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188651#M150744</guid>
      <dc:creator>sshringi003</dc:creator>
      <dc:date>2018-01-12T12:46:48Z</dc:date>
    </item>
    <item>
      <title>Re: Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188652#M150745</link>
      <description>&lt;P&gt;Thanks @shu for your reply. I am looking for the same output that you sent me. I want the output in a CSV file like Surendra,24,Chennai,TN,600345, and the output will be stored on the local machine only.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Jan 2018 11:42:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188652#M150745</guid>
      <dc:creator>sshringi003</dc:creator>
      <dc:date>2018-01-13T11:42:32Z</dc:date>
    </item>
    <item>
      <title>Re: Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188653#M150746</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/52190/sshringi003.html" nodeid="52190" target="_blank"&gt;@Surendra Shringi&lt;/A&gt;&lt;P&gt;We can do this parsing inside NiFi by using the SplitText, ExtractText, ReplaceText and MergeContent processors.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Example:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Let's assume your &lt;STRONG&gt;csv&lt;/STRONG&gt; file has n rows in it:&lt;/P&gt;&lt;PRE&gt;Surendra,24,"{"city":"Chennai","state":"TN","zipcode":"600345"}"
Surendra,25,"{"city":"Chennai","state":"TN","zipcode":"609345"}"&lt;/PRE&gt;&lt;P&gt;We need to split this file into individual flowfiles, with one record per flowfile. For splitting we use the&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;SplitText:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;processor with the config below.&lt;/P&gt;&lt;P&gt;Line Split Count
&lt;/P&gt;&lt;PRE&gt;1&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="49399-splittext.png" style="width: 1387px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18623i243F0042EE429F77/image-size/medium?v=v2&amp;amp;px=400" role="button" title="49399-splittext.png" alt="49399-splittext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;So if the input csv has 2 lines, the SplitText processor splits it into 2 flowfiles, each holding one line.&lt;/P&gt;&lt;P&gt;Once each record is in its own flowfile, we use&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;ExtractText:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;to extract the content of the flowfile into attributes, by adding the new properties below to the processor.&lt;/P&gt;&lt;P&gt;Address_city
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;"city":"(.*?)"&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;Address_state
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;"state":"(.*?)"&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;Address_zipcode
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;"zipcode":"(.*?)"&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;Age
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;,(.*?),&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;Name&lt;/P&gt;&lt;PRE&gt;^(.*?),&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="49400-extracttext.png" style="width: 1838px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18624i03DB79B5A572E2E9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="49400-extracttext.png" alt="49400-extracttext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;In this processor we extract the contents of the flowfile and keep them as flowfile attributes by adding a matching regex for each property.&lt;/P&gt;&lt;P&gt;To create and test regex click &lt;A target="_blank" href="https://regex101.com/" rel="nofollow noopener noreferrer"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;You need to change the Maximum Buffer Size value (default is 1MB) based on your flowfile size.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Replace Text Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;In the previous step we &lt;STRONG&gt;extracted all the contents of the flowfile&lt;/STRONG&gt; into attributes based on the &lt;STRONG&gt;properties&lt;/STRONG&gt;. In the ReplaceText processor we create a &lt;STRONG&gt;new csv file with comma delimiter&lt;/STRONG&gt; (you can use any delimiter you want) by changing the properties below and setting the Replacement Value property as follows.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Search Value
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;(?s)(^.*$)&lt;/PRE&gt;
&lt;/DIV&gt;&lt;P&gt;Replacement Value
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;${Name},${Age},${Address_city},${Address_state},${Address_zipcode}&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;Maximum Buffer Size
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;1 MB&lt;/PRE&gt;
&lt;/DIV&gt;&lt;P&gt;Replacement Strategy
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;Always Replace&lt;/PRE&gt;
&lt;/DIV&gt;&lt;P&gt;Evaluation Mode
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;Entire text&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="49398-replacetext.png" style="width: 1957px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18625iC7F7FD8F9FE0F63F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="49398-replacetext.png" alt="49398-replacetext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;So the output of the ReplaceText processor would be &lt;/P&gt;&lt;PRE&gt;Surendra,24,Chennai,TN,600345&lt;/PRE&gt;&lt;PRE&gt;Surendra,25,Chennai,TN,609345&lt;/PRE&gt;&lt;P&gt;We have now created csv output without the json message, but we end up with 2 csv files (because our input data has 2 lines). If your input file has 1000 lines, you will end up with 1000 output csv files.&lt;/P&gt;&lt;P&gt;If you don't want 2 output files and want to merge them into 1 output file, you need to use the&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Merge Content Processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;With the below configs,&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="49401-mergecontent.png" style="width: 1969px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18626i5670AF83533BB993/image-size/medium?v=v2&amp;amp;px=400" role="button" title="49401-mergecontent.png" alt="49401-mergecontent.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;You need to change all the highlighted properties as per your requirements. My config shows a Max Bin Age of 1 min, so the processor waits for 1 minute before merging all the queued flowfiles into 1 file.&lt;/P&gt;&lt;P&gt;Set the &lt;STRONG&gt;Delimiter Strategy to Text&lt;/STRONG&gt; (default is Filename) because &lt;STRONG&gt;the contents of each individual flowfile need to be added as new lines in the merged file, so we set the Demarcator property to a newline with Shift+Enter (this property places each flowfile's content on its own line).&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1 file having both records in it&lt;/P&gt;&lt;PRE&gt;Surendra,24,Chennai,TN,600345
Surendra,25,Chennai,TN,609345&lt;/PRE&gt;&lt;P&gt;I highly suggest you refer to the link below to get familiar with all the properties of the MergeContent processor.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.html" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I'm attaching the xml to the post; you can save the xml, import it into NiFi and make changes to it accordingly. &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/49402-parse-file-nifi-159780.xml" target="_blank"&gt;parse-file-nifi-159780.xml&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If the answer helped to resolve your issue, &lt;STRONG&gt;click on the Accept button below to accept the answer&lt;/STRONG&gt;. That would be a great help to Community users to find solutions quickly for these kinds of errors.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 07:51:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188653#M150746</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T07:51:03Z</dc:date>
    </item>
    <item>
      <title>Re: Parse file in Nifi ?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188654#M150747</link>
      <description>&lt;P&gt;Thanks for your overwhelming response, this will help me a great deal.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Jan 2018 13:03:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Parse-file-in-Nifi/m-p/188654#M150747</guid>
      <dc:creator>sshringi003</dc:creator>
      <dc:date>2018-01-13T13:03:35Z</dc:date>
    </item>
  </channel>
</rss>

