<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFi: JSON to CSV to Hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228517#M72615</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;, yes, that was exactly the problem. The individual CSVs are now created just fine, but in the meantime another problem has occurred: when the individual CSVs are merged with the MergeContent processor, the merged CSV is all on one line instead of separate lines. Is there a way to fix this?&lt;/P&gt;&lt;P&gt;MergeContent:&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44442-merge.png" style="width: 1778px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15179iD95E49EDC8FB0FDE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44442-merge.png" alt="44442-merge.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 01:06:11 GMT</pubDate>
    <dc:creator>foivos</dc:creator>
    <dc:date>2019-08-18T01:06:11Z</dc:date>
    <item>
      <title>NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228512#M72610</link>
      <description>&lt;P&gt;I have a use case where JSON files are read from an API, transformed to CSV, and imported into Hive tables; however, my flow fails at the ReplaceText processor. Can you give some advice on the configuration of the processor or on where my approach fails?&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;InvokeHTTP --&amp;gt; EvaluateJsonPath --&amp;gt; ReplaceText --&amp;gt; MergeContent --&amp;gt; UpdateAttribute --&amp;gt; PutHDFS&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;My flow makes several HTTP calls with InvokeHTTP (each call with a different ID), extracts attributes from each JSON document that is returned (each one is unique), and then creates the CSVs in the ReplaceText processor as follows:&lt;/P&gt;&lt;P&gt;${attribute1},${attribute2},${attribute3},${attribute4},${attribute5},${attribute6},${attribute7}&lt;/P&gt;&lt;P&gt;However, after the MergeContent processor, the merged CSV contains a lot of duplicate data, even though all incoming JSON documents contain unique data.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="45405-capture.png" style="width: 1965px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15187i5532126FBED8A0CC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="45405-capture.png" alt="45405-capture.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="45404-repltext.png" style="width: 1969px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15188i57B32EEA1129F73E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="45404-repltext.png" alt="45404-repltext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="45403-capture.png" style="width: 1965px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15189iFE36D75E364E4EE7/image-size/medium?v=v2&amp;amp;px=400" 
role="button" title="45403-capture.png" alt="45403-capture.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="45401-capture.png" style="width: 1965px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15190iEBF94450918AB740/image-size/medium?v=v2&amp;amp;px=400" role="button" title="45401-capture.png" alt="45401-capture.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:07:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228512#M72610</guid>
      <dc:creator>foivos</dc:creator>
      <dc:date>2019-08-18T01:07:19Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228513#M72611</link>
      <description>&lt;P&gt;I have no idea why my screenshots are double-posted; whatever I try to fix it fails &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 13 Dec 2017 20:42:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228513#M72611</guid>
      <dc:creator>foivos</dc:creator>
      <dc:date>2017-12-13T20:42:39Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228514#M72612</link>
      <description>&lt;P&gt;Can you share an example or two of incoming JSON data, your config for EvaluateJSONPath, and an example of the flow file after MergeContent (perhaps setting number of entries much lower to fit here)?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Dec 2017 21:00:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228514#M72612</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2017-12-13T21:00:40Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228515#M72613</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/641/mburgess.html" nodeid="641" target="_blank"&gt;@Matt Burgess&lt;/A&gt;, here is an example of the incoming JSON files; all have the same attributes:&lt;/P&gt;&lt;PRE&gt;{
 "features": [
  {
   "feature": {
    "paths": [
     [
      [
       214985.27600000054,
       427573.33100000024
      ],
      [
       215011.98900000006,
       427568.84200000018
      ],
      [
       215035.35300000012,
       427565.00499999896
      ],
      [
       215128.48900000006,
       427549.4290000014
      ],
      [
       215134.43699999899,
       427548.65599999949
      ],
      [
       215150.86800000072,
       427546.87900000066
      ],
      [
       215179.33199999854,
       427544.19799999893
      ]
     ]
    ]
   },
   "attributes": {
    "attribute1": "value",
    "attribute2": "value",
    "attribute3": "value",
    "attribute4": "value"
   }
  }
 ]
}
&lt;BR /&gt;&lt;/PRE&gt;&lt;P&gt;EvaluateJsonPath:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44434-path.png" style="width: 1037px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15185i7F67A9E484166150/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44434-path.png" alt="44434-path.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Here I add a property for each attribute I want to parse:&lt;/P&gt;&lt;P&gt;attribute1: $.features[0].attributes.attribute1 etc. etc.&lt;/P&gt;&lt;P&gt;ReplaceText:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44435-repltext.png" style="width: 1729px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15186i89FE68F3F2EEA1A4/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44435-repltext.png" alt="44435-repltext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I think something is wrong in my configuration here, because even before MergeContent the single CSVs created per JSON file contain hundreds of duplicate rows, whereas each CSV should contain just one row, to be merged later into one big CSV file.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:06:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228515#M72613</guid>
      <dc:creator>foivos</dc:creator>
      <dc:date>2019-08-18T01:06:55Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228516#M72614</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/24249/foivos.html" nodeid="24249" target="_blank"&gt;@balalaika&lt;/A&gt;&lt;P&gt;I suspect the duplicates come from the ReplaceText processor, which you have configured with &lt;/P&gt;&lt;P&gt;Evaluation Mode
&lt;/P&gt;&lt;PRE&gt;Line-by-Line&lt;/PRE&gt;&lt;P&gt;That means that if your JSON spans more than one line, the ReplaceText processor replaces each line in turn with the &lt;/P&gt;&lt;P&gt;Replacement Value
&lt;/P&gt;&lt;PRE&gt;${attribute1}${attribute2}${attribute3}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Example:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Input:-&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{
"features": [{
"feature": {
"paths": [[[214985.27600000054,
427573.33100000024],
[215011.98900000006,
427568.84200000018],
[215035.35300000012,
427565.00499999896],
[215128.48900000006,
427549.4290000014],
[215134.43699999899,
427548.65599999949],
[215150.86800000072,
427546.87900000066],
[215179.33199999854,
427544.19799999893]]]
},
"attributes": {
"attribute1": "value",
"attribute2": "value",
"attribute3": "value",
"attribute4": "value"

}
}]
}&lt;/PRE&gt;&lt;P&gt;This input JSON message &lt;STRONG&gt;has 27 lines&lt;/STRONG&gt;, and my EvaluateJsonPath configuration is the same as the one you described in the comments.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44436-eval-json-attribute.png" style="width: 1972px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15180i211681A65E5738AD/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44436-eval-json-attribute.png" alt="44436-eval-json-attribute.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Replace Text Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44437-replace.png" style="width: 1808px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15181iBCB691605E3F0578/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44437-replace.png" alt="44437-replace.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44438-output.png" style="width: 815px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15182i94CB1BCC2CFC6329/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44438-output.png" alt="44438-output.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The &lt;STRONG&gt;output has 27 lines because the Evaluation Mode is Line-by-Line.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;If you change the Evaluation Mode to&lt;STRONG&gt; Entire text, then&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44439-output.png" style="width: 832px;"&gt;&lt;img 
src="https://community.cloudera.com/t5/image/serverpage/image-id/15183iABACAE4A1512ED72/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44439-output.png" alt="44439-output.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If instead your JSON message is on a single line, i.e.&lt;/P&gt;&lt;PRE&gt;{"features":[{"feature":{"paths":[[[214985.27600000054,427573.33100000024],[215011.98900000006,427568.84200000018],[215035.35300000012,427565.00499999896],[215128.48900000006,427549.4290000014],[215134.43699999899,427548.65599999949],[215150.86800000072,427546.87900000066],[215179.33199999854,427544.19799999893]]]},"attributes":{"attribute1":"value","attribute2":"value","attribute3":"value","attribute4":"value"}}]}&lt;/PRE&gt;&lt;P&gt;then &lt;STRONG&gt;it doesn't matter whether ReplaceText uses Line-by-Line or Entire text&lt;/STRONG&gt;, because the processor receives just &lt;STRONG&gt;one line&lt;/STRONG&gt; as input, and the result from&lt;STRONG&gt; ReplaceText &lt;/STRONG&gt;is &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44440-output.png" style="width: 832px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15184iC7802AD30248BC70/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44440-output.png" alt="44440-output.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Try changing the configuration to match your input JSON message and run the processor again.&lt;/P&gt;&lt;P&gt;Let us know if the processor still produces duplicate data. &lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:06:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228516#M72614</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T01:06:42Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228517#M72615</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;, yes, that was exactly the problem. The individual CSVs are now created just fine, but in the meantime another problem has occurred: when the individual CSVs are merged with the MergeContent processor, the merged CSV is all on one line instead of separate lines. Is there a way to fix this?&lt;/P&gt;&lt;P&gt;MergeContent:&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44442-merge.png" style="width: 1778px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15179iD95E49EDC8FB0FDE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44442-merge.png" alt="44442-merge.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:06:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228517#M72615</guid>
      <dc:creator>foivos</dc:creator>
      <dc:date>2019-08-18T01:06:11Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: JSON to CSV to Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228518#M72616</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/24249/foivos.html" nodeid="24249" target="_blank"&gt;@balalaika&lt;BR /&gt;&lt;/A&gt;In that case you need to set the &lt;STRONG&gt;Demarcator property to a newline&lt;/STRONG&gt;, which you enter in the property value with&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;Shift+enter&lt;/PRE&gt;&lt;STRONG&gt;&lt;U&gt;Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="44443-merge.png" style="width: 1233px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15178iF859D69ABE4340E1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="44443-merge.png" alt="44443-merge.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;For more on MergeContent, see:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.html" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:06:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-JSON-to-CSV-to-Hive/m-p/228518#M72616</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T01:06:04Z</dc:date>
    </item>
  </channel>
</rss>

