<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to scale SplitJson queues? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182433#M144599</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/87243/bruno.html" nodeid="87243" target="_blank"&gt;@Bruno Gomes de Souza&lt;/A&gt;&lt;P&gt;Use record-oriented processors to split the JSON array.&lt;/P&gt;&lt;P&gt;Try the approach below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="79457-flow.png" style="width: 1351px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19166i30C4723BB04515FC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="79457-flow.png" alt="79457-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Feed the success relationship to the &lt;STRONG&gt;SplitRecord&lt;/STRONG&gt; processor, then configure a &lt;STRONG&gt;Record Reader&lt;/STRONG&gt; controller service to read the flowfile contents and set the Record Writer to &lt;STRONG&gt;JsonRecordSetWriter&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Set the Records Per Split property to 1 and &lt;STRONG&gt;feed only the splits relationship&lt;/STRONG&gt; from the SplitRecord processor to the PublishGCPubSub processor.&lt;/P&gt;&lt;P&gt;If you run into out-of-memory (OOM) issues, use a series of SplitRecord processors that reduce the split size in stages until each flowfile holds 1 record.&lt;/P&gt;&lt;P&gt;Refer to &lt;A href="https://community.hortonworks.com/content/kbentry/144771/ingesting-a-big-csv-file-into-kafka-using-a-multi.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; and &lt;A href="https://community.hortonworks.com/questions/122858/nifi-splittext-big-file.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; link for details on using a series of split processors.&lt;/P&gt;&lt;P&gt;Refer to &lt;A href="https://community.hortonworks.com/articles/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; link for details on configuring Record Reader/Writer controller services.&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 08:54:37 GMT</pubDate>
    <dc:creator>Shu_ashu</dc:creator>
    <dc:date>2019-08-18T08:54:37Z</dc:date>
    <item>
      <title>How to scale SplitJson queues?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182432#M144598</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="80478-nifi-test-template-scale.png" style="width: 717px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19167i44459EFEB0DEA06A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="80478-nifi-test-template-scale.png" alt="80478-nifi-test-template-scale.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I have a flow that captures data from a DBMS =&amp;gt; converts to AvroJSON =&amp;gt; SplitJson =&amp;gt; publishes to Google Pub/Sub.&lt;/P&gt;&lt;P&gt;But flowfiles are accumulating in the queues, and I would like to scale out the split step (for example, with 3 processors) and then the publish to Google. Is that possible?&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 08:54:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182432#M144598</guid>
      <dc:creator>bruno1</dc:creator>
      <dc:date>2019-08-18T08:54:45Z</dc:date>
    </item>
    <item>
      <title>Re: How to scale SplitJson queues?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182433#M144599</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/87243/bruno.html" nodeid="87243" target="_blank"&gt;@Bruno Gomes de Souza&lt;/A&gt;&lt;P&gt;Use record-oriented processors to split the JSON array.&lt;/P&gt;&lt;P&gt;Try the approach below:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="79457-flow.png" style="width: 1351px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19166i30C4723BB04515FC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="79457-flow.png" alt="79457-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Feed the success relationship to the &lt;STRONG&gt;SplitRecord&lt;/STRONG&gt; processor, then configure a &lt;STRONG&gt;Record Reader&lt;/STRONG&gt; controller service to read the flowfile contents and set the Record Writer to &lt;STRONG&gt;JsonRecordSetWriter&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Set the Records Per Split property to 1 and &lt;STRONG&gt;feed only the splits relationship&lt;/STRONG&gt; from the SplitRecord processor to the PublishGCPubSub processor.&lt;/P&gt;&lt;P&gt;If you run into out-of-memory (OOM) issues, use a series of SplitRecord processors that reduce the split size in stages until each flowfile holds 1 record.&lt;/P&gt;&lt;P&gt;Refer to &lt;A href="https://community.hortonworks.com/content/kbentry/144771/ingesting-a-big-csv-file-into-kafka-using-a-multi.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; and &lt;A href="https://community.hortonworks.com/questions/122858/nifi-splittext-big-file.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; link for details on using a series of split processors.&lt;/P&gt;&lt;P&gt;Refer to &lt;A href="https://community.hortonworks.com/articles/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html" target="_blank" rel="nofollow noopener noreferrer"&gt;this&lt;/A&gt; link for details on configuring Record Reader/Writer controller services.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 08:54:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182433#M144599</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T08:54:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to scale SplitJson queues?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182434#M144600</link>
      <description>&lt;P&gt;Thanks very much &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jul 2018 03:04:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-scale-SplitJson-queues/m-p/182434#M144600</guid>
      <dc:creator>bruno1</dc:creator>
      <dc:date>2018-07-13T03:04:21Z</dc:date>
    </item>
  </channel>
</rss>

