<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to split large json file into multiple json files in Nifi? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364136#M239107</link>
    <description>&lt;P&gt;We have a large JSON file of more than 100 GB that we want to split into multiple files. We used the &lt;STRONG&gt;SplitText&lt;/STRONG&gt; processor to split it into multiple files by specifying Line Split Count. Is there any way to pass an attribute/variable to &lt;STRONG&gt;Line Split Count&lt;/STRONG&gt; and split the records based on it? Currently &lt;STRONG&gt;Line Split Count&lt;/STRONG&gt; does not support attributes/variables.&lt;/P&gt;&lt;P&gt;Kindly suggest whether there is another approach to split these JSON files based on attributes/variables.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="rahul_loke_0-1676654464405.png" style="width: 612px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/36787i361F12B4AFC46D32/image-dimensions/612x201?v=v2" width="612" height="201" role="button" title="rahul_loke_0-1676654464405.png" alt="rahul_loke_0-1676654464405.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Sample JSON file:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;{"name": "John","lastName": "Wick","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;{"name": "Paul","lastName": "Jackson","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;{"name": "Paul","lastName": "Jackson","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 17 Feb 2023 17:47:56 GMT</pubDate>
    <dc:creator>rahul_loke</dc:creator>
    <dc:date>2023-02-17T17:47:56Z</dc:date>
    <item>
      <title>How to split large json file into multiple json files in Nifi?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364136#M239107</link>
      <description>&lt;P&gt;We have a large JSON file of more than 100 GB that we want to split into multiple files. We used the &lt;STRONG&gt;SplitText&lt;/STRONG&gt; processor to split it into multiple files by specifying Line Split Count. Is there any way to pass an attribute/variable to &lt;STRONG&gt;Line Split Count&lt;/STRONG&gt; and split the records based on it? Currently &lt;STRONG&gt;Line Split Count&lt;/STRONG&gt; does not support attributes/variables.&lt;/P&gt;&lt;P&gt;Kindly suggest whether there is another approach to split these JSON files based on attributes/variables.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="rahul_loke_0-1676654464405.png" style="width: 612px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/36787i361F12B4AFC46D32/image-dimensions/612x201?v=v2" width="612" height="201" role="button" title="rahul_loke_0-1676654464405.png" alt="rahul_loke_0-1676654464405.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Sample JSON file:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;{"name": "John","lastName": "Wick","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;{"name": "Paul","lastName": "Jackson","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;{"name": "Paul","lastName": "Jackson","phoneNumber": "123123123"}&lt;/SPAN&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 17 Feb 2023 17:47:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364136#M239107</guid>
      <dc:creator>rahul_loke</dc:creator>
      <dc:date>2023-02-17T17:47:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to split large json file into multiple json files in Nifi?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364184#M239118</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Try looking into the QueryRecord or PartitionRecord processors. Those might help.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sun, 19 Feb 2023 16:16:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364184#M239118</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2023-02-19T16:16:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to split large json file into multiple json files in Nifi?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364473#M239176</link>
      <description>&lt;P&gt;Neither &lt;STRONG&gt;QueryRecord&lt;/STRONG&gt; nor &lt;STRONG&gt;PartitionRecord&lt;/STRONG&gt; fits this use case; I have tried both. Can the &lt;STRONG&gt;SplitRecord&lt;/STRONG&gt; processor be used for this purpose? If yes, can you provide an example based on the sample records above?&lt;/P&gt;</description>
      <pubDate>Thu, 23 Feb 2023 16:37:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364473#M239176</guid>
      <dc:creator>rahul_loke</dc:creator>
      <dc:date>2023-02-23T16:37:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to split large json file into multiple json files in Nifi?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364762#M239227</link>
      <description>&lt;P&gt;Yes, SplitRecord is what you should use.&lt;BR /&gt;Attached is a flow definition as an example.&lt;/P&gt;&lt;P&gt;Note that I had to rename the file with a "txt" extension; once you download it, rename it to a .json extension.&lt;/P&gt;&lt;P&gt;You can then drag a process group onto the canvas, which gives you an option to upload the flow definition.&lt;/P&gt;&lt;P&gt;That example generates a file with 102 records. SplitRecord uses a JsonTreeReader and splits by 3 records, writing 3 records per FlowFile and generating 34 FlowFiles:&lt;/P&gt;&lt;P&gt;102 / 3 = 34&lt;/P&gt;&lt;P&gt;In your case, and based on your screenshot, I would change the split count to 1500000 (or another number based on your needs).&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 17:25:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/364762#M239227</guid>
      <dc:creator>DigitalPlumber</dc:creator>
      <dc:date>2023-02-27T17:25:20Z</dc:date>
    </item>
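The record-count split described in the SplitRecord reply above can be illustrated outside NiFi with a short Python sketch. This is not NiFi's implementation; `split_records` and `records_per_split` are hypothetical names chosen for illustration, and the input is assumed to hold one JSON record per line, as in the sample from the question.

```python
import itertools
import json


def split_records(lines, records_per_split):
    """Yield lists of up to records_per_split non-blank lines (one JSON record per line)."""
    if not records_per_split >= 1:
        raise ValueError("records_per_split must be at least 1")
    # Lazily drop blank lines so a very large input is never held in memory at once.
    records = (line.rstrip("\n") for line in lines if line.strip())
    while True:
        chunk = list(itertools.islice(records, records_per_split))
        if not chunk:
            return
        yield chunk


# Hypothetical input: 102 newline-delimited records shaped like the sample in the question.
record = {"name": "Paul", "lastName": "Jackson", "phoneNumber": "123123123"}
sample = [json.dumps(record) + "\n"] * 102

# The split size is an ordinary variable here, so it could be read from configuration.
records_per_split = 3
chunks = list(split_records(sample, records_per_split))
print(len(chunks))  # 34, matching 102 / 3 = 34
```

Because the function consumes any iterable of lines, an open file handle could be passed instead of the in-memory list, and each chunk written to its own output file.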
    <item>
      <title>Re: How to split large json file into multiple json files in Nifi?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/365156#M239297</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/103565"&gt;@rahul_loke&lt;/a&gt;&amp;nbsp;Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 02 Mar 2023 17:42:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-large-json-file-into-multiple-json-files-in/m-p/365156#M239297</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2023-03-02T17:42:37Z</dc:date>
    </item>
  </channel>
</rss>

