<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: [Apache Nifi] Split a flowfile based on json-attribute of each record in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198251#M71395</link>
    <description>&lt;P&gt;It appears you want the destination path to be the value of type, followed by the value of id, followed by data.txt, and the content of that file to be a JSON array of the objects that share those values. If that is the case:&lt;/P&gt;&lt;P&gt;As of NiFi 1.3.0, there is a PartitionRecord processor which will do most of what you want. You can create a JSON reader (such as JsonTreeReader) using the following example schema:&lt;/P&gt;&lt;PRE&gt;{"type":"record","name":"test","namespace":"nifi",
  "fields": [
    {"name":"type","type":"string"},
    {"name":"id","type":"string"},
    {"name":"content","type":"string"}
  ]
}&lt;/PRE&gt;&lt;P&gt;You can also create a JsonRecordSetWriter that inherits the schema (as of NiFi 1.4.0) or uses the same one (prior to NiFi 1.4.0). Then in PartitionRecord you would create two user-defined properties, say record.type and record.id, each set to a RecordPath for the corresponding field (for example /type and /id), configured as follows:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="43617-partitionrecordexample.png" style="width: 518px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/17785iB442F4E24FEA8B3A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="43617-partitionrecordexample.png" alt="43617-partitionrecordexample.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Given your example data, you will get 4 flow files, one for each of the 4 groups you mention above. Each flow file also carries record.type and record.id attributes. You can route them to UpdateAttribute, where you set filename to data.txt and absolute.path to /${record.type}/${record.id}. Then you can send them to PutHDFS, where you set the Directory to ${absolute.path}.&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 06:11:10 GMT</pubDate>
    <dc:creator>mburgess</dc:creator>
    <dc:date>2019-08-18T06:11:10Z</dc:date>
    <item>
      <title>[Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198250#M71394</link>
      <description>&lt;P&gt;
	I have a very simple use case but am unable to come up with the right combination of processors. Consider a flowfile coming from an HDFS JSON file with the following content:&lt;/P&gt;&lt;PRE&gt;[{"type":"A","id":"001","content":"abc"},
{"type":"A","id":"001","content":"xyz"},
{"type":"A","id":"002","content":"sdf"},
{"type":"B","id":"004","content":"df"},
{"type":"B","id":"002","content":"dsg"},
{"type":"B","id":"002","content":"sfg"},
{"type":"B","id":"004","content":"sfg"}]&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;
	Finally, I want to store these back in HDFS in the following directory structure:&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;/A/001/data.txt (data.txt-&amp;gt; [{"type":"A","id":"001","content":"abc"},{"type":"A","id":"001","content":"xyz"}])
/A/002/data.txt (data.txt-&amp;gt; [{"type":"A","id":"002","content":"sdf"}])
/B/002/data.txt (data.txt-&amp;gt; [{"type":"B","id":"002","content":"dsg"},{"type":"B","id":"002","content":"sfg"}])
/B/004/data.txt (data.txt-&amp;gt; [{"type":"B","id":"004","content":"df"},{"type":"B","id":"004","content":"sfg"}])&lt;/PRE&gt;&lt;P&gt;
	Any ideas?&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 13:07:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198250#M71394</guid>
      <dc:creator>maloochandra</dc:creator>
      <dc:date>2017-11-16T13:07:49Z</dc:date>
    </item>
    <item>
      <title>Re: [Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198251#M71395</link>
      <description>&lt;P&gt;It appears you want the destination path to be the value of type, followed by the value of id, followed by data.txt, and the content of that file to be a JSON array of the objects that share those values. If that is the case:&lt;/P&gt;&lt;P&gt;As of NiFi 1.3.0, there is a PartitionRecord processor which will do most of what you want. You can create a JSON reader (such as JsonTreeReader) using the following example schema:&lt;/P&gt;&lt;PRE&gt;{"type":"record","name":"test","namespace":"nifi",
  "fields": [
    {"name":"type","type":"string"},
    {"name":"id","type":"string"},
    {"name":"content","type":"string"}
  ]
}&lt;/PRE&gt;&lt;P&gt;You can also create a JsonRecordSetWriter that inherits the schema (as of NiFi 1.4.0) or uses the same one (prior to NiFi 1.4.0). Then in PartitionRecord you would create two user-defined properties, say record.type and record.id, each set to a RecordPath for the corresponding field (for example /type and /id), configured as follows:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="43617-partitionrecordexample.png" style="width: 518px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/17785iB442F4E24FEA8B3A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="43617-partitionrecordexample.png" alt="43617-partitionrecordexample.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Given your example data, you will get 4 flow files, one for each of the 4 groups you mention above. Each flow file also carries record.type and record.id attributes. You can route them to UpdateAttribute, where you set filename to data.txt and absolute.path to /${record.type}/${record.id}. Then you can send them to PutHDFS, where you set the Directory to ${absolute.path}.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:11:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198251#M71395</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2019-08-18T06:11:10Z</dc:date>
    </item>
    <item>
      <title>Re: [Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198252#M71396</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/641/mburgess.html"&gt;@Matt Burgess&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Do we need a Schema Registry (SR) to use schemas, or can we do this without one?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Sai&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 23:16:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198252#M71396</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2017-11-16T23:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: [Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198253#M71397</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/641/mburgess.html" nodeid="641"&gt;@Matt Burgess&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Never mind, I was able to do this using AvroSchemaRegistry. Thank you.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Sai&lt;/P&gt;</description>
      <pubDate>Fri, 17 Nov 2017 00:31:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198253#M71397</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2017-11-17T00:31:27Z</dc:date>
    </item>
    <item>
      <title>Re: [Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198254#M71398</link>
      <description>&lt;P&gt;You can do it without a schema registry if your readers and writers set the Schema Access Strategy to "Use 'Schema Text' Property" and you hardcode the schema into the Schema Text property. Since the reader and writer use the same schema, it is easier to maintain in a registry, but without one it is only a simple copy-paste.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Nov 2017 00:53:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198254#M71398</guid>
      <dc:creator>mburgess</dc:creator>
      <dc:date>2017-11-17T00:53:05Z</dc:date>
    </item>
    <item>
      <title>Re: [Apache Nifi] Split a flowfile based on json-attribute of each record</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198255#M71399</link>
      <description>&lt;P&gt;Precisely what I needed. Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 17 Nov 2017 10:55:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Apache-Nifi-Split-a-flowfile-based-on-json-attribute-of-each/m-p/198255#M71399</guid>
      <dc:creator>maloochandra</dc:creator>
      <dc:date>2017-11-17T10:55:32Z</dc:date>
    </item>
  </channel>
</rss>

