<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Sales Data Split by Sales Rep? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Sales-Data-Split-by-Sales-Rep/m-p/334411#M231751</link>
    <description>&lt;P&gt;Our sales team extracts the month's sales opportunities, actuals, etc. A single CSV file can contain 100+ rows. We now need to create new FlowFiles (arrays?), one per salesperson. We have 20+ salespeople, each working on many opportunities.&lt;/P&gt;&lt;P&gt;salesOpId,salesRepId, salesOpStage, .....&lt;/P&gt;&lt;P&gt;142435,135235,inception ....&lt;/P&gt;&lt;P&gt;142436,572856,inception ....&lt;/P&gt;&lt;P&gt;142437,135235,pipelining ....&lt;/P&gt;&lt;P&gt;142435,572856,inception....&lt;/P&gt;&lt;P&gt;149468,135235,contract....&lt;/P&gt;&lt;P&gt;149464,135653,contract....&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see:&lt;/P&gt;&lt;P&gt;salesRepId 135235 is working on 3 opportunities (142435, 142437, 149468)&lt;/P&gt;&lt;P&gt;salesRepId 572856 is working on 2 opportunities (142436, 142435)&lt;/P&gt;&lt;P&gt;salesRepId 135653 is working on a single opportunity (149464)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For this example I need 3 FlowFiles created (which will be sent to the reps' respective managers). I can read the CSV and convert it to JSON, but I'm completely stumped on the next step. Do I split the JSON?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Ron M.&lt;/P&gt;</description>
    <pubDate>Sat, 22 Jan 2022 04:44:00 GMT</pubDate>
    <dc:creator>RonMilne</dc:creator>
    <dc:date>2022-01-22T04:44:00Z</dc:date>
    <item>
      <title>Sales Data Split by Sales Rep?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Sales-Data-Split-by-Sales-Rep/m-p/334411#M231751</link>
      <description>&lt;P&gt;Our sales team extracts the month's sales opportunities, actuals, etc. A single CSV file can contain 100+ rows. We now need to create new FlowFiles (arrays?), one per salesperson. We have 20+ salespeople, each working on many opportunities.&lt;/P&gt;&lt;P&gt;salesOpId,salesRepId, salesOpStage, .....&lt;/P&gt;&lt;P&gt;142435,135235,inception ....&lt;/P&gt;&lt;P&gt;142436,572856,inception ....&lt;/P&gt;&lt;P&gt;142437,135235,pipelining ....&lt;/P&gt;&lt;P&gt;142435,572856,inception....&lt;/P&gt;&lt;P&gt;149468,135235,contract....&lt;/P&gt;&lt;P&gt;149464,135653,contract....&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see:&lt;/P&gt;&lt;P&gt;salesRepId 135235 is working on 3 opportunities (142435, 142437, 149468)&lt;/P&gt;&lt;P&gt;salesRepId 572856 is working on 2 opportunities (142436, 142435)&lt;/P&gt;&lt;P&gt;salesRepId 135653 is working on a single opportunity (149464)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For this example I need 3 FlowFiles created (which will be sent to the reps' respective managers). I can read the CSV and convert it to JSON, but I'm completely stumped on the next step. Do I split the JSON?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Ron M.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jan 2022 04:44:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Sales-Data-Split-by-Sales-Rep/m-p/334411#M231751</guid>
      <dc:creator>RonMilne</dc:creator>
      <dc:date>2022-01-22T04:44:00Z</dc:date>
    </item>
    <item>
      <title>Re: Sales Data Split by Sales Rep?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Sales-Data-Split-by-Sales-Rep/m-p/334538#M231803</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/95242"&gt;@RonMilne&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I recommend taking your initial CSV record file and partitioning it by "&lt;SPAN&gt;salesRepId" into multiple new JSON records. This can be accomplished using the &lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.15.2/org.apache.nifi.processors.standard.PartitionRecord/index.html" target="_self"&gt;PartitionRecord&lt;/A&gt; processor with a &lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.15.2/org.apache.nifi.csv.CSVReader/index.html" target="_self"&gt;CSVReader&lt;/A&gt; and a &lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.15.2/org.apache.nifi.json.JsonRecordSetWriter/index.html" target="_self"&gt;JsonRecordSetWriter&lt;/A&gt;.&lt;BR /&gt;&lt;BR /&gt;Your &lt;STRONG&gt;&lt;EM&gt;PartitionRecord&lt;/EM&gt;&lt;/STRONG&gt; processor configuration would look like this:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MattWho_0-1643145705413.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/33247iAD1223DB9C21A3C7/image-size/medium?v=v2&amp;amp;px=400" role="button" title="MattWho_0-1643145705413.png" alt="MattWho_0-1643145705413.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your &lt;STRONG&gt;&lt;EM&gt;CSVReader&lt;/EM&gt;&lt;/STRONG&gt; would be configured something like this (you'll need to modify it for your specific record schema):&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MattWho_1-1643145807713.png" style="width: 400px;"&gt;&lt;img 
src="https://community.cloudera.com/t5/image/serverpage/image-id/33248i33BD7D65CCFA9419/image-size/medium?v=v2&amp;amp;px=400" role="button" title="MattWho_1-1643145807713.png" alt="MattWho_1-1643145807713.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Note: The pop-out shows the "&lt;STRONG&gt;Schema Text&lt;/STRONG&gt;" property. Don't forget to set the "&lt;STRONG&gt;Treat First Line as Header&lt;/STRONG&gt;" property to "&lt;STRONG&gt;true&lt;/STRONG&gt;".&lt;BR /&gt;&lt;BR /&gt;The JsonRecordSetWriter would need to be configured to produce the desired JSON record output format. However, leaving it at its default configuration will still output a separate FlowFile for each unique "&lt;STRONG&gt;salesRepId&lt;/STRONG&gt;".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you found this response assisted with your query, please take a moment to log in and click on "&lt;STRONG&gt;Accept as Solution&lt;/STRONG&gt;" below this post.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jan 2022 21:28:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Sales-Data-Split-by-Sales-Rep/m-p/334538#M231803</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2022-01-25T21:28:23Z</dc:date>
    </item>
  </channel>
</rss>

