<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to split the output flow file from ExecuteSQL (query perform joins)? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302635#M221198</link>
    <description>&lt;P&gt;Maybe I have found a solution.&lt;/P&gt;&lt;P&gt;I'm going to use ExecuteSQL to run a "select insert" query. The query will perform the joins and load the data into a table. Then QueryDatabaseTable will read from the new table. That way I'll be able to use the "Max Rows Per Flow File" property.&lt;/P&gt;</description>
    <pubDate>Thu, 10 Sep 2020 19:54:01 GMT</pubDate>
    <dc:creator>CaioFalco</dc:creator>
    <dc:date>2020-09-10T19:54:01Z</dc:date>
    <item>
      <title>How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302321#M221074</link>
      <description>&lt;P&gt;Hello.&lt;BR /&gt;I'm using an ExecuteSQL processor to extract data from an Oracle DB. The query has multiple joins and returns a large number of fields.&lt;/P&gt;&lt;P&gt;The problem is that the ExecuteSQL processor returns a single, huge flow file (Avro format).&lt;/P&gt;&lt;P&gt;I want to split the flow file (based on a number of rows, for example) and then merge the pieces at the proper moment.&lt;/P&gt;&lt;P&gt;I have read about QueryDatabaseTable and GenerateTableFetch, which are processors that can split the output into multiple flow files, but it looks like these processors aren't able to perform joins.&lt;/P&gt;&lt;P&gt;Does anyone know a workaround?&lt;/P&gt;</description>
      <pubDate>Thu, 03 Sep 2020 14:31:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302321#M221074</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-03T14:31:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302332#M221076</link>
      <description>&lt;P&gt;&lt;SPAN&gt;The ExecuteSQL processor has a property named "Max Rows Per Flow File".&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;There you can set how many rows each flow file should contain, and later you can merge them as you wanted, because the flow files get a fragment attribute.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 03 Sep 2020 19:10:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302332#M221076</guid>
      <dc:creator>Faerballert</dc:creator>
      <dc:date>2020-09-03T19:10:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302337#M221077</link>
      <description>&lt;P&gt;There is no such option in ExecuteSQL.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="CaioFalco_0-1599164411538.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/28791i0ACF6321F2F1687D/image-size/medium?v=v2&amp;amp;px=400" role="button" title="CaioFalco_0-1599164411538.png" alt="CaioFalco_0-1599164411538.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Perhaps you're using a newer version of NiFi. Mine is&amp;nbsp;&lt;STRONG&gt;1.5.0.3.1.2.0-7&lt;/STRONG&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Sep 2020 20:21:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302337#M221077</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-03T20:21:18Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302359#M221086</link>
      <description>&lt;P&gt;You should use ListDatabaseTables and GenerateTableFetch to perform an incremental load. If you are joining tables, you can add a ReplaceText after GenerateTableFetch to add the join query and then feed the flow file to ExecuteSQL. GenerateTableFetch lets you split the data into smaller chunks.&lt;/P&gt;&lt;P&gt;OR&lt;/P&gt;&lt;P&gt;You can use SplitRecord or SplitContent to split the single Avro file into multiple smaller files and then use MergeContent to merge them back if required.&lt;/P&gt;&lt;P&gt;Hope this helps. If this comment helps you find a solution or move forward, please accept it as a solution for other community members.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Sep 2020 07:12:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302359#M221086</guid>
      <dc:creator>SagarKanani</dc:creator>
      <dc:date>2020-09-04T07:12:52Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302377#M221097</link>
      <description>&lt;P&gt;I don't know if these solutions work for me.&lt;/P&gt;&lt;P&gt;What I really want to do is make ExecuteSQL work like (for example) SelectHiveQL: ExecuteSQL gets only one incoming flow file and sends forward multiple flow files that can be merged.&lt;/P&gt;&lt;P&gt;My real problem is that the ExecuteSQL query sometimes needs to produce a flow file that is too large for the edge machine to process, which ends in an error. So I need to split the flow file to decrease the pressure on the edge machine.&lt;BR /&gt;I've been through the same situation with Hive queries, but I solved it by using the "Max Rows Per Flow File" property and then merging the flow files.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Sep 2020 13:56:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302377#M221097</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-04T13:56:58Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302466#M221126</link>
      <description>&lt;P&gt;Why not use the second option I mentioned above? Use SplitContent or SplitRecord and then merge the pieces later whenever you want.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Sep 2020 11:33:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302466#M221126</guid>
      <dc:creator>SagarKanani</dc:creator>
      <dc:date>2020-09-07T11:33:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302524#M221146</link>
      <description>&lt;P&gt;My point is that the error occurs while ExecuteSQL is running. The cause is that ExecuteSQL needs to create a huge flow file, and the edge machine does not have enough processing power to create it. Your solution looks good, but it would split the flow file after ExecuteSQL has produced it, so it takes effect only after the error has already occurred.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Sep 2020 14:17:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302524#M221146</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-08T14:17:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302525#M221147</link>
      <description>&lt;P&gt;Then I'm afraid it's difficult to do so. I don't understand how you are feeding the queries to ExecuteSQL. Maybe it would be better to feed ExecuteSQL with manageable queries. If you use GenerateTableFetch, it allows you to break a big query into smaller queries, as you want, and feed them to ExecuteSQL. Hope this helps. Please do post back on how you managed to move forward.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Sep 2020 14:34:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302525#M221147</guid>
      <dc:creator>SagarKanani</dc:creator>
      <dc:date>2020-09-08T14:34:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302528#M221148</link>
      <description>&lt;P&gt;The process is triggered by a GetSFTP that retrieves a date file. The query then uses the date for filtering.&lt;/P&gt;&lt;P&gt;Thanks for your contribution. As soon as I make progress, I'll update you guys.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Sep 2020 14:45:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302528#M221148</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-08T14:45:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to split the output flow file from ExecuteSQL (query perform joins)?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302635#M221198</link>
      <description>&lt;P&gt;Maybe I have found a solution.&lt;/P&gt;&lt;P&gt;I'm going to use ExecuteSQL to run a "select insert" query. The query will perform the joins and load the data into a table. Then QueryDatabaseTable will read from the new table. That way I'll be able to use the "Max Rows Per Flow File" property.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Sep 2020 19:54:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-split-the-output-flow-file-from-ExecuteSQL-query/m-p/302635#M221198</guid>
      <dc:creator>CaioFalco</dc:creator>
      <dc:date>2020-09-10T19:54:01Z</dc:date>
    </item>
  </channel>
</rss>

