<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Nifi tuning for a high number of tasks in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401542#M251212</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;We are using a NiFi cluster with 5 nodes. Each node is a machine with 48 cores and 280-300 Gi of available memory. The issue we currently have is that NiFi has trouble keeping up when a high number of tasks is required (1,000,000+ tasks). The files are eventually transferred, but errors pop up in the meantime that seem to be irrelevant and more a symptom of the flow trying to keep up. It's a simple flow: GetFile &amp;gt; UpdateAttribute &amp;gt; PutSFTP &amp;gt; LogMessage. I have increased concurrent tasks and batch sizes, but this has had little to no effect.&lt;/P&gt;</description>
    <pubDate>Wed, 05 Feb 2025 20:27:45 GMT</pubDate>
    <dc:creator>jfs912</dc:creator>
    <dc:date>2025-02-05T20:27:45Z</dc:date>
    <item>
      <title>Nifi tuning for a high number of tasks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401542#M251212</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;We are using a NiFi cluster with 5 nodes. Each node is a machine with 48 cores and 280-300 Gi of available memory. The issue we currently have is that NiFi has trouble keeping up when a high number of tasks is required (1,000,000+ tasks). The files are eventually transferred, but errors pop up in the meantime that seem to be irrelevant and more a symptom of the flow trying to keep up. It's a simple flow: GetFile &amp;gt; UpdateAttribute &amp;gt; PutSFTP &amp;gt; LogMessage. I have increased concurrent tasks and batch sizes, but this has had little to no effect.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2025 20:27:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401542#M251212</guid>
      <dc:creator>jfs912</dc:creator>
      <dc:date>2025-02-05T20:27:45Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi tuning for a high number of tasks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401543#M251213</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/123805"&gt;@jfs912&lt;/a&gt;&amp;nbsp;Welcome to the Cloudera Community!&lt;BR /&gt;&lt;BR /&gt;To help you get the best possible solution, I have tagged our NiFi experts&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35454"&gt;@MattWho&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp; who may be able to assist you further.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on your post, and we hope you find a satisfactory solution to your query.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2025 01:38:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401543#M251213</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2025-02-06T01:38:23Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi tuning for a high number of tasks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401587#M251237</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/123805"&gt;@jfs912&lt;/a&gt;&lt;BR /&gt;&lt;BR /&gt;You should &lt;STRONG&gt;not&lt;/STRONG&gt; configure NiFi with a larger heap than necessary. Doing so just leads to very long stop-the-world garbage collection events. The simple flow you have described would use very little heap memory.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;So you have a 5-node cluster, and each node has files in some local directory that it pulls from?&lt;/LI&gt;&lt;LI&gt;Is that local directory a single mount shared by all nodes, or does each node have its own set of files in the local directory from which GetFile is pulling?&lt;/LI&gt;&lt;LI&gt;Are you seeing backpressure being applied on any of the connections between your processors? While backpressure is applied on a connection, NiFi will not schedule the upstream processor until that backpressure is removed.&lt;/LI&gt;&lt;LI&gt;If you can tolerate some latency in your dataflow, you can get better throughput with some processors by increasing the Run Duration as well.&lt;/LI&gt;&lt;LI&gt;Good dataflow design practices can also improve performance and distribute load better across all the nodes in your cluster. You want to avoid, as much as possible, one node doing the bulk of the workload.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;BR /&gt;Adjusting concurrent tasks has multiple elements to it.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;What is the current CPU load average on each of your 5 servers? You first need to know whether there is capacity to run more parallel threads.&lt;/LI&gt;&lt;LI&gt;How large is the configured Timer Driven thread pool in NiFi? All concurrent tasks used by processor components come from this configured thread pool. If this pool is small, adding more concurrent tasks to processors will improve nothing. The ability to increase the size of this thread pool depends on the node's CPU load average. The thread pool size is also applied per node, so when it is set to 10, that is 10 threads on each node in your 5-node cluster.&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MattWho_0-1738863533317.png" style="width: 638px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/43779iB4733516CE239A33/image-dimensions/638x466?v=v2" width="638" height="466" role="button" title="MattWho_0-1738863533317.png" alt="MattWho_0-1738863533317.png" /&gt;&lt;/span&gt;&lt;/LI&gt;&lt;LI&gt;If CPU load average is not high and you increase the size of the Timer Driven thread pool, you'll want to make small incremental changes to the concurrent tasks on each processor and monitor the impact on CPU load average.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;SPAN&gt;Thank you,&lt;BR /&gt;Matt&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2025 17:49:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401587#M251237</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2025-02-06T17:49:58Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi tuning for a high number of tasks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401861#M251331</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/123805"&gt;@jfs912&lt;/a&gt;&amp;nbsp;Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2025 15:19:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-tuning-for-a-high-number-of-tasks/m-p/401861#M251331</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2025-02-11T15:19:22Z</dc:date>
    </item>
  </channel>
</rss>

