<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFi List File Processor - Interrupt run schedule and restart the run schedule when new incoming file in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/NiFi-List-File-Processor-Interrupt-run-schedule-and-restart/m-p/333531#M231460</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/85077"&gt;@techNerd&lt;/a&gt;&lt;/P&gt;&lt;P&gt;I think your scenario needs a bit more detail for me to understand what you are doing, what the flow currently does, and what you want it to do.&lt;/P&gt;&lt;P&gt;The ListFile processor only lists information about the file(s) found in the target directory. It then generates one or more FlowFiles from that listing. A corresponding FetchFile processor actually retrieves the content for each of the listed files.&lt;BR /&gt;&lt;BR /&gt;From the sound of your scenario, have you somehow instituted a 20 sec delay between the ListFile and FetchFile processors?&lt;BR /&gt;&lt;BR /&gt;Or have you configured the run schedule on the ListFile processor to "20 secs"?&lt;BR /&gt;&lt;BR /&gt;Setting the run schedule only tells the processor how often it should request a thread from the NiFi controller that can be used to execute the processor code. Once the processor gets its thread, it will execute. The ListFile processor will list all files present in the target source directory based on the configured file and path filters. For each file listed it will produce a FlowFile. Run schedule does not mean it executes for a full 20 seconds, continuously checking the input directory to see if new files arrive. The run schedule is also not impacted by how long it takes a listing to complete. It will request a thread every 20 seconds (00:00:20, 00:00:40, 00:01:00, etc.). The configured "concurrent tasks" setting controls whether the processor can execute multiple listings in parallel. Let's say the thread that started executing at 00:01:00 is still executing 20 seconds later. 
Since that thread is still using the single default concurrent task, ListFile would not be allowed to request another thread from the controller at that time.&lt;BR /&gt;&lt;BR /&gt;Since the run schedule is independent of the thread execution duration, there is no way to dynamically alter the schedule. There is also no way for a new file to get listed at the same time as a previous file within the same thread execution (unless both were already present at the time of listing). ListFile uses the configured "Listing Strategy" to control how it handles listing of files. A "tracking" strategy prevents the ListFile processor from listing the same file twice by recording some information in a state provider or a cache. If "No Tracking" is configured, ListFile will list all found files every time it executes. ListFile does not remove the source file from the directory. Removal of the source file is a function optionally handled by the corresponding FetchFile processor.&lt;BR /&gt;&lt;BR /&gt;If this is not clear, share more details about your use case and the specifics of your flow design so I can provide more direct feedback.&lt;BR /&gt;&lt;BR /&gt;Here is the documentation on processor scheduling (it works the same no matter which processor is being used):&lt;BR /&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If you found that this response assisted with your query, please take a moment to log in and click on "&lt;STRONG&gt;Accept as Solution&lt;/STRONG&gt;" below this post.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
    <pubDate>Mon, 10 Jan 2022 21:31:03 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2022-01-10T21:31:03Z</dc:date>
    <item>
      <title>NiFi List File Processor - Interrupt run schedule and restart the run schedule when new incoming file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-List-File-Processor-Interrupt-run-schedule-and-restart/m-p/333326#M231415</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Scenario:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;I have a List File Processor watching for an incoming file ("file 1"). I scheduled it to start picking up "file 1" 20s after "file 1" is downloaded.&lt;/LI&gt;&lt;LI&gt;Let's assume that the List File Processor noticed the incoming "file 1" and started the 20 sec delay before picking up "file 1".&lt;/LI&gt;&lt;LI&gt;In the middle of the 20 sec delay, a new incoming file ("file 2") is noticed by the List File Processor.&lt;/LI&gt;&lt;LI&gt;The List File Processor is interrupted and resets the 20 sec schedule time when "file 2" appears within the previous 20 sec delay.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;STRONG&gt;Could the List File Processor be interrupted in the middle of the 20 sec delay and restart the initial delay? I would appreciate help with an explanation and an example.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Jan 2022 06:52:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-List-File-Processor-Interrupt-run-schedule-and-restart/m-p/333326#M231415</guid>
      <dc:creator>techNerd</dc:creator>
      <dc:date>2022-01-07T06:52:33Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi List File Processor - Interrupt run schedule and restart the run schedule when new incoming file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-List-File-Processor-Interrupt-run-schedule-and-restart/m-p/333531#M231460</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/85077"&gt;@techNerd&lt;/a&gt;&lt;/P&gt;&lt;P&gt;I think your scenario needs a bit more detail for me to understand what you are doing, what the flow currently does, and what you want it to do.&lt;/P&gt;&lt;P&gt;The ListFile processor only lists information about the file(s) found in the target directory. It then generates one or more FlowFiles from that listing. A corresponding FetchFile processor actually retrieves the content for each of the listed files.&lt;BR /&gt;&lt;BR /&gt;From the sound of your scenario, have you somehow instituted a 20 sec delay between the ListFile and FetchFile processors?&lt;BR /&gt;&lt;BR /&gt;Or have you configured the run schedule on the ListFile processor to "20 secs"?&lt;BR /&gt;&lt;BR /&gt;Setting the run schedule only tells the processor how often it should request a thread from the NiFi controller that can be used to execute the processor code. Once the processor gets its thread, it will execute. The ListFile processor will list all files present in the target source directory based on the configured file and path filters. For each file listed it will produce a FlowFile. Run schedule does not mean it executes for a full 20 seconds, continuously checking the input directory to see if new files arrive. The run schedule is also not impacted by how long it takes a listing to complete. It will request a thread every 20 seconds (00:00:20, 00:00:40, 00:01:00, etc.). The configured "concurrent tasks" setting controls whether the processor can execute multiple listings in parallel. Let's say the thread that started executing at 00:01:00 is still executing 20 seconds later. 
Since that thread is still using the single default concurrent task, ListFile would not be allowed to request another thread from the controller at that time.&lt;BR /&gt;&lt;BR /&gt;Since the run schedule is independent of the thread execution duration, there is no way to dynamically alter the schedule. There is also no way for a new file to get listed at the same time as a previous file within the same thread execution (unless both were already present at the time of listing). ListFile uses the configured "Listing Strategy" to control how it handles listing of files. A "tracking" strategy prevents the ListFile processor from listing the same file twice by recording some information in a state provider or a cache. If "No Tracking" is configured, ListFile will list all found files every time it executes. ListFile does not remove the source file from the directory. Removal of the source file is a function optionally handled by the corresponding FetchFile processor.&lt;BR /&gt;&lt;BR /&gt;If this is not clear, share more details about your use case and the specifics of your flow design so I can provide more direct feedback.&lt;BR /&gt;&lt;BR /&gt;Here is the documentation on processor scheduling (it works the same no matter which processor is being used):&lt;BR /&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If you found that this response assisted with your query, please take a moment to log in and click on "&lt;STRONG&gt;Accept as Solution&lt;/STRONG&gt;" below this post.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jan 2022 21:31:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-List-File-Processor-Interrupt-run-schedule-and-restart/m-p/333531#M231460</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2022-01-10T21:31:03Z</dc:date>
    </item>
  </channel>
</rss>

