Created 01-04-2017 09:27 AM
I see a 'Concurrent Tasks' option under Scheduling in NiFi version 2.0. Here are my questions regarding concurrency.
1. I see this only at the individual processor level. Can it be set at the 'Processor Group' level? In short, can I parameterize it?
2. How can I arrive at this number? What are the factors that decide concurrency, e.g. memory, average data load, or other applications on the same box, such as Spark?
Created 01-04-2017 01:41 PM
Created 11-17-2017 04:40 PM
Hi @Matt Clarke, I have a related question regarding concurrency. I have a dataflow with two connected processors (each with concurrent tasks = 1). When I set the number of threads for the whole instance to 1, the two processors still somehow manage to run concurrently, although I expected them to run sequentially. The first processor takes 2.5 seconds per input on average and the second processor takes 4.5 seconds. I gave it 100 inputs and expected it to finish in around 700 seconds (i.e., sequential execution), but it finishes in 480 seconds, which suggests that each processor is using a separate thread and that they do not wait on each other. Am I missing something here?
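For reference, the two estimates behind this question can be worked out with a short sketch. The function names are hypothetical, and the pipelined figure assumes an idealized two-stage pipeline with constant per-input times and one thread per processor, which is a simplification and not a description of how NiFi schedules work.

```python
def sequential_seconds(n_inputs, first_stage, second_stage):
    """Total time if every input passes through both processors one after another."""
    return n_inputs * (first_stage + second_stage)


def pipelined_seconds(n_inputs, first_stage, second_stage):
    """Total time if the two processors overlap (idealized two-stage pipeline).

    The first input pays for both stages; every later input is limited by the
    slower of the two stages.
    """
    if n_inputs == 0:
        return 0.0
    slowest = max(first_stage, second_stage)
    return first_stage + second_stage + (n_inputs - 1) * slowest


if __name__ == "__main__":
    print(sequential_seconds(100, 2.5, 4.5))  # fully sequential estimate
    print(pipelined_seconds(100, 2.5, 4.5))   # fully overlapped estimate
```

The observed 480 seconds is much closer to the overlapped estimate than to the sequential one, which is what prompts the question.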
Created 05-14-2018 06:38 PM
@Tarek Elgamal
Assuming you are referring to the "Max Timer Driven Thread Count" setting?
That setting controls the maximum number of threads that can execute at one time. It does not guarantee any order to the execution of threads. NiFi's controller in the background does not operate under this thread pool. Both processors will be scheduled to run based on their configured run schedule. Their concurrent tasks then get stacked in a request queue, waiting on one of the threads from that pool to service them. This way, every processor eventually gets a chance to run its code. Also keep in mind that some processors work on batches of FlowFiles while others process one FlowFile per task. It is also hard to say that each processed FlowFile will take the same amount of time to complete an operation; that really depends on the processor and what it is designed to do.
Thanks,
Matt
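The idea of a shared pool servicing a queue of tasks can be illustrated with a small analogy. This is a minimal sketch using Python's standard `ThreadPoolExecutor`, not NiFi code; the processor names, task counts, and sleep times are made up for illustration. It shows that the pool size caps how many tasks run at once, while tasks from both processors wait in the same queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_pool(max_threads, tasks_per_processor=5, work_seconds=0.01):
    """Queue tasks from two hypothetical processors on one shared pool.

    Returns (peak number of tasks running at once, completion order).
    """
    lock = threading.Lock()
    running = 0
    peak = 0
    completed = []

    def task(processor_name, index):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)  # stand-in for the processor's real work
        with lock:
            running -= 1
            completed.append((processor_name, index))

    # Leaving the "with" block waits for every queued task to finish.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for i in range(tasks_per_processor):
            pool.submit(task, "ProcessorA", i)
            pool.submit(task, "ProcessorB", i)

    return peak, completed


if __name__ == "__main__":
    peak, completed = run_with_pool(max_threads=1)
    print("peak concurrent tasks:", peak)
    print("tasks completed:", len(completed))
```

In this sketch a pool of one thread never runs two tasks at the same moment, yet tasks from both processors still take turns from the shared queue, so neither processor is starved.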