I noticed that several processors have the ability to configure "Run Duration", which can have values from 0 to 2 sec. When increasing the value, the throughput of these processors increases drastically. Does this value allow more flow files to go through a processor since it is running for a greater period of time? Wouldn't processors be continually running and accepting as many flow files as possible?
I'm looking for more detail about the effect of this setting.
This controls how long the Processor should be scheduled to run each time that it is triggered. The left-hand side of the slider is marked 'Lower latency' while the right-hand side is marked 'Higher throughput'. When a Processor finishes running, it must update the repository in order to transfer the FlowFiles to the next Connection. Updating the repository is expensive, so the more work that can be done at once before updating the repository, the more work the Processor can handle (higher throughput). However, this means that the next Processor cannot start processing those FlowFiles until the previous Processor updates the repository. As a result, the latency will be longer (the time required to process a FlowFile from beginning to end will be longer). The slider therefore provides a spectrum along which the DFM can choose to favor lower latency or higher throughput.
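To make the trade-off concrete, here is a minimal toy model in Python. It is not NiFi code and does not reflect how NiFi is actually implemented; the function name `simulate` and the cost figures (1 ms of work per FlowFile, 10 ms per repository update) are assumptions chosen purely for illustration. It only shows why doing more work per repository update raises throughput while delaying when each FlowFile becomes visible to the next Processor.

```python
def simulate(run_duration_ms, n_flowfiles=1000, work_ms=1.0, commit_ms=10.0):
    """Toy model of batching work before an expensive repository update.

    All costs are made-up illustrative numbers, not NiFi measurements.
    Returns (throughput in FlowFiles/sec, average latency in ms), where
    latency is the time from when a FlowFile starts being processed until
    the repository update that makes it available downstream.
    """
    clock = 0.0        # simulated time in milliseconds
    processed = 0
    total_latency = 0.0

    while processed < n_flowfiles:
        batch_begin = clock
        start_times = []
        # Keep working until the run duration has elapsed.
        # At least one FlowFile is always handled per run.
        while processed + len(start_times) < n_flowfiles:
            start_times.append(clock)
            clock += work_ms
            if clock - batch_begin >= run_duration_ms:
                break
        # One expensive repository update for the whole batch.
        clock += commit_ms
        total_latency += sum(clock - t for t in start_times)
        processed += len(start_times)

    throughput = n_flowfiles / clock * 1000.0
    avg_latency = total_latency / n_flowfiles
    return throughput, avg_latency


if __name__ == "__main__":
    for duration in (0, 25, 100, 500):
        tput, lat = simulate(duration)
        print(f"run duration {duration:>4} ms: "
              f"{tput:8.1f} FlowFiles/sec, avg latency {lat:6.1f} ms")
```

In this model a run duration of 0 pays the update cost once per FlowFile, so most of the time goes to repository updates. Longer run durations spread that cost over many FlowFiles, but each FlowFile then waits for the end of its batch before the next Processor can see it.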