Rate of Publishing - kafka processor

I would like to know whether the Run Schedule is the rate at which a processor publishes or writes to another processor, such as PutFile.

I am publishing to a Kafka topic, from which Kafka Streams reads, and so on downstream. For performance testing, I would like to fix the rate at which log lines are written to the topic. Can anybody suggest how?
For example, 100 records (log lines) per second.

1 ACCEPTED SOLUTION

The Run Schedule controls when the NiFi framework executes a processor. The default, timer driven with a run schedule of 0 seconds, means the processor executes as fast as possible whenever data is available in the incoming queue; if there is no data, it does not execute.

The data rate depends on what the processor does during one execution. For example, say a queue holds 100 flow files and you set the processor to run every 5 minutes. Some processors grab a batch of flow files in one execution, so even though the processor executes only once, it may take 50 of those flow files. It also depends on whether your flow files contain multiple logical messages. If you have 1 record per flow file and the processor takes only 1 flow file at a time (most take only one at a time), then the Run Schedule does control the rate.
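
To make that arithmetic concrete, here is a small Python sketch. It is not NiFi code, and the function and parameter names are made up for illustration. It assumes the processor always finds data queued and that each execution finishes well within the run schedule, so the result is an upper bound rather than a guaranteed rate.

```python
def effective_rate(flowfiles_per_run: int,
                   records_per_flowfile: int,
                   run_schedule_seconds: float) -> float:
    """Upper bound on records per second for a timer-driven processor.

    Hypothetical helper: assumes one execution per run schedule interval
    and that data is always waiting in the incoming queue.
    """
    if run_schedule_seconds <= 0:
        raise ValueError("a 0 sec schedule means 'as fast as possible'; no fixed rate")
    return flowfiles_per_run * records_per_flowfile / run_schedule_seconds


def run_schedule_for(target_records_per_second: float,
                     flowfiles_per_run: int = 1,
                     records_per_flowfile: int = 1) -> float:
    """Run schedule (in seconds) needed to hit a target record rate."""
    if target_records_per_second <= 0:
        raise ValueError("target rate must be positive")
    return flowfiles_per_run * records_per_flowfile / target_records_per_second


# The example from the answer: one run every 5 minutes that grabs 50 flow files
# of 1 record each moves at most 50 / 300 records per second.
print(effective_rate(50, 1, 300))

# The question's target: 100 records per second with 1 record per flow file and
# 1 flow file per execution works out to a run schedule of 0.01 sec (10 ms).
print(run_schedule_for(100))
```

If the processor batches flow files, or each flow file carries many log lines, the same run schedule yields a proportionally higher record rate, which is why the Run Schedule alone only fixes the rate in the one-record, one-flow-file case.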

You can also look at the ControlRate processor, which throttles the flow files passing through it to a configured rate.
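
As a rough sketch of how a ControlRate configuration for the 100 records per second target might be derived, the Python helper below builds a dictionary of property values. The helper itself is hypothetical. The property names and the "flowfile count" value are written as I recall them from the ControlRate documentation, so treat them as assumptions and check them against your NiFi version. It also assumes every flow file carries the same number of records.

```python
def control_rate_settings(target_records_per_second: int,
                          records_per_flowfile: int = 1) -> dict:
    """Suggest ControlRate property values for a target record rate.

    Hypothetical helper; property names are assumptions to verify
    against the ControlRate documentation for your NiFi version.
    """
    if target_records_per_second <= 0 or records_per_flowfile <= 0:
        raise ValueError("rates and record counts must be positive")
    # ControlRate counts flow files, not records, so convert the target.
    flowfiles_per_second = max(1, target_records_per_second // records_per_flowfile)
    return {
        "Rate Control Criteria": "flowfile count",
        "Maximum Rate": str(flowfiles_per_second),
        "Time Duration": "1 sec",
    }


# One log line per flow file: allow 100 flow files per second.
print(control_rate_settings(100))

# Ten log lines per flow file: allow 10 flow files per second.
print(control_rate_settings(100, records_per_flowfile=10))
```

Placing ControlRate directly before the Kafka publishing processor and leaving that processor on its default run schedule keeps the throttling in one place, rather than spreading it across the Run Schedule and batching behavior.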
