I have a Storm topology that reads from Topic A, does an operation, and writes 1-1 to Topic B. Is there a way to limit the feed rate into Topic A from Nifi based on the read rate from Topic B (i.e. enforce a maximum lag between writing to A and reading from B)?
So the left part of your picture is publishing to Topic A, and the right part is consuming from Topic B?
Can you give any more background about why you want to achieve this? One of the benefits to using Kafka is that the producers and consumers are completely disconnected and can publish and consume however they like, so it seems odd for a producer to change its behavior based on some other consumer, but I don't know enough about the use-case to understand.
Hi Bryan, that's correct: the left is publishing to A and the right is consuming from B. Both topics have log.retention.bytes set to a relatively small amount. During normal operation, the data from the left-hand side streams in continuously, and the A --> B processor runs and keeps the lag low. However, we have a large dataset of historical data that we need to process, and we don't want to flood Topic A before the processor can convert it to Topic B.
You could slow down the left side using a ControlRate processor before PublishKafka, although it is not directly tied to how fast you are consuming from B. I'm not sure there are many other options, since they are completely disconnected parts of the flow.
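If you did want to tie the rate to the lag yourself, the calculation could look something like the rough sketch below. This is only an illustration, not something NiFi or Kafka provides: the function name `allowed_rate` and all of its parameters are hypothetical, and it assumes you can obtain the two counts (messages written to A and messages read from B) from somewhere outside the flow. Since the topology writes 1-1, the difference between those counts is the lag you want to cap. How the resulting number gets fed back into the flow is left open.

```python
def allowed_rate(produced_to_a: int, consumed_from_b: int,
                 max_lag: int, max_rate: float) -> float:
    """Return a publish rate (messages/sec) for Topic A.

    Hypothetical helper: the rate scales down linearly as the lag
    approaches max_lag and reaches 0 once the lag meets or exceeds it.
    """
    if max_lag <= 0:
        raise ValueError("max_lag must be positive")

    # With a 1-1 topology, lag is messages written to A that have
    # not yet been read from B. Clamp at 0 in case counts are stale.
    lag = max(produced_to_a - consumed_from_b, 0)

    # Remaining room before the maximum lag is hit.
    headroom = max(max_lag - lag, 0)

    return max_rate * headroom / max_lag


if __name__ == "__main__":
    # Illustrative numbers only: 250 messages of lag against a cap of 500
    # leaves half the headroom, so half of the maximum rate is allowed.
    print(allowed_rate(produced_to_a=1000, consumed_from_b=750,
                       max_lag=500, max_rate=200.0))
```

A linear ramp is just one choice; a simple on/off threshold (publish at full rate below the cap, stop above it) would also enforce the maximum lag, at the cost of burstier traffic into A.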