NiFi-Kafka: How are they connected?

Expert Contributor

NiFi's ConsumeKafka processor has a configurable schedule interval.

In Kafka, let's consider the following properties:

session.timeout.ms = 300000 (5 mins)

heartbeat.interval.ms = 60000 (1 min)

If the processor's schedule interval is set to, say, 600 sec (10 min), would the processor keep running between executions to maintain the heartbeat? Or would the session timeout apply separately to each run, per scheduled interval?

1 ACCEPTED SOLUTION

Master Guru

ConsumeKafka keeps a pool of consumers behind the scenes, equal in size to the number of concurrent tasks for that instance. In the simple case of ConsumeKafka with 1 concurrent task, the first time it executes it asks the pool for a consumer. There is none the first time through, so it creates a new one, consumes the data from Kafka, and then puts it back in the pool for next time.
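A minimal sketch of that pooling behavior, in plain Python rather than NiFi's actual Java code. `FakeConsumer`, `ConsumerPool`, and `run_once` are hypothetical names made up for illustration, and nothing here talks to Kafka. It only shows the obtain, consume, return cycle for 1 concurrent task.

```python
from collections import deque


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer; not a real client."""

    def __init__(self, consumer_id):
        self.consumer_id = consumer_id

    def poll(self):
        # A real consumer would fetch records from Kafka here.
        return [f"record-from-consumer-{self.consumer_id}"]


class ConsumerPool:
    """Holds at most `max_size` idle consumers (one per concurrent task)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._idle = deque()
        self.created = 0  # how many consumers were built from scratch

    def obtain(self):
        # Reuse an idle consumer if one exists, otherwise create one.
        if self._idle:
            return self._idle.popleft()
        self.created += 1
        return FakeConsumer(self.created)

    def give_back(self, consumer):
        # Keep the consumer for the next run unless the pool is full.
        if len(self._idle) < self.max_size:
            self._idle.append(consumer)


def run_once(pool):
    """One scheduled execution: obtain, consume, return to the pool."""
    consumer = pool.obtain()
    try:
        return consumer.poll()
    finally:
        pool.give_back(consumer)


pool = ConsumerPool(max_size=1)  # 1 concurrent task
first = run_once(pool)   # pool is empty, so a consumer is created
second = run_once(pool)  # the pooled consumer is reused
```

After both runs only one consumer has been created, since the second run picks up the one left in the pool.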

From reading Kafka's documentation (https://kafka.apache.org/documentation), I would expect the session timeout and heartbeat to apply to the consumer while it is sitting in the pool. So with the configuration you described, I think the consumer in the pool would send heartbeats and stay active for 5 mins. When the processor executed again 5 mins after that, at the 10-minute mark, it would have to create a new consumer from scratch.
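To make the arithmetic concrete, here is a small sketch using the numbers from the question. It encodes the assumption stated above, that a pooled consumer stays in the group only for `session.timeout.ms` after its last run; I have not verified this, and the real behavior may differ by Kafka client version. The function names are hypothetical.

```python
SESSION_TIMEOUT_MS = 300_000    # session.timeout.ms from the question (5 min)
SCHEDULE_INTERVAL_MS = 600_000  # processor schedule interval (600 sec)


def needs_new_consumer(idle_ms, session_timeout_ms):
    """Assumed model: the session lapses once idle time exceeds the timeout."""
    return idle_ms > session_timeout_ms


def wait_after_expiry_ms(schedule_interval_ms, session_timeout_ms):
    """Time between the session lapsing and the next scheduled run."""
    return max(0, schedule_interval_ms - session_timeout_ms)


expired = needs_new_consumer(SCHEDULE_INTERVAL_MS, SESSION_TIMEOUT_MS)
gap_ms = wait_after_expiry_ms(SCHEDULE_INTERVAL_MS, SESSION_TIMEOUT_MS)
short_schedule_ok = not needs_new_consumer(60_000, SESSION_TIMEOUT_MS)
```

Under this model a 10-minute schedule outlasts the 5-minute session by 300000 ms, so each run would start with a fresh consumer, while a 1-minute schedule would keep reusing the pooled one.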
