NiFi-Kafka: How are they connected?
Labels: Apache Kafka, Apache NiFi
Created ‎11-10-2016 03:58 AM
In NiFi, the ConsumeKafka processor has a scheduling interval (the Run Schedule).
In Kafka, let's consider the following properties:
session.timeout.ms = 300000 (5 mins)
heartbeat.interval.ms = 60000 (1 min)
If the processor scheduler interval is set to say 600 sec (10 min), would the processor still continue to run to maintain the heartbeat? Would the session timeout be specific to each run, per scheduled interval?
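For concreteness, here is how those two settings would look when building a plain Kafka consumer configuration in Java (the broker address and group id are placeholders, not values from this post):

```java
import java.util.Properties;

public class ConsumerConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "example-group");           // placeholder group id
        props.put("session.timeout.ms", "300000");        // 5 minutes
        props.put("heartbeat.interval.ms", "60000");      // 1 minute

        // With this ratio, up to five heartbeat intervals can elapse before
        // the broker considers the consumer dead and triggers a rebalance.
        long ratio = Long.parseLong(props.getProperty("session.timeout.ms"))
                   / Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        System.out.println(ratio); // prints 5
    }
}
```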
Created ‎11-10-2016 01:51 PM
ConsumeKafka keeps a pool of consumers behind the scenes, equal to the number of concurrent tasks for that processor instance. In the simple case where ConsumeKafka has 1 concurrent task, the first time it executes it asks the pool for a consumer; since the pool is empty on that first run, it creates a new one, consumes the data from Kafka, and then returns it to the pool for next time.
From reading Kafka's documentation (https://kafka.apache.org/documentation), I would expect the session timeout and heartbeat to apply to the consumer while it is sitting in the pool. So with the configuration you described, I think the consumer in the pool would send heartbeats and stay active for 5 minutes, and when the processor executed 5 minutes after that, it would have to create a new consumer from scratch.
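The borrow-or-create pooling pattern described above can be sketched as follows. This is an illustrative stand-in, not NiFi's actual code, and `PooledConsumer` is a hypothetical placeholder for a real `KafkaConsumer`:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the pooling pattern described above.
public class ConsumerPoolSketch {

    // Stand-in for a Kafka consumer; real code would wrap a KafkaConsumer.
    static class PooledConsumer {
    }

    private final Queue<PooledConsumer> pool = new ConcurrentLinkedQueue<>();

    // Called at the start of each processor execution: reuse a pooled
    // consumer if one exists, otherwise create one lazily.
    public PooledConsumer borrow() {
        PooledConsumer c = pool.poll();
        return (c != null) ? c : new PooledConsumer();
    }

    // Called when the execution finishes, so the next run can reuse it.
    public void release(PooledConsumer c) {
        pool.offer(c);
    }

    public static void main(String[] args) {
        ConsumerPoolSketch sketch = new ConsumerPoolSketch();
        PooledConsumer first = sketch.borrow();  // pool empty -> new consumer
        sketch.release(first);
        PooledConsumer second = sketch.borrow(); // reuses the pooled consumer
        System.out.println(first == second);     // prints true
    }
}
```

If the consumer's session expires while it sits idle in the pool, releasing and re-borrowing would hand back a dead consumer, which is why a fresh one must be created after a long gap between runs.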
