Created 06-28-2022 05:32 PM
Hello Team,
We are doing CDC by pushing data to Kafka, and a second pipeline reads that data from Kafka and writes it to Kudu. Whenever we restart the second pipeline (Kafka to Kudu), I notice thousands of records coming in.
I would like to know how Kafka keeps track of checkpoints. Is there a setting to change this?
Thanks,
Roshan
Created on 06-29-2022 01:41 AM - edited 06-29-2022 01:42 AM
You must configure your Kafka consumer to use a consumer group and enable offset commits. The client will then periodically save the last read offset for that group in Kafka itself (in the internal __consumer_offsets topic), so that it can pick up where it left off after a restart.
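To make the idea concrete, here is a toy model in plain Python. It is only a sketch and not Kafka code: the list `log` stands in for a topic partition, the dict `committed` stands in for Kafka's offset store, and `run_pipeline` and the group name `kudu-writer` are made-up names for illustration. It assumes a group with no committed offset starts from the beginning of the log.

```python
# Toy model only: none of these names are Kafka APIs.
log = [f"record-{i}" for i in range(10)]   # stands in for one topic partition
committed = {}                             # group id -> next offset to read


def run_pipeline(group_id, max_records):
    """Read up to max_records, starting at the group's committed offset."""
    # Assumption: with no committed offset, start from the beginning.
    start = committed.get(group_id, 0)
    batch = log[start:start + max_records]
    if group_id is not None:
        # "Committing" here just means saving the position of the next record.
        committed[group_id] = start + len(batch)
    return batch


first = run_pipeline("kudu-writer", 4)    # reads record-0 .. record-3
second = run_pipeline("kudu-writer", 4)   # "restart": resumes at record-4
no_group = run_pipeline(None, 4)          # nothing was committed, so it starts over
```

Without a committed position the reader has nothing to resume from, which would match the symptom of many records arriving after every restart.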
Please check the Kafka documentation for the meaning of the properties below:
group.id
enable.auto.commit
auto.offset.reset
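As a rough illustration, the three properties could be set together like this. This is a minimal sketch, not your pipeline's actual configuration: the helper `build_consumer_config`, the broker address `localhost:9092` and the group name `kudu-writer` are placeholders I made up, and the values shown are only one reasonable choice.

```python
def build_consumer_config(bootstrap_servers, group_id):
    """Return consumer settings keyed by the Kafka property names."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # Committed offsets are stored per consumer group, so this name must
        # stay the same across restarts for the consumer to resume.
        "group.id": group_id,
        # Commit the consumed offsets periodically in the background.
        "enable.auto.commit": True,
        # Used only when the group has no valid committed offset:
        # "earliest" reads from the start, "latest" reads only new records.
        "auto.offset.reset": "earliest",
    }


# Placeholder values; replace them with your own broker and group name.
config = build_consumer_config("localhost:9092", "kudu-writer")
```

A dict with these dotted keys is the form the confluent-kafka Python client takes; other clients and tools expose the same properties under their own configuration screens, so check how your pipeline tool names them.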
Cheers,
André
Created 07-01-2022 10:02 AM
@roshanbi Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
Regards,
Diana Torres,