Kafka checkpoint

Contributor

Hello Team,

 

We are doing CDC by pushing data to Kafka, and a second pipeline reads that data from Kafka and writes it to Kudu. Whenever we restart the second pipeline, I notice thousands of records coming in.

 

I would like to know how Kafka keeps its checkpoints. Is there a setting to change this behavior?

 

Thanks,

Roshan


Super Guru

@roshanbi ,

 

You must configure your Kafka consumer to use a consumer group and enable offset commits. This way the client will periodically save the last read offset internally in Kafka so that it can pick up from where it left off upon restarts.

 

Please check the Kafka documentation for the meaning of the properties below:

  • group.id

  • enable.auto.commit

  • auto.offset.reset
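
To illustrate how the three properties above fit together, here is a minimal sketch that assumes a standalone Python consumer built on the confluent-kafka client. The thread does not say which tool runs the Kafka-to-Kudu pipeline, so the broker address, the group name "kafka-to-kudu", the topic name, and the helper names build_consumer_config and consume are all hypothetical; in a pipeline tool the same property names would normally be set in the Kafka consumer stage instead.

```python
def build_consumer_config(bootstrap_servers, group_id):
    """Return consumer properties that let a restarted consumer resume
    from the last offset committed for its group."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # A stable group.id ties the committed offsets to this pipeline.
        # If the id changes between restarts, Kafka sees a brand-new group.
        "group.id": group_id,
        # Periodically commit the consumed position back to Kafka.
        "enable.auto.commit": True,
        # Applies only when the group has no valid committed offset:
        # "earliest" replays the topic from the start, "latest" skips to the end.
        "auto.offset.reset": "earliest",
    }


def consume(topic, config, handle):
    """Poll the topic forever and pass each message value to handle()."""
    # Imported lazily so the config helper works without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer(config)
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a message
            if msg is None:
                continue
            if msg.error():
                raise RuntimeError(msg.error())
            handle(msg.value())
    finally:
        # With auto commit enabled, close() also commits the final offsets.
        consumer.close()
```

A call might look like consume("cdc-topic", build_consumer_config("localhost:9092", "kafka-to-kudu"), print), where every argument is a placeholder. If records still reappear after every restart with settings like these, one thing worth checking is whether the group.id really stays the same between runs, since a new group falls back to auto.offset.reset.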

Cheers,

André

 

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Community Manager

@roshanbi Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.


Regards,

Diana Torres,
Community Moderator
