Kafka checkpoint
Labels: Apache Kafka, Apache Kudu
Created 06-28-2022 05:32 PM
Hello Team,
We are doing CDC by pushing data to Kafka, and a second pipeline reads that data from Kafka and writes it to Kudu. Whenever we restart the second pipeline, I notice thousands of records coming in.
How does Kafka keep its checkpoints? Is there a setting to change this behavior?
Thanks,
Roshan
Created on 06-29-2022 01:41 AM - edited 06-29-2022 01:42 AM
You must configure your Kafka consumer to use a consumer group and enable offset commits. This way the client will periodically save the last read offset internally in Kafka so that it can pick up from where it left off upon restarts.
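To make the idea concrete, here is a toy, in-memory sketch of the commit-and-resume behavior. It is not the Kafka API: `ToyOffsetStore` and `consume` are made-up names for illustration, and a plain Python list stands in for a single topic partition.

```python
class ToyOffsetStore:
    """Hypothetical stand-in for the place where Kafka keeps committed offsets."""

    def __init__(self):
        self._committed = {}

    def commit(self, group_id, partition, next_offset):
        # The committed value is the offset of the NEXT record to read.
        self._committed[(group_id, partition)] = next_offset

    def committed(self, group_id, partition):
        # Returns None when this group has never committed for the partition.
        return self._committed.get((group_id, partition))


def consume(log, store, group_id, partition=0, max_records=None, reset="earliest"):
    """Read records starting at the group's committed offset, then commit."""
    start = store.committed(group_id, partition)
    if start is None:
        # No saved position: fall back to a reset policy (start or end of log).
        start = 0 if reset == "earliest" else len(log)
    end = len(log) if max_records is None else min(len(log), start + max_records)
    records = log[start:end]
    store.commit(group_id, partition, end)
    return records


if __name__ == "__main__":
    log = [f"row-{i}" for i in range(5)]
    store = ToyOffsetStore()
    print(consume(log, store, "kudu-loader", max_records=3))  # first run
    print(consume(log, store, "kudu-loader"))                 # after a "restart"
    print(consume(log, ToyOffsetStore(), "kudu-loader"))      # no saved offsets
```

In this toy model, the second call only returns the records that were not read before, while the last call, which has no saved position, starts over from the beginning of the log.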
Please check the Kafka documentation for the meaning of the properties below:
group.id
enable.auto.commit
auto.offset.reset
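As a rough illustration, the three properties above might be set together like this. This is only a sketch: the helper name `build_consumer_config`, the broker address, and the group name are placeholders I made up, and how you pass the settings depends on the tool or client your pipeline uses.

```python
def build_consumer_config(bootstrap_servers: str, group_id: str) -> dict:
    """Build a consumer configuration using the standard Kafka property names."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # Offsets are stored per consumer group, so keep this value the same
        # across restarts of the pipeline.
        "group.id": group_id,
        # Let the client commit the last read offsets periodically.
        "enable.auto.commit": True,
        # Only applies when the group has no committed offset yet:
        # "earliest" starts from the oldest record, "latest" from new records.
        "auto.offset.reset": "earliest",
    }


if __name__ == "__main__":
    # Placeholder broker and group names, for illustration only.
    config = build_consumer_config("broker1:9092", "cdc-kudu-loader")
    for key, value in sorted(config.items()):
        print(f"{key}={value}")
    # A dict like this could then be handed to a Kafka client library,
    # for example confluent_kafka.Consumer(config), if that is what you use.
```

If `group.id` changes on every restart, or offsets are never committed, the consumer has no saved position and `auto.offset.reset` decides where it starts, which could explain a large number of records arriving after a restart.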
Cheers,
André
Created 07-01-2022 10:02 AM
@roshanbi Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
Regards,
Diana Torres, Community Moderator
