Support Questions
Find answers, ask questions, and share your expertise

Kafka -HDP doubt - Current Offset and Commit Offset

Kafka -HDP doubt - Current Offset and Commit Offset

Expert Contributor

I gone with this HDP Kafka PDF (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_kafka-component-guide/bk_kafka-component-guide.pdf)

Can someone brief on this line :

"Kafka consumers keep track of which messages have already been consumed by storing the current offset."

Please also correct my understanding on Current Offset and Commit Offset :

Current Offset is taken care by Kafka Broker .

Commit Offset is taken care by Consumer .

1 REPLY 1
Highlighted

Re: Kafka -HDP doubt - Current Offset and Commit Offset

New Contributor

The offset management is one of the most important activity of kafka.

Offset is the critical value that enables consumer to read position from last read within a partition and topic.

Last committed Offset : When individual is reading from partition,so it has to know about what it has and has not read. So this is called last committed offset.So it has the confirmation about the last message already consumed. So at any given point of time a topics can have multiple partition and consumer can have multiple last committed offset depending upon partitions.

Current position:As consumer reads new records ,it also has information about current position as it is reading new records. There is a big difference between last commited offset and current offset.

There are lot of uncommited offset in between last commited offset and current offset. You can call then as 'un-commited' offset. Depending upon application developer ,it is required to manage offset enable.auto.commit = true makes kakfa the responsibility to manage when current position is committed and made as last committed offset. this is a default setting and kafka doesn't know under what circumstance kafka will commit the records. The only thing can be done is to establish commit interval time using auto.commit.interval= 1000 . But there is a challenge in above setting. If your application processing takes more time than auto.commit.interval time then there is a commit issue and kafka will commit wrong message offset ,which may not have processed due to some issue. To avoid this kind of issue kafka allows user to manage the consumer offset.

First you have to make enable.auto.commit = false

commitSync and commitAsync are two method is used to manage offset manually.

Don't have an account?