
Eliminating export/import Zookeeper offsets on restart of flume (kafka channel).


Explorer

Components Used for data ingestion:

Kafka Channel -> Flume Custom Sink -> DB

 

Current implementation: 

1. Before restarting Flume, I export the ZooKeeper offsets for the kafka.source.groupId consumer group.

2. Restart Flume.

3. Import the offsets. Reason: during the restart some messages fail in the custom sink's processing, and I need to reprocess them.

 

I am not using Flume's transaction rollback capability because, for errors other than the one in point 3 (for example, data issues), a message might fail and be rolled back indefinitely.

 

Is there a better approach than exporting and importing the offsets every time I restart Flume?

2 REPLIES

Re: Eliminating export/import Zookeeper offsets on restart of flume (kafka channel).

Super Collaborator
You shouldn't have to export/import ZooKeeper offsets unless your sink isn't handling batch transactions properly. An offset should be committed once the sink has acknowledged that the batch has been delivered. If your sink is not handling this transaction properly, you would want to start investigating there first.
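
The commit-after-delivery behaviour described above, combined with the concern about data errors being rolled back indefinitely, can be sketched as a small stand-alone model. This is not Flume code: `process_batch`, `RetryableError`, `DataError`, and the dead-letter list are hypothetical names chosen for illustration, and a plain Python list stands in for the Kafka topic. The assumed idea is that transient failures roll the batch back (the offset does not move), while messages that can never succeed are parked so the batch can still commit.

```python
class RetryableError(Exception):
    """Transient failure (e.g. DB unavailable); the batch should be retried."""


class DataError(Exception):
    """Permanent failure caused by the message itself; retrying cannot help."""


def process_batch(log, committed, batch_size, write, dead_letters):
    """Deliver one batch and return the new committed offset.

    log          -- list of messages, standing in for the Kafka topic
    committed    -- offset of the first message not yet acknowledged
    write        -- callable that delivers one message and may raise
    dead_letters -- list collecting messages that can never succeed
    """
    batch = log[committed:committed + batch_size]
    parked = []
    try:
        for message in batch:
            try:
                write(message)
            except DataError:
                # Park the bad message instead of rolling back forever.
                parked.append(message)
    except RetryableError:
        # Roll back: the offset does not move, so the batch is read again.
        return committed
    # Commit only after every message was delivered or parked.
    dead_letters.extend(parked)
    return committed + len(batch)


# Hypothetical demo: "bad" is a data error, "c" fails until the outage ends.
log = ["a", "bad", "c", "d"]
db, dead = [], []
outage = {"active": True}


def write(message):
    if message == "bad":
        raise DataError(message)
    if message == "c" and outage["active"]:
        raise RetryableError("db down")
    db.append(message)


offset = process_batch(log, 0, 3, write, dead)       # rolled back at "c"
outage["active"] = False                             # e.g. after a restart
offset = process_batch(log, offset, 3, write, dead)  # same batch, now commits
offset = process_batch(log, offset, 3, write, dead)  # remaining message
```

In this model the first message is written twice, because the rolled-back batch is replayed from the last committed offset. That is the usual at-least-once trade-off, so a sink built this way would want its DB writes to be idempotent (for example, upserts keyed on a message ID).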

-pd

Re: Eliminating export/import Zookeeper offsets on restart of flume (kafka channel).

Explorer
Thanks for the response.