Eliminating export/import of ZooKeeper offsets on Flume restart (Kafka channel)


Components used for data ingestion:

Kafka Channel -> Flume Custom Sink -> DB


Current implementation: 

1. Before restarting Flume, I export the ZooKeeper offsets for the consumer group (kafka.source.groupId).

2. Restart Flume.

3. Import the offsets. Reason: during the restart some messages fail on the sink side (custom sink), and I need to reprocess them.


I am not using Flume's transaction rollback capability because, for errors other than the one in point 3 (for example, data issues), a message might fail and be rolled back indefinitely.


Is there a better approach than exporting and importing the offsets every time I restart Flume?





Super Collaborator
You shouldn't have to export/import ZooKeeper offsets unless your sink isn't handling batch transactions properly. An offset should be committed once the sink has acknowledged that the batch has been delivered. If your sink is not handling this transaction properly, start investigating there first.
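
To make the batch-transaction point concrete, here is a minimal sketch of the take/commit/rollback pattern a sink is expected to follow. It does not use the real Flume classes: `Transaction`, `Channel` and `Status` below are small stand-ins modelled on the `org.apache.flume` interfaces of the same names, while `Writer`, `DataException`, `MemoryChannel` and the `rejected` list are hypothetical names invented for this illustration. The assumption is that a data error will never succeed on retry, so the sketch sets that event aside and still commits, and it rolls back only for failures that are worth retrying (for example, the DB being unavailable during a restart).

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SinkTransactionSketch {

    /** Stand-in for Flume's Transaction (begin/commit/rollback/close). */
    public interface Transaction { void begin(); void commit(); void rollback(); void close(); }

    /** Stand-in for Flume's Channel; take() returns null when the channel is empty. */
    public interface Channel { Transaction getTransaction(); String take(); }

    /** Stand-in for Flume's Sink.Status. */
    public enum Status { READY, BACKOFF }

    /** Hypothetical: a bad record that would fail again on every retry. */
    public static class DataException extends Exception {
        public DataException(String msg) { super(msg); }
    }

    /** Hypothetical DB writer used by the custom sink. */
    public interface Writer { void write(String event) throws DataException, IOException; }

    /** One sink pass: commit only after the whole batch has been handled. */
    public static Status process(Channel channel, Writer writer,
                                 List<String> rejected, int batchSize) {
        Transaction tx = channel.getTransaction();
        tx.begin();
        try {
            int taken = 0;
            while (taken < batchSize) {
                String event = channel.take();
                if (event == null) break;      // channel drained
                taken++;
                try {
                    writer.write(event);
                } catch (DataException e) {
                    rejected.add(event);       // park the bad record instead of rolling back
                }
            }
            tx.commit();                       // only now may the channel advance its offsets
            return taken == 0 ? Status.BACKOFF : Status.READY;
        } catch (Exception e) {
            tx.rollback();                     // retryable failure: the batch is redelivered
            return Status.BACKOFF;
        } finally {
            tx.close();
        }
    }

    /** Hypothetical in-memory channel that returns taken events on rollback. */
    public static class MemoryChannel implements Channel {
        public final Deque<String> queue = new ArrayDeque<>();
        private final List<String> inFlight = new ArrayList<>();

        public String take() {
            String e = queue.poll();
            if (e != null) inFlight.add(e);
            return e;
        }

        public Transaction getTransaction() {
            return new Transaction() {
                public void begin() { inFlight.clear(); }
                public void commit() { inFlight.clear(); }
                public void rollback() {
                    // Put events back at the head of the queue in their original order.
                    for (int i = inFlight.size() - 1; i >= 0; i--) queue.addFirst(inFlight.get(i));
                    inFlight.clear();
                }
                public void close() { }
            };
        }
    }

    public static void main(String[] args) {
        MemoryChannel channel = new MemoryChannel();
        channel.queue.add("row-1");
        channel.queue.add("malformed");
        channel.queue.add("row-2");
        List<String> stored = new ArrayList<>();
        List<String> rejected = new ArrayList<>();
        Writer writer = e -> {
            if (e.equals("malformed")) throw new DataException(e);
            stored.add(e);
        };
        Status status = process(channel, writer, rejected, 10);
        System.out.println(status + " stored=" + stored + " rejected=" + rejected);
    }
}
```

With a sink shaped like this, a restart in the middle of a batch leaves the transaction uncommitted, so those events are delivered again afterwards without any manual offset handling. The trade-off is at-least-once delivery: events written before a rollback are written again on retry, so the DB write should be idempotent or wrapped in its own database transaction.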


Thanks for the response.