
How to implement a fail-safe Kafka producer

Rising Star

Dear Community,

I have a Kafka producer running to receive server-sent events (SSE).

Unfortunately, the events cannot be buffered at the source for re-retrieval, so I have to make sure that my Kafka producer is always available to receive them. This approach implies the risk that if my producer goes down, I will definitely lose events.

What would be the best approach, from an architectural perspective, to make this mechanism fail-safe?

I assume running two producers against the same source would generate duplicate events, which I would like to avoid ...

Any ideas?

Thanks in advance,

Rainer

6 REPLIES

Re: How to implement a fail-safe Kafka producer

@Rainer Geissendoerfer

See this https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330

Kafka's mirroring feature makes it possible to maintain a replica of an existing Kafka cluster. The diagram on that page shows how to use the MirrorMaker tool to mirror a source Kafka cluster into a target (mirror) Kafka cluster. The tool uses a Kafka consumer to consume messages from the source cluster and re-publishes them to the local (target) cluster using an embedded Kafka producer.
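For reference, the MirrorMaker tool described on that page is started from the command line, roughly like this (the two .properties file names are placeholders; the consumer config points at the source cluster and the producer config at the target):

    bin/kafka-mirror-maker.sh \
        --consumer.config sourceClusterConsumer.properties \
        --producer.config targetClusterProducer.properties \
        --whitelist ".*"

The --whitelist regex selects which topics get mirrored; ".*" mirrors everything.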

Re: How to implement a fail-safe Kafka producer

Re: How to implement a fail-safe Kafka producer

Rising Star

Thanks Neeraj, very good blog. Still, my problem is less on the Kafka cluster/broker side and more on the Unix process / Kafka producer side: I cannot afford my producer going down, because the source sends each event only once. That leads to the idea of running two producers against the same source, but then I have duplicate messages. Of course, two active/active redundant producers could write into different topics, but either way I would have to sort out the deduplication manually afterwards, and that is exactly what I would like to avoid. I am looking for ideas for highly available Kafka producers. Any ideas?
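One way to make those duplicates manageable, sketched here under assumptions not stated in the thread: have both redundant producers attach a stable event identifier (for SSE, the "id:" field of each event) as the Kafka record key, so the two copies of each event carry identical keys and can be recognized downstream. A minimal Java sketch using the kafka-clients library, with placeholder broker and topic names:

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SseToKafkaProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // placeholder brokers
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all");           // send succeeds only once all in-sync replicas have it
            props.put("retries", "2147483647"); // retry transient broker errors

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // eventId and payload would come from the SSE stream; names are hypothetical.
                String eventId = "sse-event-42";          // the SSE "id:" field
                String payload = "{\"example\": true}";

                // Keying by the stable event id means both redundant producers emit
                // records with identical keys, making downstream dedup possible.
                producer.send(new ProducerRecord<>("sse-events", eventId, payload));
            } // close() flushes any buffered sends
        }
    }

Note that Kafka's idempotent producer setting (enable.idempotence=true, available since Kafka 0.11) only deduplicates retries within a single producer session; it does not deduplicate across two independent producers, which is why the stable key matters here.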

Re: How to implement a fail-safe Kafka producer

Mentor

You can run multiple producers against multiple Kafka partitions.
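To complete that picture, the consumer side can then drop the second copy of any key it has already seen. A toy Java sketch, assuming the event-id keying from the sketch above; the in-memory set is a stand-in for what would, in practice, need to be a bounded and persistent dedup store:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Properties;
    import java.util.Set;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class DedupByEventId {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder
            props.put("group.id", "sse-dedup");             // hypothetical consumer group
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            Set<String> seen = new HashSet<>(); // toy cache of already-processed event ids

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("sse-events"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> rec : records) {
                        if (seen.add(rec.key())) {
                            System.out.println("processing " + rec.key()); // first copy: handle it
                        } // else: duplicate from the redundant producer, silently dropped
                    }
                }
            }
        }
    }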

Re: How to implement a fail-safe Kafka producer

@Artem Ervits What's the reason for the downvote?

Re: How to implement a fail-safe Kafka producer

Thanks for clarifying that the problem is less on the Kafka side. Not sure why such good information deserved a downvote.

So the question is how to get HA for the original source. You can solve this problem with true DR: I have seen architectures where the customer first lands the data in a safe/HA zone to avoid data loss.

Please see this https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication @Rainer Geissendoerfer
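A minimal sketch of that landing-zone idea, assuming a durable local (or HA-mounted) filesystem path: append each raw event to a journal, synced to disk, before handing it to the Kafka producer, so a crashed producer can replay unacknowledged events on restart. All names here are hypothetical:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class LandingZone {
        private final Path journal;

        public LandingZone(Path journal) {
            this.journal = journal;
        }

        // Durably append the raw event BEFORE it is handed to the Kafka producer.
        // If the producer process dies, unsent events can be replayed from this
        // file instead of being lost with the one-shot SSE stream.
        public void land(String rawEvent) throws IOException {
            Files.write(journal,
                    (rawEvent + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.APPEND,
                    StandardOpenOption.SYNC); // force the append to disk every time
        }

        public static void main(String[] args) throws IOException {
            LandingZone zone = new LandingZone(Paths.get("/data/safe-zone/sse.journal")); // placeholder path
            zone.land("{\"example\": true}"); // landed durably, now safe to produce to Kafka
        }
    }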
