How does Kafka survive data loss and data duplication in a cluster?


Re: How does Kafka survive data loss and data duplication in a cluster?

Have you gone over the Kafka documentation? Are there specific parts or scenarios, beyond the ones it covers, that you have questions about?

> data loss

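Kafka guards against data loss with topic partition replication. Each partition is copied to as many brokers as the topic's replication factor specifies, and if the broker hosting the partition leader fails, one of the in-sync replicas takes over as leader. How much protection you actually get also depends on the producer: with acks=all, a write is acknowledged only after the in-sync replicas have it, and the topic's min.insync.replicas setting controls how many replicas that must be at minimum.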
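
As a rough illustration of how these settings interact, here is a minimal sketch in Python. The config keys (min.insync.replicas, unclean.leader.election.enable, acks) are standard Kafka names, but the values and the two helper functions are assumptions made up for this example, not part of any Kafka API. The arithmetic also assumes unclean leader election is disabled and that the producer uses acks=all.

```python
# Illustrative values for a hypothetical three-broker cluster.
REPLICATION_FACTOR = 3  # chosen when the topic is created

TOPIC_CONFIG = {
    "min.insync.replicas": 2,                 # replicas that must hold a write before it is acknowledged
    "unclean.leader.election.enable": False,  # never promote an out-of-sync replica to leader
}

PRODUCER_CONFIG = {
    "acks": "all",  # wait for the in-sync replicas, not just the leader
}


def write_availability_tolerance(replication_factor, min_insync_replicas):
    """Replicas that can be offline while acks=all writes are still accepted."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return replication_factor - min_insync_replicas


def durability_tolerance(min_insync_replicas):
    """Replica losses that an acknowledged acks=all write is guaranteed to survive."""
    if min_insync_replicas < 1:
        raise ValueError("min.insync.replicas must be at least 1")
    return min_insync_replicas - 1


writes_ok_with_down = write_availability_tolerance(
    REPLICATION_FACTOR, TOPIC_CONFIG["min.insync.replicas"]
)
acked_data_survives = durability_tolerance(TOPIC_CONFIG["min.insync.replicas"])
```

With these example values, the partition keeps accepting writes with one replica down, and an acknowledged record is on at least two replicas, so it survives the loss of one of them.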

> data duplication

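Kafka does not deduplicate records that your application itself sends more than once. Assuming you're asking about exactly-once processing semantics, that depends on your application and how it uses Kafka: since version 0.11, the idempotent producer and transactions let Kafka avoid duplicates caused by producer retries, but the application has to enable and use them. One write-up of this is at https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/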
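
To make the producer-retry case concrete, here is a toy model in Python of the idea behind the idempotent producer: the broker remembers the last sequence number it appended for each producer and drops a retry that carries a sequence number it has already seen. The class and method names are hypothetical and exist only for this sketch. Real Kafka tracks sequence numbers per producer ID and partition and also rejects out-of-order sequences, which this simplification ignores.

```python
class ToyPartitionLog:
    """Toy stand-in for one partition's log; not a real Kafka API."""

    def __init__(self):
        self.records = []
        self._last_seq = {}  # producer_id -> last sequence number appended

    def append(self, producer_id, sequence, value):
        """Append value unless this (producer_id, sequence) was already written."""
        last = self._last_seq.get(producer_id, -1)
        if sequence <= last:
            return False  # duplicate caused by a retry; drop it
        self.records.append(value)
        self._last_seq[producer_id] = sequence
        return True


log = ToyPartitionLog()
first = log.append("producer-1", 0, "order-created")
retry = log.append("producer-1", 0, "order-created")  # same send retried after a lost ack
second = log.append("producer-1", 1, "order-paid")
```

Here the retried send is dropped, so the log holds each record once. Note that this only covers retries of the same send: if the application calls send twice with the same payload, each call gets a new sequence number and both records are written, which is why deduplication beyond retries is up to the application.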