Kafka at least once not fulfilled

Hi,

I am using kafka-console-producer to cat files and pipe them into a Kafka topic. The topic has 4 partitions with a replication factor of 2, and the cluster has 2 brokers.

After a few days I realized that some events (rows) from the input files are not in Kafka at all. No file is missing entirely, so it is not the case that a file was never read and piped to the Kafka producer; rather, some rows were not "committed" or transferred to the Kafka topic. This happens continuously: out of, say, 200 files, 10 are incomplete, and 0.1% of the data or even less is missing. During this time there was no outage on the system: nothing at the OS level, no service disruption on Kafka, no out-of-memory errors. Everything is green in CDH.
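
One way to pin down exactly which rows are missing is to compare an input file against a dump of the topic. The following is only a minimal sketch, assuming the topic contents have already been exported to a plain-text file with one message per line; the function name `missing_rows` and both file paths are hypothetical.

```python
from collections import Counter


def missing_rows(source_path, topic_dump_path):
    """Return rows present in the source file but absent from the topic dump.

    Rows are compared as multisets, so a row that occurs three times in the
    source and twice in the dump is reported once as missing. Order is
    ignored, since messages spread over 4 partitions arrive interleaved.
    """
    with open(source_path, encoding="utf-8") as src:
        expected = Counter(line.rstrip("\n") for line in src)
    with open(topic_dump_path, encoding="utf-8") as dump:
        delivered = Counter(line.rstrip("\n") for line in dump)
    # Counter subtraction keeps only positive counts, i.e. undelivered rows.
    return expected - delivered
```

Calling `missing_rows("input.txt", "topic_dump.txt")` returns a `Counter` mapping each undelivered row to the number of copies that never arrived, which makes it easier to see whether the lost rows cluster at file boundaries or are scattered.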

 

Is there a reliable way to push a text file to a Kafka topic? I tried the producer property acks=1, but it did not help.
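
For comparison, acks=1 only waits for the partition leader to acknowledge a write, whereas acks=all waits for all in-sync replicas. The sketch below shows what a stricter, per-row confirmed send could look like. It assumes the third-party kafka-python client rather than kafka-console-producer; `send_file` and `make_producer` are hypothetical helper names, and the retry count is an arbitrary illustration, not a recommendation.

```python
def send_file(producer, topic, path, timeout_s=30):
    """Send each row of a file and block until the broker acknowledges it.

    `producer` is expected to behave like kafka-python's KafkaProducer:
    send() returns a future whose get() raises if delivery failed.
    Returns the number of rows that were acknowledged.
    """
    sent = 0
    with open(path, "rb") as rows:
        for row in rows:
            future = producer.send(topic, row.rstrip(b"\n"))
            future.get(timeout=timeout_s)  # raises on delivery failure
            sent += 1
    producer.flush()
    return sent


def make_producer(bootstrap_servers):
    # Imported lazily so send_file can be exercised without the client
    # library or a running broker.
    from kafka import KafkaProducer

    # acks="all" waits for all in-sync replicas; retries resends on
    # transient errors (the value 5 is an assumption for illustration).
    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        acks="all",
        retries=5,
    )
```

Waiting on every future is slow, but any row that fails to reach the topic raises an exception instead of being dropped silently, and the returned count can be checked against the number of rows in the file.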

 

Is it normal for some rows to go missing during the push?

 

Thanks

Tomas

 
