Kafka lossless producer

Hi,

I have a Kafka producer that reads 260k lines from a local file and sends the records to a Kafka topic.

The problem is that, even after playing with some of the primary settings, I can't get it to work without losing some of the records. I have 3 brokers, and the topic is configured with 6 partitions and 3 replicas; the producer settings are below, followed by a simplified sketch of the send loop.

Settings

	metric.reporters = []
	metadata.max.age.ms = 300000
	reconnect.backoff.ms = 50
	sasl.kerberos.ticket.renew.window.factor = 0.8
	bootstrap.servers = [hdp25-s-01:6667, hdp25-s-02:6667, hdp25-s-03:6667]
	ssl.keystore.type = JKS
	sasl.mechanism = GSSAPI
	max.block.ms = 60000
	interceptor.classes = null
	ssl.truststore.password = null
	client.id = Scala-Kafka-Producer
	ssl.endpoint.identification.algorithm = null
	request.timeout.ms = 60000
	acks = 1
	receive.buffer.bytes = 256200
	ssl.truststore.type = JKS
	retries = 0
	ssl.truststore.location = null
	ssl.keystore.password = null
	send.buffer.bytes = 256000
	compression.type = none
	metadata.fetch.timeout.ms = 60000
	retry.backoff.ms = 100
	sasl.kerberos.kinit.cmd = /usr/bin/kinit
	buffer.memory = 33554432
	timeout.ms = 60000
	key.serializer = class org.apache.kafka.common.serialization.StringSerializer
	sasl.kerberos.service.name = kafka
	sasl.kerberos.ticket.renew.jitter = 0.05
	ssl.trustmanager.algorithm = PKIX
	block.on.buffer.full = true
	ssl.key.password = null
	sasl.kerberos.min.time.before.relogin = 60000
	connections.max.idle.ms = 540000
	max.in.flight.requests.per.connection = 5
	metrics.num.samples = 2
	ssl.protocol = TLS
	ssl.provider = null
	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
	batch.size = 16834200
	ssl.keystore.location = null
	ssl.cipher.suites = null
	security.protocol = SASL_PLAINTEXT
	max.request.size = 1048576
	value.serializer = class org.apache.kafka.common.serialization.StringSerializer
	ssl.keymanager.algorithm = SunX509
	metrics.sample.window.ms = 30000
	partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
	linger.ms = 0
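
For context, here is a minimal Scala sketch of the send loop, assuming a plain fire-and-forget send. The file path and topic name are placeholders, only the delivery-related settings from the dump above are shown, and the SASL/Kerberos properties are omitted for brevity:

	import java.util.Properties
	import scala.io.Source
	import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
	import org.apache.kafka.common.serialization.StringSerializer

	object FileProducer {
	  def main(args: Array[String]): Unit = {
	    val props = new Properties()
	    // Only the delivery-related settings are repeated here; the rest match the dump above.
	    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hdp25-s-01:6667,hdp25-s-02:6667,hdp25-s-03:6667")
	    props.put(ProducerConfig.CLIENT_ID_CONFIG, "Scala-Kafka-Producer")
	    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
	    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
	    props.put(ProducerConfig.ACKS_CONFIG, "1")    // current setting
	    props.put(ProducerConfig.RETRIES_CONFIG, "0") // current setting

	    val producer = new KafkaProducer[String, String](props)
	    val source = Source.fromFile("/path/to/input.txt") // placeholder path
	    try {
	      for (line <- source.getLines()) {
	        // Fire-and-forget: the returned Future is not inspected, so individual send failures go unnoticed.
	        producer.send(new ProducerRecord[String, String]("my-topic", line)) // placeholder topic
	      }
	    } finally {
	      source.close()
	      producer.close() // blocks until buffered records have been sent
	    }
	  }
	}
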

The best I have managed so far is getting about 70% of the records across.

Any suggestions on how to configure a completely lossless Kafka producer?