Flume's Kafka Sink - Latency to reach the Queue

Good Morning Everyone!

 

I've been trying to use Flume's Kafka sink to send some transactional information to another system that consumes the Kafka queue.

 

Flume's performance does not seem to be the problem (as far as I know): any message sent to Flume is consumed and passed to the Kafka sink. However, the message does not appear in the Kafka queue for the next 3 seconds, which is too long.

 

I think it might be a Kafka sink configuration issue, but I'm not sure.

 

My Flume setup is like this:

 

- Memory channel

- Custom source (the source pulls data from a database and sends the information through the channel)

- Kafka Sink

 

I start counting the time to reach the Kafka queue from the moment the source sends the message to the channel. This agent does not have to handle a lot of messages (around 1-2 messages per second); however, I'm concerned about the time it takes to reach the Kafka queue.

 

This is my Kafka sink configuration:

 

a3.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k1.brokerList = sbmdeqpc02:9092,sbmdeqpc03:9092,sbmdeqpc04:9092
a3.sinks.k1.topic = aud-50
a3.sinks.k1.batchSize = 10

 

I've tried changing the batchSize configuration, but it doesn't seem to change the latency.

 

This is the description of the topic/queue:

 

Topic:aud-50 PartitionCount:3 ReplicationFactor:1 Configs:retention.ms=86400000
Topic: aud-50 Partition: 0 Leader: 183 Replicas: 183 Isr: 183
Topic: aud-50 Partition: 1 Leader: 181 Replicas: 181 Isr: 181
Topic: aud-50 Partition: 2 Leader: 182 Replicas: 182 Isr: 182

 

Has anyone else had this issue, a Kafka sink taking too long to put messages on the queue?

 

Any help is welcome. Thanks for your help.

 

Kind regards.

 

Rafa

2 REPLIES

Hello Tristan,

Thanks a lot for your response. As you said, the issue was the batchSize configuration of the Kafka sink. Given that I only expected a couple of messages per second, a batchSize of 10 was not needed. Setting batchSize to 1 solved the "latency" I was seeing. I guess that if messages arrive at a rate of thousands per second, a larger batchSize could be much more efficient. In the end I guess it was more of a PEBKAC problem than a Flume problem, haha! 😛
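
As a sketch of the change described above, assuming every other sink setting from the original post was left untouched (the thread does not show the final file), the configuration would look roughly like this:

```properties
# Kafka sink settings from the original post; only batchSize differs.
# Assumption: type, brokerList and topic were not changed as part of the fix.
a3.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k1.brokerList = sbmdeqpc02:9092,sbmdeqpc03:9092,sbmdeqpc04:9092
a3.sinks.k1.topic = aud-50
# Was 10. At around 1-2 messages per second, 1 removed the delay in this setup.
a3.sinks.k1.batchSize = 1
```

With a much higher message rate a larger batchSize may well be the better trade-off, so the value is something to tune against your own load rather than a general recommendation.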

Just to let you know, I'm using a somewhat "older" distribution (CDH 5.5), so I don't have the newer performance improvements you linked. However, the problem went away after changing the batchSize configuration as I said before. We are planning to upgrade our distribution in the coming months, so I hope to use the newer performance enhancements soon!

Again, thanks a lot for your help and have a nice day!

Rafa