Created on 12-05-2016 07:00 AM - edited 09-16-2022 03:50 AM
Good Morning Everyone!
I've been trying to use Flume's Kafka sink to send some transactional information to another system that consumes the Kafka queue.
The problem is not the performance of Flume (that I know of): every message sent to Flume is consumed and passed to the Kafka sink. However, the message does not appear in the Kafka queue for the next 3 seconds, which is too long.
I think it might be a Kafka sink configuration issue, but I'm not sure.
My flume setup is like this:
- Memory channel
- Custom source (the source pulls data from a database and sends the information through the channel)
- Kafka Sink
I start counting the time to reach the Kafka queue from the moment the source sends the message to the channel. This agent does not have to handle a lot of messages (around 1-2 messages per second); however, I'm concerned about the time it takes to reach the Kafka queue.
This is my Kafka sink configuration:
a3.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k1.brokerList = sbmdeqpc02:9092,sbmdeqpc03:9092,sbmdeqpc04:9092
a3.sinks.k1.topic = aud-50
a3.sinks.k1.batchSize = 10
I've tried changing the batchSize configuration, but it doesn't seem to change the latency.
This is the topic description for the topic/queue:
Topic:aud-50 PartitionCount:3 ReplicationFactor:1 Configs:retention.ms=86400000
Topic: aud-50 Partition: 0 Leader: 183 Replicas: 183 Isr: 183
Topic: aud-50 Partition: 1 Leader: 181 Replicas: 181 Isr: 181
Topic: aud-50 Partition: 2 Leader: 182 Replicas: 182 Isr: 182
Has anyone else had this issue, a Kafka sink taking too long to put messages on the queue?
Any help is welcome. Thanks for your help.
Kind regards.
Rafa
Created 12-05-2016 10:51 AM
Hi Rafa,
Sorry to hear you are having trouble with performance. I suspect you are on the right track when it comes to batch sizes, but you may need some further tuning.
Could you start by posting the whole of your agent.conf (i.e. including sources and channels), as it's possible the latency is being introduced elsewhere? Also, what version of Flume/CDH are you running? The configuration of Kafka sinks changed quite dramatically in Flume 1.7 (with the relevant Kafka bits also featuring in CDH 5.8+).
There are some performance tuning tips in http://blog.cloudera.com/blog/2016/08/new-in-cloudera-enterprise-5-8-flafka-improvements-for-real-ti... (although they are geared towards increasing throughput rather than decreasing latency, there will be some relevant settings in there).
As a bit of simple maths: if you are expecting 1-2 messages per second, with a batch size of 10, it could be waiting 5-10 seconds before a batch is received and therefore before sending on. In this instance I'd be looking to tune the batch sizes down to 1 across the board in order to ensure that messages are passed on as soon as they are received.
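For reference, here is a minimal sketch of that change applied to the sink settings you posted. It reuses your agent and sink names (a3, k1) and only alters batchSize. It assumes the property names from your config are the ones your Flume version expects; on Flume 1.7 / CDH 5.8+ the names may differ, so check the user guide for your release.

```properties
a3.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k1.brokerList = sbmdeqpc02:9092,sbmdeqpc03:9092,sbmdeqpc04:9092
a3.sinks.k1.topic = aud-50
# Send each event on its own instead of grouping up to 10 per batch
a3.sinks.k1.batchSize = 1
```

The trade-off is more overhead per event, which should not matter much at 1-2 messages per second.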
Please give that a try, and post some more details about your config and we'll see if we can help.
Tristan
Created 12-05-2016 12:18 PM