08-10-2017 11:31 PM - edited 08-10-2017 11:44 PM
I am developing a plugin that accesses multiple brokers and topics in parallel.
Does it make sense to pool producer/consumer connections for Kafka clients?
Does Kafka internally maintain a list of connection objects that are initialized and ready to use?
We'd like to minimize connection creation time so that there is no additional overhead when sending or receiving messages.
Please share your feedback.
08-11-2017 12:25 AM
There is no need to pool client connections: KafkaProducer and KafkaConsumer normally keep their broker connections open, so you can produce and consume events without paying the connection creation overhead each time. The connections are closed when you call the close() method on each client.
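For a plugin that talks to several clusters, one way to apply this is to keep a single long-lived client per bootstrap address rather than a pool. The following is only a minimal sketch under stated assumptions: ClientRegistry is a hypothetical helper, not part of Kafka, and it is written against AutoCloseable so that it runs without a broker. In real use the type parameter would be a client such as KafkaProducer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not a Kafka class): keeps one long-lived client per
 * cluster instead of a pool. C stands in for a thread-safe client such as
 * KafkaProducer.
 */
final class ClientRegistry<C extends AutoCloseable> implements AutoCloseable {
    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    ClientRegistry(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the client for this bootstrap address, creating it on first use only. */
    C get(String bootstrapServers) {
        return clients.computeIfAbsent(bootstrapServers, factory);
    }

    /** Closes every registered client; intended to be called once at shutdown. */
    @Override
    public void close() throws Exception {
        for (C client : clients.values()) {
            client.close();
        }
        clients.clear();
    }
}
```

With the Kafka client library on the classpath, the factory could be something like servers -> new KafkaProducer<String, String>(propsFor(servers)), where propsFor is an assumed helper that builds the producer Properties for that address.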
As a side note, the producer is thread-safe, and sharing a single producer instance across threads will generally be faster than having multiple instances. The Kafka consumer, however, is NOT thread-safe. All of the consumer's network I/O happens in the thread of the application making the call. It is the responsibility of the user to ensure that multi-threaded access is properly synchronized.
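One common way to respect these rules is to share the producer across threads and give each thread its own consumer. The sketch below shows only that threading structure, under stated assumptions: ThreadConfinedWorkers is a hypothetical helper, the receiver type R stands in for KafkaConsumer, and the sender type S stands in for KafkaProducer, so the code runs without a broker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

/** Hypothetical helper (not a Kafka class) illustrating thread confinement. */
final class ThreadConfinedWorkers {
    /**
     * Runs `workers` tasks. Each task builds its own receiver (stand-in for the
     * non-thread-safe KafkaConsumer) and uses it only on its own thread; all
     * tasks share `sharedSender` (stand-in for the thread-safe KafkaProducer).
     */
    static <R extends AutoCloseable, S> void run(
            int workers, Supplier<R> receiverFactory, S sharedSender, BiConsumer<R, S> work)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> results = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                results.add(pool.submit(() -> {
                    // Created, used, and closed by the same task: never shared.
                    try (R receiver = receiverFactory.get()) {
                        work.accept(receiver, sharedSender);
                    }
                    return null;
                }));
            }
            for (Future<?> result : results) {
                result.get(); // rethrows any worker failure
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no receiver ever crosses a thread boundary, no extra synchronization around it is needed; only the shared sender must be thread-safe.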
08-11-2017 02:01 AM
Thank you Umesh for your quick reply.
Now I am facing some issues while trying to access HBase from Spark code. Details are in the point below.
It would be great if you could offer a suggestion on this topic as well.