I am developing a plugin for accessing multiple brokers and topics in parallel.
Does it make sense to pool the producer/consumer connections of Kafka clients?
Does Kafka internally maintain a list of connection objects that are initialized and ready to use?
We'd like to minimize connection creation time, so that there is no additional overhead when sending or receiving messages.
Please share your feedback.
There is no need to pool the client connections: KafkaProducer and KafkaConsumer normally keep their broker connections open for the lifetime of the client instance, so you can produce and consume events without paying the connection creation overhead on each call. Create each client once and reuse it; the connections are closed when you call the client's close() method.
Also, as a side note, the producer is thread-safe, and sharing a single producer instance across threads will generally be faster than having multiple instances. The Kafka consumer, however, is NOT thread-safe. All network I/O happens in the thread of the application making the call, and it is the responsibility of the user to ensure that multi-threaded access is properly synchronized. The simplest way to do that is to give each thread its own consumer instance.
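To illustrate the two points above, here is a minimal sketch rather than a definitive implementation. It assumes the kafka-clients library (2.0 or later, for poll(Duration)) is on the classpath. The broker address, topic name, group id, thread counts, and the class name SharedClients are placeholders made up for this example. One producer is shared by several sender threads, while each consumer is created and used inside a single thread.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SharedClients {

    // Placeholder values (assumptions): replace with your own broker, topic and group.
    static final String BOOTSTRAP = "localhost:9092";
    static final String TOPIC = "demo-topic";
    static final String GROUP = "demo-group";

    public static Properties producerProps(String bootstrap) {
        Properties p = new Properties();
        p.put("bootstrap.servers", bootstrap);
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return p;
    }

    public static Properties consumerProps(String bootstrap, String groupId) {
        Properties p = new Properties();
        p.put("bootstrap.servers", bootstrap);
        p.put("group.id", groupId);
        p.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        p.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return p;
    }

    public static void main(String[] args) throws Exception {
        // Producer: one instance for the whole application, shared by all sender threads.
        ExecutorService senders = Executors.newFixedThreadPool(4);
        try (KafkaProducer<String, String> producer =
                 new KafkaProducer<>(producerProps(BOOTSTRAP))) {
            for (int i = 0; i < 4; i++) {
                final int id = i;
                senders.submit(() -> {
                    // send() is asynchronous and safe to call from several threads.
                    producer.send(new ProducerRecord<>(TOPIC, "key-" + id, "hello from " + id));
                });
            }
            senders.shutdown();
            senders.awaitTermination(10, TimeUnit.SECONDS);
        } // close() sends any buffered records and closes the broker connections

        // Consumer: not thread-safe, so each worker creates and uses its own instance.
        Runnable worker = () -> {
            try (KafkaConsumer<String, String> consumer =
                     new KafkaConsumer<>(consumerProps(BOOTSTRAP, GROUP))) {
                consumer.subscribe(Collections.singletonList(TOPIC));
                // A real worker would loop until shutdown; ten polls keep the sketch finite.
                for (int n = 0; n < 10; n++) {
                    for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofMillis(500))) {
                        System.out.printf("%s got %s=%s%n",
                            Thread.currentThread().getName(), r.key(), r.value());
                    }
                }
            }
        };
        ExecutorService receivers = Executors.newFixedThreadPool(2);
        receivers.submit(worker);
        receivers.submit(worker);
        receivers.shutdown();
        receivers.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```

Because both consumers use the same group id, Kafka splits the topic's partitions between them. If the topic has only one partition, one of the two workers will simply stay idle.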
Thank you Umesh for your quick reply.
Now I am facing some issues while trying to access HBase from Spark code. Details are in the point below.
It would be great if you could give suggestions on this topic as well.