Kafka optimization

Hi,

 

I would like help optimizing my Kafka cluster (3 servers). I understand that the default configuration is valid, but I want to improve it, for example in the area of log cleaning.

I have seen several forums with differing advice, so I would be grateful for any tips.

 

Greetings

1 ACCEPTED SOLUTION

Hey,

 

How to optimize your Kafka cluster depends on how the cluster is used and on your use case.

 

Depending on your main concern (throughput, CPU utilization, or memory/disk usage), you need to tune different parameters, and some changes will affect other aspects. For example, if the producer's acknowledgments setting (acks) is "all", the partition leader waits until all in-sync replicas have acknowledged the write before confirming it to the producer. This improves durability and consistency, but it increases latency and adds network and CPU overhead.
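
To make the trade-off concrete, here is a minimal sketch that renders two producer presets as producer.properties text. The property names (acks, linger.ms, batch.size, compression.type, enable.idempotence) are standard Kafka producer settings, but the preset names, the specific values, and the producer_properties helper are assumptions for illustration only, not tuned recommendations for your cluster.

```python
# Illustrative trade-off presets for Kafka producer settings.
# Property names are standard producer configs; the preset names and
# values are assumptions for illustration, not tuned advice.
PRESETS = {
    "durability": {
        "acks": "all",                 # wait for all in-sync replicas
        "enable.idempotence": "true",  # avoid duplicates on retry
        "linger.ms": "5",
        "compression.type": "none",
    },
    "throughput": {
        "acks": "1",                   # leader-only acknowledgment
        "enable.idempotence": "false",
        "linger.ms": "50",             # wait longer to fill larger batches
        "batch.size": "131072",
        "compression.type": "lz4",
    },
}


def producer_properties(priority: str) -> str:
    """Render the chosen preset as producer.properties text."""
    try:
        preset = PRESETS[priority]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None
    return "\n".join(f"{key}={value}" for key, value in sorted(preset.items()))


if __name__ == "__main__":
    print(producer_properties("durability"))
```

Whichever direction you pick, measure before and after on your own workload, since the right values depend on message size, replication factor, and hardware.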

 

Refer to the article "Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)" [1], written by Jay Kreps (co-founder and CEO of Confluent).

 

[1]https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap...

 

Please let me know if this helps.

 

Regards,

Ankit.


2 REPLIES

Hi @Peruvian81 

 

It's difficult to make specific suggestions without details about how the cluster is used, but a good starting point is the article below, which covers Kafka best practices.

 

https://community.cloudera.com/t5/Community-Articles/Kafka-Best-Practices/ta-p/249371

 

I hope that helps.

Regards,

Manuel.
