Support Questions

cjervis · ‎10-11-2019

Hi,

I would like help optimizing my Kafka cluster (3 servers). I understand that the default configuration is valid, but I want to improve it, for example the log cleaning settings.

I have read several forums that give differing advice, so I would be grateful for any tips.

Greetings

TonyStank · ‎10-17-2019

Hey,

How best to optimize your Kafka cluster depends on your cluster usage and use case.

Depending on your main concern (throughput, CPU utilization, or memory/disk usage), you need to modify different parameters, and some changes may affect other aspects. For example, if the producer setting acks is set to "all", the partition leader waits until all in-sync replicas have acknowledged a write before confirming it to the producer. This strengthens data consistency and durability, but it increases latency and adds CPU and network overhead.
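As a rough sketch of that trade-off, the two profiles below contrast a durability-first producer with a throughput-first one. The property names (acks, linger.ms, batch.size, compression.type) are standard Kafka producer settings, but the broker host names, the specific values, and the producer_config helper itself are assumptions for illustration only, not recommended settings. Benchmark against your own workload before adopting any of them.

```python
# Illustrative only: the host names and numeric values are assumptions, not tuned recommendations.
BROKERS = "broker1:9092,broker2:9092,broker3:9092"  # hypothetical 3-server cluster


def producer_config(priority):
    """Return a dict of producer properties for a given priority.

    priority: "durability" or "throughput".
    """
    config = {"bootstrap.servers": BROKERS}
    if priority == "durability":
        # Wait for all in-sync replicas to acknowledge: safer, but higher latency.
        config["acks"] = "all"
    elif priority == "throughput":
        # Only the leader acknowledges; batch and compress to send fewer, larger requests.
        config["acks"] = "1"
        config["linger.ms"] = 20          # wait up to 20 ms to fill a batch
        config["batch.size"] = 65536      # batch size in bytes
        config["compression.type"] = "lz4"
    else:
        raise ValueError("priority must be 'durability' or 'throughput'")
    return config


if __name__ == "__main__":
    for name in ("durability", "throughput"):
        print(name, producer_config(name))
```

A dict like this can be handed to whichever client library you use, provided it accepts these property names; check your client's documentation for the exact keys it supports.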

Refer to the article "Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)" [1], written by Jay Kreps (co-founder and CEO of Confluent).

[1]https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap...

Please let me know if this helps.

Regards,

Ankit.


ManuelCalvo · ‎10-15-2019

Hi @Peruvian81

It's difficult to make specific suggestions without details about the cluster usage, but a good starting point is the article below, which covers Kafka best practices.

https://community.cloudera.com/t5/Community-Articles/Kafka-Best-Practices/ta-p/249371

I hope that helps.

Regards,

Manuel.



kafka optimization