kafka optimization

Explorer

Hi,

I would like help optimizing my Kafka cluster (3 servers). I understand that the default configuration is valid, but I want to improve it, for example in the area of log cleaning.

I have seen several forums with differing advice, so I would be grateful for any tips.

Regards

Re: kafka optimization

Contributor

Hey,

Optimizing your Kafka cluster depends on how the cluster is used and on your use case.

Depending on your main concern (throughput, CPU utilization, or memory/disk usage), you need to modify different parameters, and some changes may affect the other aspects. For example, if the producer's acks setting is "all", the partition leader waits until all in-sync replicas have acknowledged the write before it confirms the request to the producer. This strengthens data consistency, but it increases latency and can add load on the cluster.
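As a rough illustration of that trade-off, here is a minimal Python sketch that builds two sets of producer properties and prints them in key=value form. The property names (acks, linger.ms, batch.size, compression.type, bootstrap.servers) are standard Kafka producer settings, but the helper functions, the broker hostnames, and the values are assumptions chosen for illustration, not tuned recommendations.

```python
# Illustrative producer settings only. The hostnames and values below are
# assumptions for this sketch, not recommendations for any real cluster.

def producer_profile(goal: str) -> dict:
    """Return a hypothetical set of Kafka producer properties for a tuning goal."""
    base = {"bootstrap.servers": "broker1:9092,broker2:9092,broker3:9092"}  # hypothetical hosts
    if goal == "durability":
        # Wait for all in-sync replicas: safer, but each request takes longer.
        return {**base, "acks": "all"}
    if goal == "throughput":
        # Leader-only acknowledgment, plus batching and compression.
        return {
            **base,
            "acks": "1",
            "linger.ms": "20",
            "batch.size": "65536",
            "compression.type": "lz4",
        }
    raise ValueError(f"unknown goal: {goal}")


def to_properties(config: dict) -> str:
    """Render the settings in the key=value form used by .properties files."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


if __name__ == "__main__":
    print(to_properties(producer_profile("durability")))
```

Which profile is the better starting point depends on whether losing a message or waiting longer for each acknowledgment is the bigger problem for your workload.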

 

Refer to the article "Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)" [1], written by Jay Kreps (Co-founder and CEO at Confluent).

[1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap...

Please let me know if this helps.


Regards,

Ankit.

Re: kafka optimization

Contributor

Hi @Peruvian81

It's difficult to make specific suggestions without details about how the cluster is used, but a good starting point is the article below, which covers Kafka best practices.

https://community.cloudera.com/t5/Community-Articles/Kafka-Best-Practices/ta-p/249371

I hope that helps.

Regards,

Manuel.
