Created 01-24-2019 05:00 PM
hi all
I want to validate the balance of our topics, including the Replicas and Isr lists.
Here are some examples of a badly balanced topic (master01 is the ZooKeeper server, and we defined a replication factor of 3).
In this example we can see that some Kafka brokers are missing from the Isr of several partitions:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic:mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1016 Replicas: 1016,1017,1018 Isr: 1016,1018
Topic: mno.de.pola.trump Partition: 3 Leader: 1017 Replicas: 1017,1016,1018 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 4 Leader: 1018 Replicas: 1018,1017,1016 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 5 Leader: 1016 Replicas: 1016,1018,1017 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 6 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018
In the following example we can see that broker 1017 is the leader for more partitions than the other brokers:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic:mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1017 Replicas: 1016,1017,1018 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 3 Leader: 1017 Replicas: 1017,1016,1018 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 4 Leader: 1018 Replicas: 1018,1017,1016 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 5 Leader: 1016 Replicas: 1016,1018,1017 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 6 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
In the following example we can see that brokers are missing from the Replicas list:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic:mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1017 Replicas: 1016,1017,1018 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 3 Leader: 1017 Replicas: 1017,1016 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 4 Leader: 1018 Replicas: 1018,1017,1016 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 5 Leader: 1016 Replicas: 1016,1018,1017 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 6 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
And so on.
So my question: I would be happy to hear about a tool that can validate the output of kafka-topics.sh --describe.
Any ideas?
Here is an example of a well-balanced configuration:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic: mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1016 Replicas: 1016,1017,1018 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 3 Leader: 1017 Replicas: 1017,1016,1018 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 4 Leader: 1018 Replicas: 1018,1017,1016 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 5 Leader: 1016 Replicas: 1016,1018,1017 Isr: 1016,1018,1017
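A leader-skew check like the one in the second example could be scripted against the --describe output. The following is only a minimal sketch, not an existing tool: the names ROW_RE, leader_report and SAMPLE are made up for illustration, and it assumes the output format shown in this post. It relies on the Kafka convention that the first broker in the Replicas list is the preferred leader, so a partition led by any other broker is reported.

```python
import re
from collections import Counter

# One partition row of `kafka-topics.sh --describe` output (format as shown above).
ROW_RE = re.compile(r"Partition:\s*(\d+)\s+Leader:\s*(\S+)\s+Replicas:\s*([\d,]+)")


def leader_report(describe_output):
    """Return (leader count per broker, partitions not led by their first replica)."""
    leaders = Counter()
    non_preferred = []
    for partition, leader, replicas in ROW_RE.findall(describe_output):
        leaders[leader] += 1
        # The first entry of the Replicas list is the preferred leader.
        if leader != replicas.split(",")[0]:
            non_preferred.append(int(partition))
    return leaders, non_preferred


# Illustrative sample: the first four partitions of the leader-skew example.
SAMPLE = """\
Topic:mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018,1016 Isr: 1017,1018,1016
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1017 Replicas: 1016,1017,1018 Isr: 1016,1018,1017
Topic: mno.de.pola.trump Partition: 3 Leader: 1017 Replicas: 1017,1016,1018 Isr: 1017,1018,1016
"""

if __name__ == "__main__":
    leaders, non_preferred = leader_report(SAMPLE)
    print("leaders per broker:", dict(leaders))
    print("partitions not on their preferred leader:", non_preferred)
```

In practice the text would come from the real command, for example by piping its output into the script and reading sys.stdin. What counts as "too many" leaders on one broker is a judgment call, so the sketch only reports the counts.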
Created 01-24-2019 06:18 PM
How to Rebalance Topics in a Kafka Cluster
Here is a superb reference by Upendra Mutori to sort out your current issue with rebalancing Kafka topics.
Kafka Monitoring Tools
Any monitoring tool with JMX support should be able to monitor a Kafka cluster. Here are some monitoring tools:
The first one is check_kafka.pl from Hari Sekhon. It performs a complete end-to-end test, i.e. it inserts a message in Kafka as a producer and then extracts it as a consumer. This makes our life easier when measuring service times.
Another useful tool is KafkaOffsetMonitor for monitoring Kafka consumers and their position (offset) in the queue. It aids our understanding of how our queue grows and which consumer groups are lagging behind.
Last but not least, the LinkedIn folks have developed what I think is the smartest tool out there: Burrow. It analyzes consumer offsets and lags over a window of time and determines the consumer status. You can retrieve this status over an HTTP endpoint and then plug it into your favourite monitoring tool (Server Density for example).
There is also Yahoo’s Kafka-Manager. While it does include some basic monitoring, it is more of a management tool. If you are just looking for a Kafka management tool, check out Airbnb’s kafkat.
HTH
Created 01-24-2019 09:07 PM
@Geoffrey Shelton Okot thank you so much for the excellent explanation. These are really great and very useful tools, but as you can understand from my question, I am trying to validate the output. After reviewing all the tools, I found that none of them can validate this simple output, although they do other very useful things.
Created 01-24-2019 09:54 PM
Sorry, I must have misunderstood you. I thought your primary worry was how to balance the topics, including Replicas and Isr, as you depicted in your --describe examples with missing Isr entries. If not, against what metrics do you want to validate the output of kafka-topics.sh --describe? Please help me understand your request.
A Kafka administrator's biggest worry is having Isr's that are NOT in sync. I am happy you found some interesting content in this thread.
Created 01-24-2019 10:18 PM
The goal of the validation is to check the output and find problems such as a missing leader, Kafka broker IDs missing from the Replicas list, or broker IDs missing from the Isr. So I just want to check the output, and if I find problems with the topic, then we need to do a Kafka reassignment.
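For reference, a check along these lines can be written as a small parser over the --describe output. This is only a sketch under assumptions, not an existing tool: the names PARTITION_RE, find_problems and SAMPLE are hypothetical, the expected replica count is taken from the ReplicationFactor field of the header, and a leader that is not a numeric broker ID (such as -1 or none) is treated as missing. kafka-topics.sh itself also offers the --under-replicated-partitions and --unavailable-partitions options with --describe, which may already cover part of this.

```python
import re

# One partition row of `kafka-topics.sh --describe` output.
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn '1017,1018' into [1017, 1018]; an empty field gives []."""
    return [int(x) for x in csv.split(",") if x]


def find_problems(describe_output):
    """Return a list of (partition, message) tuples for suspicious partitions."""
    problems = []
    rf_match = re.search(r"ReplicationFactor:\s*(\d+)", describe_output)
    rf = int(rf_match.group(1)) if rf_match else None
    for m in PARTITION_RE.finditer(describe_output):
        partition = int(m["partition"])
        replicas = parse_ids(m["replicas"])
        isr = parse_ids(m["isr"])
        if not m["leader"].isdigit():  # assumption: "-1" or "none" means no leader
            problems.append((partition, "no leader"))
        if rf is not None and len(replicas) != rf:
            problems.append((partition, f"has {len(replicas)} replicas, expected {rf}"))
        missing = sorted(set(replicas) - set(isr))
        if missing:
            problems.append((partition, f"replicas missing from ISR: {missing}"))
    return problems


# Illustrative sample modeled on the examples earlier in this thread.
SAMPLE = """\
Topic:mno.de.pola.trump PartitionCount:100 ReplicationFactor:3 Configs:
Topic: mno.de.pola.trump Partition: 0 Leader: 1017 Replicas: 1017,1018 Isr: 1017,1018
Topic: mno.de.pola.trump Partition: 1 Leader: 1018 Replicas: 1018,1016,1017 Isr: 1018,1017,1016
Topic: mno.de.pola.trump Partition: 2 Leader: 1016 Replicas: 1016,1017,1018 Isr: 1016,1018
"""

if __name__ == "__main__":
    for partition, message in find_problems(SAMPLE):
        print(f"partition {partition}: {message}")
```

An empty result means none of these three checks fired. It does not prove that the topic is well balanced across brokers, and the decision to run a reassignment still has to be made by the administrator.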
Created 01-24-2019 10:38 PM
Created 01-25-2019 08:00 AM
I untarred the tool kafka-utils-1.8.0.tar.gz
and also read the doc: https://kafka-utils.readthedocs.io/en/latest/kafka_check.html
but I did not succeed in running the check.
Do you use this tool?
Maybe I have not configured some files?
Do you have an example from your environment?
Created 01-25-2019 03:13 PM
No, unfortunately I don't have a test cluster. The configuration looks straightforward: just create a YAML file, e.g. kafka.yaml, in /etc/kafka_discovery, which you export as KAFKA_DISCOVERY_DIR. Look at the README.md file.
Can you mask your sensitive hostnames and share the YAML file you created? I am sure we can sort that out. I can only spin up a single-node Kafka broker this weekend to test.
Please revert