validate Kafka reassignment for partitions

Hi all,

I want to validate the balance of our topics, including the Replicas and Isr lists.

Below are some examples of badly balanced output (master01 is the ZooKeeper server, and we defined a replication factor of 3).

In this example we can see that Kafka brokers are missing from some Isr lists:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic: mno.de.pola.trump  PartitionCount:100  ReplicationFactor:3  Configs:
Topic: mno.de.pola.trump  Partition: 0  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 1  Leader: 1018  Replicas: 1018,1016,1017  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 2  Leader: 1016  Replicas: 1016,1017,1018  Isr: 1016,1018
Topic: mno.de.pola.trump  Partition: 3  Leader: 1017  Replicas: 1017,1016,1018  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 4  Leader: 1018  Replicas: 1018,1017,1016  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 5  Leader: 1016  Replicas: 1016,1018,1017  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 6  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018

In the following example we can see that broker 1017 is the leader for more partitions than the other brokers:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic: mno.de.pola.trump  PartitionCount:100  ReplicationFactor:3  Configs:
Topic: mno.de.pola.trump  Partition: 0  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 1  Leader: 1018  Replicas: 1018,1016,1017  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 2  Leader: 1017  Replicas: 1016,1017,1018  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 3  Leader: 1017  Replicas: 1017,1016,1018  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 4  Leader: 1018  Replicas: 1018,1017,1016  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 5  Leader: 1016  Replicas: 1016,1018,1017  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 6  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
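
Leader skew like this can be spotted by counting leaders per broker and checking whether each leader is the first broker in the Replicas list (Kafka treats the first replica as the preferred leader). Below is a minimal Python sketch, assuming the describe output has been captured as text. The helper name `leader_report` and the `SAMPLE` variable are only for illustration and are not part of Kafka.

```python
import re
from collections import Counter

# One partition line of `kafka-topics.sh --describe` output.
# The topic header line ("PartitionCount:...") does not match this pattern.
LINE_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\d+)\s+Replicas:\s*([\d,]+)"
)

def leader_report(describe_text):
    """Count leaders per broker and list partitions led by a non-preferred replica."""
    leaders = Counter()
    non_preferred = []
    for partition, leader, replicas in LINE_RE.findall(describe_text):
        leaders[int(leader)] += 1
        # The first broker in the Replicas list is the preferred leader.
        if int(leader) != int(replicas.split(",")[0]):
            non_preferred.append(int(partition))
    return leaders, non_preferred

# Partition lines taken from the example above.
SAMPLE = """\
Topic: mno.de.pola.trump  Partition: 0  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 1  Leader: 1018  Replicas: 1018,1016,1017  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 2  Leader: 1017  Replicas: 1016,1017,1018  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 3  Leader: 1017  Replicas: 1017,1016,1018  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 4  Leader: 1018  Replicas: 1018,1017,1016  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 5  Leader: 1016  Replicas: 1016,1018,1017  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 6  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
"""

leaders, non_preferred = leader_report(SAMPLE)
print("leaders per broker:", dict(leaders))
print("partitions with a non-preferred leader:", non_preferred)
```

For these seven partitions the sketch counts four leaders on 1017, two on 1018 and one on 1016, and reports partition 2 as led by a non-preferred replica.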

In the following example we can see that brokers are missing from the Replicas list:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper master01:2181 --topic mno.de.pola.trump
Topic: mno.de.pola.trump  PartitionCount:100  ReplicationFactor:3  Configs:
Topic: mno.de.pola.trump  Partition: 0  Leader: 1017  Replicas: 1017,1018       Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 1  Leader: 1018  Replicas: 1018,1016,1017  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 2  Leader: 1017  Replicas: 1016,1017,1018  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 3  Leader: 1017  Replicas: 1017,1016       Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 4  Leader: 1018  Replicas: 1018,1017,1016  Isr: 1018,1017,1016
Topic: mno.de.pola.trump  Partition: 5  Leader: 1016  Replicas: 1016,1018,1017  Isr: 1016,1018,1017
Topic: mno.de.pola.trump  Partition: 6  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016

And so on...

So my question: I would be happy to know of a tool that can validate the output of kafka-topics.sh --describe.

Any ideas about this?

Here is an example of a well-balanced configuration:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh  --describe --zookeeper  master01:2181 --topic mno.de.pola.trump
Topic: mno.de.pola.trump      PartitionCount:100      ReplicationFactor:3     Configs:
Topic: mno.de.pola.trump     Partition: 0    Leader: 1017    Replicas: 1017,1018,1016        Isr: 1017,1018,1016
Topic: mno.de.pola.trump     Partition: 1    Leader: 1018    Replicas: 1018,1016,1017        Isr: 1018,1017,1016
Topic: mno.de.pola.trump     Partition: 2    Leader: 1016    Replicas: 1016,1017,1018        Isr: 1016,1018,1017
Topic: mno.de.pola.trump     Partition: 3    Leader: 1017    Replicas: 1017,1016,1018        Isr: 1017,1018,1016
Topic: mno.de.pola.trump     Partition: 4    Leader: 1018    Replicas: 1018,1017,1016        Isr: 1018,1017,1016
Topic: mno.de.pola.trump     Partition: 5    Leader: 1016    Replicas: 1016,1018,1017        Isr: 1016,1018,1017
Michael-Bronson
7 REPLIES 7

Master Mentor

@Michael Bronson

How to Rebalance Topics in a Kafka Cluster

Here is a superb reference by Upendra Mutori to sort out your current issue rebalancing kafka topics

Kafka Monitoring Tools

Any monitoring tool with JMX support should be able to monitor a Kafka cluster. Here are some monitoring tools:

The first one is check_kafka.pl from Hari Sekhon. It performs a complete end-to-end test, i.e. it inserts a message in Kafka as a producer and then extracts it as a consumer. This makes our life easier when measuring service times.

Another useful tool is KafkaOffsetMonitor for monitoring Kafka consumers and their position (offset) in the queue. It aids our understanding of how our queue grows and which consumer groups are lagging behind.

Last but not least, the LinkedIn folks have developed what I think is the smartest tool out there: Burrow. It analyzes consumer offsets and lags over a window of time and determines the consumer status. You can retrieve this status over an HTTP endpoint and then plug it into your favourite monitoring tool (Server Density for example).

There is also Yahoo's Kafka-Manager. While it does include some basic monitoring, it is more of a management tool. If you are just looking for a Kafka management tool, check out Airbnb's kafkat.

HTH

@Geoffrey Shelton Okot thank you so much for the excellent explanation. These are really great and very useful tools, but as you can understand from my question, I am trying to validate the output. After reviewing all of these tools, I found that none of them can validate this simple output, although they do other very useful things.

Michael-Bronson

Master Mentor

@Michael Bronson

Sorry, I must have misunderstood you. I thought your primary worry was how to balance the topics, including replicas and Isr, as you depicted in your --describe examples with missing Isr entries. If not, against what metrics do you want to validate the output of kafka-topics.sh --describe? Please help me understand your request.

A Kafka administrator's biggest worry is replicas that are NOT in sync, i.e. missing from the Isr. I am happy you found some interesting content in this thread.

The goal of the validation is to check the output and find problems such as a missing leader, Kafka broker ids missing from the output, or broker ids missing from the Isr. I just want to check the output, and if I find problems on the topic, then we need to do a Kafka reassignment.
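
The checks described here can be sketched in a few lines of Python. This is only a rough illustration, not an existing tool: it assumes the describe output looks like the examples in this thread, and the names `validate_describe` and `SAMPLE` are made up for the example. The exact text printed for a partition without a leader may vary between Kafka versions, so anything that is not a plain broker id is treated as "no leader".

```python
import re

# One partition line of `kafka-topics.sh --describe` output.
PARTITION_RE = re.compile(
    r"Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\w+)\s+"
    r"Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)
# Replication factor from the topic header line.
HEADER_RE = re.compile(r"ReplicationFactor:\s*(\d+)")

def _ids(csv):
    """Turn '1017,1018' into [1017, 1018]; an empty string gives []."""
    return [int(x) for x in csv.split(",") if x]

def validate_describe(text):
    """Return a list of human-readable problems found in the describe output."""
    problems = []
    header = HEADER_RE.search(text)
    replication_factor = int(header.group(1)) if header else None
    for line in text.splitlines():
        m = PARTITION_RE.search(line)
        if not m:
            continue  # header or unrelated line
        p = int(m.group("partition"))
        replicas = _ids(m.group("replicas"))
        isr = _ids(m.group("isr"))
        leader = m.group("leader")
        if not leader.isdigit():
            problems.append(f"partition {p}: no leader ({leader})")
        if replication_factor is not None and len(replicas) != replication_factor:
            problems.append(
                f"partition {p}: {len(replicas)} replicas, expected {replication_factor}"
            )
        missing = sorted(set(replicas) - set(isr))
        if missing:
            problems.append(f"partition {p}: replicas missing from Isr: {missing}")
        extra = sorted(set(isr) - set(replicas))
        if extra:
            problems.append(f"partition {p}: Isr brokers not in Replicas: {extra}")
    return problems

SAMPLE = """\
Topic: mno.de.pola.trump  PartitionCount:100  ReplicationFactor:3  Configs:
Topic: mno.de.pola.trump  Partition: 0  Leader: 1017  Replicas: 1017,1018,1016  Isr: 1017,1018,1016
Topic: mno.de.pola.trump  Partition: 2  Leader: 1016  Replicas: 1016,1017,1018  Isr: 1016,1018
Topic: mno.de.pola.trump  Partition: 3  Leader: 1017  Replicas: 1017,1016       Isr: 1017,1018,1016
"""

for problem in validate_describe(SAMPLE):
    print(problem)
```

An empty result means none of these checks fired. Any reported partition is a candidate for closer inspection before deciding on a reassignment.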

Michael-Bronson

Master Mentor

@Michael Bronson

Have a look at these two tools:

Kafka check

Kafka Tool

Happy hadooping

@Geoffrey Shelton Okot

I untarred the tool kafka-utils-1.8.0.tar.gz

and also read the doc - https://kafka-utils.readthedocs.io/en/latest/kafka_check.html

but I did not succeed in running the check.

Do you use this tool?

Maybe I have not configured some files?

Do you have an example from your environment?

Michael-Bronson

Master Mentor

@Michael Bronson

No, unfortunately, I don't have a test cluster. The configuration looks straightforward: just create a YAML file, e.g. kafka.yaml, in /etc/kafka_discovery, which you export as KAFKA_DISCOVERY_DIR. Have a look at the README.md file.

Can you tokenize your sensitive hostnames and share the YAML file you created? I am sure we can sort that out. I can only spin up a single-node Kafka broker this weekend and test.

Please let me know.