kafka-topics.sh --describe doesn't return anything

New Contributor

I am running a Kafka cluster composed of 3 nodes. One of the nodes crashed and it has been behaving oddly since then.

The following does not return anything on the malfunctioning node:

kafka-topics.sh --describe --zookeeper mynode01:2181 

However, querying the topics on the other nodes returns the expected topics.

Another thing I noticed is that ZooKeeper on that node seems to be missing some znodes:

/zkCli.sh -server mynode01 
[zk: localhost:2181(CONNECTED) 1] ls / 
[controller, zookeeper] 

Whereas if I check any other node it comes back with:

[zk: localhost:2181(CONNECTED) 0] ls /
[isr_change_notification, zookeeper, admin, consumers, config, controller, brokers] 
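To make the difference explicit, the two listings can be compared with a small shell sketch. The function name `missing_znodes` is made up for illustration, and the inputs are assumed to be the bracketed lists exactly as zkCli.sh prints them above:

```shell
# Print the znodes that appear in the first `ls /` listing (healthy node)
# but not in the second one (affected node), one per line.
missing_znodes() {
  # Turn "[a, b, c]" into one name per line.
  mz_healthy=$(printf '%s\n' "$1" | sed 's/[][ ]//g' | tr ',' '\n')
  mz_affected=$(printf '%s\n' "$2" | sed 's/[][ ]//g' | tr ',' '\n')
  for mz_name in $mz_healthy; do
    # -x matches whole lines only, -F treats the name as a literal string.
    printf '%s\n' "$mz_affected" | grep -qxF -- "$mz_name" \
      || printf '%s\n' "$mz_name"
  done
}

missing_znodes \
  "[isr_change_notification, zookeeper, admin, consumers, config, controller, brokers]" \
  "[controller, zookeeper]"
# prints, one per line:
#   isr_change_notification, admin, consumers, config, brokers
```

So everything Kafka stores under /brokers, /config, /admin and /consumers is absent from the ZooKeeper instance on mynode01.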

The logs report the following entry:

Error for partition [myqueue-1,0] to broker 1:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)

I tried a couple of things already to sort this out, with no joy:

  1. Restarted the Kafka cluster, so that another node becomes leader.
  2. Assigned a different leader for the affected topics by running ./kafka-reassign-partitions.sh
  3. Stopped the Kafka and ZooKeeper services on the affected node, removed kafka-logs and zkdata, and started them back up.

Although the cluster seems to be able to treat this node as any other and switch the roles of leader/follower with no issues... it looks like it got out of sync at some point and is not able to recover itself.

Any ideas? Thanks in advance.

1 ACCEPTED SOLUTION

New Contributor

I was able to solve the issue by stopping the ZooKeeper and Kafka services on the affected node and removing the snapshots in the zkdata directory and the associated transaction logs in the zklog directory.

After starting ZooKeeper back up on the affected node, the missing znodes were re-synced.
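For anyone repeating this, here is a rough sketch of the cleanup step. It is not the exact commands I ran: it moves the files into a backup directory instead of deleting them, the function name `archive_zk_state` is made up, and it assumes the usual ZooKeeper layout in which snapshot.* files sit under version-2 in the data directory and log.* files sit under version-2 in the transaction log directory. Both services must already be stopped on the node.

```shell
# Move ZooKeeper snapshots and transaction logs aside so the node starts
# with no local state and has to resync from the rest of the ensemble.
# ASSUMPTION: ZooKeeper and Kafka are already stopped on this node.
# ASSUMPTION: files live in <dir>/version-2 (the usual ZooKeeper layout).
# The myid file in the data directory is deliberately left in place.
archive_zk_state() {
  az_data_dir=$1    # e.g. the zkdata directory
  az_log_dir=$2     # e.g. the zklog directory
  az_backup_dir=$3  # hypothetical backup location

  mkdir -p "$az_backup_dir/snapshots" "$az_backup_dir/txnlogs" || return 1

  for az_file in "$az_data_dir"/version-2/snapshot.*; do
    # The glob stays unexpanded when nothing matches, so check existence.
    if [ -e "$az_file" ]; then
      mv "$az_file" "$az_backup_dir/snapshots/" || return 1
    fi
  done

  for az_file in "$az_log_dir"/version-2/log.*; do
    if [ -e "$az_file" ]; then
      mv "$az_file" "$az_backup_dir/txnlogs/" || return 1
    fi
  done
}

# Example call (all three paths are placeholders):
#   archive_zk_state /path/to/zkdata /path/to/zklog /path/to/zk-backup
```

Keeping a backup makes it possible to put the files back if the node does not rejoin the ensemble as expected.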

Thanks for the help provided 🙂


REPLIES

Expert Contributor

It looks to me like the data in ZooKeeper on that node is not synchronized with the rest of the ensemble, since you cannot get up-to-date information from it. You may need to fix ZooKeeper first.
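One quick check, as a rough sketch: ZooKeeper answers the `srvr` four-letter command on its client port, and the reply contains a `Mode:` line (leader, follower or standalone). The helper name `zk_mode` and the use of `nc` are assumptions here, and the server may restrict four-letter commands through its whitelist setting.

```shell
# Read a `srvr` reply on stdin and print only the value of its "Mode:" line.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Example call against the affected node (needs nc and a reachable server):
#   echo srvr | nc mynode01 2181 | zk_mode
```

If the affected node reports standalone while the other nodes report leader and follower, that would suggest it is not running as part of the same ensemble.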

New Contributor

Hi Frank, Thanks for your reply. Yes, that makes sense... would you be able to suggest something I could try to re-sync zookeeper?

Expert Contributor

Can you check the ZooKeeper log to see what messages it reported?

Expert Contributor

@yeayu Have you solved the issue? If not, could you please share the ZooKeeper log so that I can take a look at what the problem might be?

Expert Contributor

Great to hear that you solved the issue.