Is it possible to purge a topic when a Kafka broker is down?

We have a Hadoop cluster with 3 Kafka machines. We want to purge all the topics in Kafka, as follows:

/usr/hdp/2.6.0.3-8/kafka/bin/kafka-topics.sh --zookeeper master:2181 --alter --topic Topic_Name --config retention.ms=1000

The problem is that two of the Kafka machines are unhealthy: the Kafka broker on kafka01/02 is restarting all the time, or the Kafka broker is down on kafka01/03.

So my question is: can we purge topics even though a Kafka broker is down?

Michael-Bronson

Master Mentor

@Michael Bronson

I highly doubt whether you can delete or purge the topics when the broker is down

PURGE topic

You can drain the topic by expiring its messages: lower retention.ms, give the brokers time to delete the expired segments, then remove the override:

./kafka-topics.sh --zookeeper {ZKADR} --alter --topic topic_name --config retention.ms=1000
./kafka-topics.sh --zookeeper {ZKADR} --alter --topic topic_name --delete-config retention.ms
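
The two commands above need a pause between them, because a running broker only removes expired segments on its periodic retention check (log.retention.check.interval.ms, 5 minutes by default). A broker that is down deletes nothing until it is back up. Here is a minimal sketch of the sequence, assuming the HDP script path and ZooKeeper address from the question. The names purge_topic, WAIT_SECS and DRY_RUN are hypothetical and introduced only for illustration. With DRY_RUN=1 (the default here) it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: expire a topic's messages by temporarily lowering retention.ms.
# Path and ZooKeeper address come from the question; adjust for your cluster.
KAFKA_BIN="${KAFKA_BIN:-/usr/hdp/2.6.0.3-8/kafka/bin}"
ZK="${ZK:-master:2181}"
WAIT_SECS="${WAIT_SECS:-600}"   # hypothetical: longer than the retention check interval
DRY_RUN="${DRY_RUN:-1}"         # hypothetical switch: 1 = print commands only

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

purge_topic() {
  local topic="$1"
  # 1. Lower retention so existing segments become eligible for deletion.
  run "$KAFKA_BIN/kafka-topics.sh" --zookeeper "$ZK" --alter --topic "$topic" --config retention.ms=1000
  # 2. Give the running brokers time to delete the expired segments.
  run sleep "$WAIT_SECS"
  # 3. Remove the override so the topic falls back to the broker default.
  run "$KAFKA_BIN/kafka-topics.sh" --zookeeper "$ZK" --alter --topic "$topic" --delete-config retention.ms
}
```

Calling purge_topic Topic_Name prints the three steps. Set DRY_RUN=0 before calling it to execute them. Afterwards, kafka-topics.sh --describe --topic Topic_Name shows any remaining override in its Configs field, which is one way to confirm the override was removed again.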

How to delete data from topic

To delete manually:

  • Shut down the cluster
  • Clean the Kafka log dir (specified by the log.dir attribute in the Kafka config file) as well as the ZooKeeper data
  • Restart the cluster

For any given topic, what you can do is:

  • Stop Kafka
  • Clean the Kafka logs specific to the topic's partitions. Kafka stores its log files under "logDir/topic-partition", so for a topic named "MyTopic" the log for partition id 0 is stored in /tmp/kafka-logs/MyTopic-0, where /tmp/kafka-logs is the directory specified by the log.dir attribute
  • Restart Kafka
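
To find the directories the per-topic steps refer to, something like the sketch below could be used. It assumes the logDir/topic-partition layout described above. LOG_DIR and the function name topic_partition_dirs are hypothetical, and the real directory should be read from the broker's config file. The sketch only lists directories and removes nothing.

```shell
#!/usr/bin/env bash
# Sketch: list the on-disk partition directories that belong to one topic.
# LOG_DIR is an assumption; take the real value from the broker config file.
LOG_DIR="${LOG_DIR:-/tmp/kafka-logs}"

topic_partition_dirs() {
  local topic="$1" dir name
  for dir in "$LOG_DIR"/"$topic"-*; do
    [ -d "$dir" ] || continue
    name="${dir##*/}"
    # Keep only "<topic>-<number>", so "MyTopic-0" matches
    # but a different topic such as "MyTopic-extra" (dir "MyTopic-extra-0") does not.
    case "${name#"$topic"-}" in
      ''|*[!0-9]*) continue ;;
    esac
    echo "$dir"
  done
}
```

With Kafka stopped on that machine, topic_partition_dirs MyTopic prints one line per partition directory, which can be reviewed before anything is cleaned.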

Hope that helps

Regarding "I highly doubt whether you can delete or purge the topics when the broker is down": so what can we do?

We can't fix the Kafka broker restarts. How can we be sure the purge will do the job even though a broker is down?

Michael-Bronson

The reason we want to purge all topics is that, because of the Kafka broker restarts, many index files (and maybe log files) are corrupted.

Michael-Bronson

Is there any option to keep the Kafka broker up?

Michael-Bronson

@Geoffrey, do you know of any check that reports OK/fail after running the following, or that tells us whether the purge succeeded?

/usr/hdp/2.6.0.3-8/kafka/bin/kafka-topics.sh --zookeeper master:2181 --alter --topic Topic_Name --config retention.ms=1000

Michael-Bronson

Just one important note: we have 3 Kafka machines. kafka01 and kafka03 have the broker restart problem, but kafka02 does not. So my question is: can we purge on kafka02, and will this also affect kafka01/03?

Michael-Bronson

Master Mentor

@Michael Bronson

The manual delete should take care of the broker that is down because, as I reiterated, you have to shut down the cluster (brokers down) anyway.

Now, about your last question regarding kafka02: it helps to know how Kafka stores partitions. Suppose, for example, that you have 3 brokers and create a topic with 6 partitions and replication factor 1.

Each broker will be responsible for 2 partitions. Because the replication factor is 1, data is not replicated, and the data for a particular partition is stored on only one server. So the key here is the replication factor!
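
The 3-broker, 6-partition example can be worked through with a small sketch. It assumes a plain round-robin layout in which partition p lives on broker (p mod N) + 1. A real cluster may start the round-robin at a different broker, so the actual layout should be read from kafka-topics.sh --describe. The function name partitions_on_broker is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: which partitions sit on a given broker with replication factor 1.
# Assumes partition p is placed on broker (p mod N) + 1; this is an
# illustrative layout, not necessarily what a real cluster assigned.

partitions_on_broker() {
  local broker="$1" brokers="$2" partitions="$3" p out=""
  for ((p = 0; p < partitions; p++)); do
    if (( p % brokers + 1 == broker )); then
      out+="${out:+ }$p"
    fi
  done
  echo "$out"
}

for b in 1 2 3; do
  echo "broker $b: $(partitions_on_broker "$b" 3 6)"
done
```

Under this assumed layout, and assuming kafka02 is broker 2, only the two partitions on that broker stay reachable while kafka01 and kafka03 are down. Since nothing is replicated, a purge issued now cannot take effect on the other four partitions until their brokers are running again.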

What's the retention policy? That also plays a role in whether your consumers restart consuming from the beginning!

You could add new Kafka brokers to the cluster and move the existing topics to the new brokers with all the topics intact. See this HCC Kafka document.

You have to weigh your options to avoid data loss, which is why a Kafka deployment should have well-thought-through DR strategies.

Regarding "How to delete data from topic": can we get a step-by-step procedure? As you know, we do not want to delete the topic or the topic partitions.

Michael-Bronson

Regarding "Clean the Kafka log dir (specified by the log.dir attribute in the Kafka config file) as well as the ZooKeeper data":

Do you mean deleting all the 00000000000000000000.index, 00000000000000000000.log and 00000000000000000000.timeindex files from /var/kafka/kafka-logs/?

Michael-Bronson