What is the safest and best way to delete the Kafka topic folders?

On all our Kafka machines (production machines), we see that there is no free space:

df -h /var/kafka
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         11T   11T  2.3M 100% /var/kafka

Under /var/kafka/kafka-logs we see all the topic folders, which are huge, for example:

117G hgpo.llo.prmt.processed-28 
117G hgpo.llo.prmt.processed-29 
117G hgpo.llo.prmt.processed-3 
117G hgpo.llo.prmt.processed-30 
117G hgpo.llo.prmt.processed-31 
117G hgpo.llo.prmt.processed-32

What is the best approach to delete the topics from the folder /var/kafka/kafka-logs, and what are the exact steps to do so (for example, stopping the service before deletion)?

Second important question:

What is the mechanism that is supposed to delete the topics automatically?

Michael-Bronson
1 ACCEPTED SOLUTION

Expert Contributor
@Michael Bronson

1) To find the consumer group related to a topic, you can use the script below (replace test with your topic name):

# List the consumer groups registered in ZooKeeper. Note that cut -d "," -f1
# keeps only the first group in the list.
for i in $(/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh hdpmaster:2181 ls /consumers 2>&1 | grep consumer | cut -d "[" -f2 | cut -d "]" -f1 | cut -d "," -f1)
do
  # Print the group name if it has stored offsets for the topic
  if /usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh hdpmaster:2181 ls "/consumers/$i/offsets" 2>&1 | grep test
  then
    echo "$i"
  fi
done

2) Yes, set retention.ms using: ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config retention.ms=1000

3) As @Jordan Moore mentioned, Kafka has no built-in automation that deletes topics (we have to follow the process described in my earlier reply).

4) "When I delete a topic we get - Topic <....> is marked for deletion. Does it mean that it will take time until the topic is deleted?"

Yes, it does take time to delete a topic. You can also see this message if delete.topic.enable is set to false (the default in Kafka versions before 1.0). As noted previously: to delete a Kafka topic we need to set delete.topic.enable=true, and changing it requires the Kafka service to be restarted.
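Since the "marked for deletion" behaviour depends on delete.topic.enable, it may help to check what a broker is actually configured with before restarting anything. The following is only a sketch: the function name is made up for illustration, and it assumes the broker settings live in a plain server.properties file.

```shell
# Print the delete.topic.enable value found in a broker properties file,
# or "unset" when the key is absent (the broker then falls back to its
# built-in default, which depends on the Kafka version).
delete_topic_enabled() {
  props_file=$1
  # Ignore commented-out lines; take the last assignment found in the file.
  value=$(grep -E '^[[:space:]]*delete\.topic\.enable[[:space:]]*=' "$props_file" \
            | tail -n 1 | cut -d '=' -f2 | tr -d '[:space:]')
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
  else
    printf 'unset\n'
  fi
}
```

It could be called as delete_topic_enabled /usr/hdp/current/kafka-broker/config/server.properties, where that path is an assumption about the HDP layout and should be adjusted to wherever your broker configuration is stored.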


12 REPLIES

Expert Contributor
@Michael Bronson

Below is the process you can follow to delete the Kafka topic and its corresponding directory (e.g. /var/log/kafka-logs):

1) Set the message retention time of the topic to 1000 ms (1 s) using retention.ms, so that the broker purges the existing log segments (it does take some time to delete all the segments and free up the space).

2) Once the space has been freed, we can delete the topic. Note: to delete a Kafka topic we need to set delete.topic.enable=true (it is disabled by default in Kafka versions before 1.0), and changing it requires the Kafka service to be restarted.

3) Now we can delete the Kafka topic using:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --delete --zookeeper <zk-ensemble>:2181 --topic <kafka-topic-name>

4) Once that is done, we need to delete the corresponding consumer offsets from ZooKeeper using:

/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server <zk-ensemble>:2181

[zk: <zk-ensemble>:2181(CONNECTED) 1] rmr /consumers/<consumer-of-topic>/offsets/<kafka-topic-to-delete>

5) Now we can remove the corresponding topic directories from all the broker nodes.

6) This completes the process of deleting the topic and its corresponding folders.
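The commands from these steps can be collected into a dry-run helper that only prints them, so they can be reviewed before anything is run against production. This is a sketch under assumptions: the function name and its three arguments are hypothetical, the paths are the HDP ones quoted in this thread, and the consumer group has to be looked up separately.

```shell
# Print (do not execute) the commands for steps 1, 3 and 4.
# Arguments: topic name, ZooKeeper host:port, consumer group (placeholders).
print_topic_delete_plan() {
  topic=$1
  zk=$2
  group=$3
  kafka_bin=/usr/hdp/current/kafka-broker/bin   # path as used in this thread

  # Step 1: shrink retention so the broker purges old log segments.
  echo "$kafka_bin/kafka-topics.sh --zookeeper $zk --alter --topic $topic --config retention.ms=1000"
  # Step 3: mark the topic for deletion (needs delete.topic.enable=true).
  echo "$kafka_bin/kafka-topics.sh --delete --zookeeper $zk --topic $topic"
  # Step 4: znode with the consumer offsets, to be removed with rmr in zkCli.sh.
  echo "rmr /consumers/$group/offsets/$topic"
}
```

For example, print_topic_delete_plan my-topic zk1:2181 my-group prints three lines. Steps 2 and 5 (enabling deletion and removing directories on each broker) are deliberately left out because they touch broker configuration and local disks.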

Can you please give more detail on your second question, about the mechanism that is supposed to delete topics automatically?

Thanks

Venkat

What is the consumer-of-topic, and how do I find it? For example, if the topic name is hgpo.llo.prmt.processed-32, what is the consumer-of-topic?

Michael-Bronson

Can you give more details about step 1 (setting the message retention time of a topic to 1000 ms using retention.ms)? Can you please describe it step by step? Or do you mean that I need to run the CLI as: ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config retention.ms=1000 ?

Michael-Bronson

About my question on the mechanism that is supposed to delete topics automatically: I am not sure about it, but what I am asking is whether there is an automatic deletion process that removes the topic from the folder. Is it actually defined somewhere?

Michael-Bronson

Super Collaborator

@Michael Bronson Topics are never automatically deleted. The logs are retained for a configured number of bytes (log.retention.bytes) or period of time (log.retention.{hours, minutes, ms}), then the log segments are purged or compacted, which is another Kafka setting (log.cleanup.policy).

All the configurations that you seek are defined in the Kafka documentation, and you should really take these tunables into consideration when installing a production Kafka cluster.
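Because the time-based settings are easy to mix up (hours at the broker level, milliseconds for a per-topic retention.ms), a tiny conversion helper can avoid off-by-a-factor mistakes. This is just an illustrative sketch; the function name is hypothetical.

```shell
# Convert a retention period given in hours to milliseconds,
# the unit expected by retention.ms.
hours_to_ms() {
  hours=$1
  echo $(( hours * 60 * 60 * 1000 ))
}
```

For instance, --config retention.ms=$(hours_to_ms 24) could be appended to the kafka-topics.sh --alter command quoted elsewhere in this thread to keep one day of data.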

Just one note: when I delete a topic we get "Topic <....> is marked for deletion." Does it mean that it will take time until the topic is deleted?

Michael-Bronson

@Venkata thank you. I ran the script with the topic name (hgpo.llo.prmt.processed) instead of test, and no consumer group was found, I mean there was no output from the script. Is that possible?

Second, as you know, we get "Topic mcapi.avro.pri.processed is already marked for deletion" after we run the delete command for the Kafka topic. We ran it yesterday, so why is it taking such a long time? (In my Kafka, delete.topic.enable=true is set by default.)

So in that case, can we remove the topic folders under /var/kafka/kafka-logs with rm -rf "topic name"? For example:

rm -rf hgpo.llo.prmt.processed-28

rm -rf hgpo.llo.prmt.processed-29

and so on for all the other topic folders:

117G hgpo.llo.prmt.processed-3
117G hgpo.llo.prmt.processed-30
117G hgpo.llo.prmt.processed-31
117G hgpo.llo.prmt.processed-32
...

Michael-Bronson
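Before touching anything on disk, it is worth being precise about which directories belong to a topic: each folder under kafka-logs is one partition, named <topic>-<partition number>. The read-only sketch below lists them and deletes nothing; the function name is hypothetical.

```shell
# List the partition directories of one topic under a Kafka log directory.
# Only names of the exact form <topic>-<digits> are printed, so a topic such
# as "foo" does not pick up the folders of a topic called "foo-bar".
list_partition_dirs() {
  topic=$1
  log_dir=$2
  for d in "$log_dir"/"$topic"-*; do
    [ -d "$d" ] || continue
    suffix=${d##*-}                     # text after the last dash
    case $suffix in
      ''|*[!0-9]*) continue ;;          # not a partition number
    esac
    [ "$(basename "$d")" = "$topic-$suffix" ] || continue
    printf '%s\n' "$d"
  done
}
```

For the folders in this thread, list_partition_dirs hgpo.llo.prmt.processed /var/kafka/kafka-logs would print one path per partition, which can then be fed to du -sh to see how much space the topic occupies on that broker.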

Rising Star

@Michael Bronson, I changed the script, since it wasn't parsing the consumers (by the way, great script, thanks):

topico="entrada"
# Split the full comma-separated group list into one group per line
for i in $(/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh sr-hadctl-xt01:2181 ls /consumers 2>&1 | grep consumer | cut -d "[" -f2 | cut -d "]" -f1 | tr ',' "\n")
do
  # Print the group name if it has stored offsets for the topic
  if /usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh sr-hadctl-xt01:2181 ls "/consumers/$i/offsets" 2>&1 | grep "$topico"
  then
    echo "$i"
  fi
done
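One fragile spot that remains is the grep consumer filter, which only picks up the list line when it happens to contain the string "consumer". A possible alternative is to select the bracketed line itself. The sketch below assumes that zookeeper-shell.sh prints the children as a single line of the form [a, b, c], which is what the cut and tr pipeline above already relies on; the function name is made up.

```shell
# Read zookeeper-shell output on stdin and print each listed child znode
# on its own line. Only the last line starting with "[" is used.
parse_zk_children() {
  grep '^\[' | tail -n 1 \
    | sed 's/^\[//; s/\].*$//' \
    | tr -d ' ' | tr ',' '\n' | grep -v '^$'
}
```

It would replace the grep/cut/tr chain in the for loop, for example: zookeeper-shell.sh <host>:2181 ls /consumers 2>&1 | parse_zk_children.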