Created 11-30-2017 02:29 PM
On all of our Kafka machines (production machines), the disk has no free space:
df -h /var/kafka
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdb    11T   11T   2.3M   100%  /var/kafka
Under /var/kafka/kafka-logs, all the topic folders are huge, for example:
117G hgpo.llo.prmt.processed-28
117G hgpo.llo.prmt.processed-29
117G hgpo.llo.prmt.processed-3
117G hgpo.llo.prmt.processed-30
117G hgpo.llo.prmt.processed-31
117G hgpo.llo.prmt.processed-32
What is the best approach for deleting the topic(s) from the folder /var/kafka/kafka-logs, and what are the exact steps to do so (for example, stopping the service before deletion)?
Second important question:
What is the mechanism that is supposed to delete topics automatically?
Created 11-30-2017 03:47 PM
Below is the process you can follow to delete the Kafka topic and the corresponding directory (i.e. /var/log/kafka-logs):
1) Set the message retention time of the topic to 1000 ms (1 s) using retention.ms so that the old messages are purged (it does take some time to delete all the logs and free up the log segments).
2) Once the space is freed up, we can delete the topic. Note: to delete a Kafka topic we need to set delete.topic.enable=true (topic deletion is disabled by default), and changing it requires the Kafka service to be restarted.
3) Now we can delete the kafka topic using:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --delete --zookeeper <zk-ensemble>:2181 --topic <kafka-topic-name>
4) Once that is done, we need to delete the corresponding consumer offsets from ZooKeeper using:
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server <zk-ensemble>:2181
[zk: <zk-ensemble>:2181(CONNECTED) 1] rmr /consumers/<consumer-of-topic>/offsets/<kafka-topic-to-delete>
5) Now we can remove the corresponding topic directories from all the broker nodes.
6) This completes the process of deleting the topic and the corresponding folders.
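Steps 1 and 3 above could be wrapped roughly as follows. This is only a sketch: the run, shrink_retention and delete_topic helpers and the DRY_RUN and KAFKA_BIN variables are made up for illustration, and the HDP path is taken from the commands in this thread. By default nothing is executed; the commands are just printed so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of steps 1 and 3. With DRY_RUN=1 (the default here) the commands
# are only printed; set DRY_RUN=0 to actually run them.
KAFKA_BIN="${KAFKA_BIN:-/usr/hdp/current/kafka-broker/bin}"   # assumed install path
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: shorten retention so the broker purges the existing segments.
shrink_retention() {   # usage: shrink_retention <zk-host:port> <topic>
    run "$KAFKA_BIN/kafka-topics.sh" --zookeeper "$1" --alter \
        --topic "$2" --config retention.ms=1000
}

# Step 3: mark the topic for deletion (needs delete.topic.enable=true).
delete_topic() {       # usage: delete_topic <zk-host:port> <topic>
    run "$KAFKA_BIN/kafka-topics.sh" --delete --zookeeper "$1" --topic "$2"
}
```

For example, shrink_retention <zk-ensemble>:2181 <kafka-topic-name> prints the alter command from step 1. Wait until the disk space is actually freed before calling delete_topic.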
Can you please give more detail on the second question: what is the mechanism that is supposed to delete topics automatically?
Thanks
Venkat
Created 11-30-2017 04:10 PM
What is the consumer-of-topic, and how do I find it? For example, if the topic name is hgpo.llo.prmt.processed-32, then what is the consumer-of-topic?
Created 11-30-2017 05:14 PM
Can you give more details about step 1 (setting the message retention time of a topic to 1000 ms using retention.ms)? Can you please describe it step by step? Or do you mean that I need to run the CLI as: ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config retention.ms=1000 ?
Created 11-30-2017 05:20 PM
About my question (what is the mechanism that is supposed to delete topics automatically?): I am not sure about it, but what I am asking about is an automatic deletion process that removes the topic from the folder. Is it actually defined somewhere?
Created 11-30-2017 09:06 PM
@Michael Bronson Topics are never automatically deleted. The logs are retained for a configured number of bytes (log.retention.bytes) or period of time (log.retention.{hours, minutes, ms}), then the log segments are purged or compacted, which is another Kafka setting (log.cleanup.policy).
All the configurations that you seek are defined in the Kafka documentation, and you should really take these tunables into consideration when installing a production Kafka cluster.
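One practical detail when moving between these settings: the broker-level setting can be given in hours (log.retention.hours), while the topic-level override used elsewhere in this thread (retention.ms) is in milliseconds. A tiny conversion helper, as a sketch (hours_to_ms is a made-up name, not a Kafka tool):

```shell
# Convert a retention period in hours to the millisecond value that the
# topic-level retention.ms setting expects.
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

hours_to_ms 168   # 168 hours (7 days) -> 604800000
```

Also keep in mind that log.retention.bytes applies per partition, so the disk space a topic can hold is roughly that value multiplied by the number of partitions stored on the broker.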
Created 11-30-2017 09:36 PM
Just one note: when I delete a topic we get "Topic <....> is marked for deletion." Does it mean that it will take time until the topic is deleted?
Created 12-01-2017 07:15 AM
1) To find the consumer group related to a topic, you can use the script below (replace test in the grep with your topic name):
for i in `/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh hdpmaster:2181 ls /consumers 2>&1 | grep consumer | cut -d "[" -f2 | cut -d "]" -f1 | cut -d "," -f1`
do
/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh hdpmaster:2181 ls /consumers/$i/offsets 2>&1 | grep test
if [ $? == 0 ]
then
echo $i
fi
done
2) Yes, set retention.ms using: ./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config retention.ms=1000
3) As @Jordan Moore mentioned, there is no automation built into Kafka to delete a topic (we have to follow the process defined above).
4) "When I delete a topic we get - Topic <....> is marked for deletion. Does it mean that it will take time until the topic is deleted?"
Yes, it does take time to delete a topic. You can also see this message if delete.topic.enable is set to false (which is the default setting). As noted previously: to delete a Kafka topic we need to set delete.topic.enable=true, which requires the Kafka service to be restarted.
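To see whether a topic is still waiting to be deleted, one option is to look at the /admin/delete_topics znode, where topics marked for deletion are listed. The sketch below assumes that zookeeper-shell.sh prints the children as a single line of the form [a, b, c]; zk_list_to_lines and is_pending_deletion are made-up helper names, and KAFKA_BIN is the HDP path used in this thread.

```shell
KAFKA_BIN="${KAFKA_BIN:-/usr/hdp/current/kafka-broker/bin}"   # assumed install path

# Turn the "[a, b, c]" line printed by the ZooKeeper shell's ls command
# into one name per line; other lines on stdin are ignored.
zk_list_to_lines() {
    grep '^\[.*\]$' | tr -d '[] ' | tr ',' '\n' | sed '/^$/d'
}

# Succeeds if the topic is listed under /admin/delete_topics.
is_pending_deletion() {   # usage: is_pending_deletion <zk-host:port> <topic>
    "$KAFKA_BIN/zookeeper-shell.sh" "$1" ls /admin/delete_topics 2>&1 \
        | zk_list_to_lines | grep -qxF "$2"
}
```

If the topic stays in that list for a long time, it is worth checking that delete.topic.enable=true is really in effect on every broker and that all brokers hosting the topic are up.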
Created 12-01-2017 08:31 AM
@Venkata thank you. I ran the script with the topic name (hgpo.llo.prmt.processed) instead of test, and no consumer group was found, I mean there was no output from the script. Is that possible?
Second, as you know, we get "Topic mcapi.avro.pri.processed is already marked for deletion" when we run the topic delete command, and we first ran it yesterday. Why does it take such a long time? (In my Kafka, delete.topic.enable=true is set by default.)
So in that case, can we remove the topic under /var/kafka/kafka-logs with rm -rf "topic name"?
For example:
rm -rf hgpo.llo.prmt.processed-28
rm -rf hgpo.llo.prmt.processed-29
and so on for all other topics
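Before running any rm -rf by hand, it may help to first list exactly which partition directories belong to the topic. The sketch below only prints them and deletes nothing; list_partition_dirs is a made-up helper name, and it assumes partition directories are named <topic>-<partition number>, as in the listing at the top of this thread. Doing the manual removal with the broker stopped is a common precaution that is assumed here, not something confirmed in this thread.

```shell
# Print the partition directories of one topic under a Kafka log directory.
# Only directories named <topic>-<number> are printed, so a topic "foo"
# does not pick up the directories of another topic such as "foo-bar".
list_partition_dirs() {   # usage: list_partition_dirs <log-dir> <topic>
    for d in "$1/$2"-*; do
        [ -d "$d" ] || continue
        suffix="${d#"$1/$2-"}"
        case "$suffix" in
            ''|*[!0-9]*) ;;        # suffix is not a plain number: skip
            *) echo "$d" ;;
        esac
    done
}
```

For example, list_partition_dirs /var/kafka/kafka-logs hgpo.llo.prmt.processed would print hgpo.llo.prmt.processed-28, hgpo.llo.prmt.processed-29 and so on with their full paths, which can then be reviewed before removal.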
Created 01-30-2018 03:49 PM
@Michael Bronson, I changed the script, since it wasn't parsing the consumers (by the way, great script, thanks):
topico="entrada"
for i in `/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh sr-hadctl-xt01:2181 ls /consumers 2>&1 | grep consumer | cut -d "[" -f2 | cut -d "]" -f1 | tr ',' "\n"`
do
/usr/hdp/current/kafka-broker/bin/zookeeper-shell.sh sr-hadctl-xt01:2181 ls /consumers/$i/offsets 2>&1 | grep $topico
if [ $? == 0 ]
then
echo $i
fi
done