How to take Kafka Topic Backup and restore?

We need to back up every Kafka topic to a file named after the topic, and restore individual topics on request. Note: the script has to run in a Kerberized environment.

kafkabackup.sh

Making required directories

monyear=$(date +%b%Y)
dat=$(date +%b%d%Y)
export BACKUPDIR=/root/backup/$monyear
mkdir -p "$BACKUPDIR"
mkdir -p "$BACKUPDIR/$dat"
cd "$BACKUPDIR"
BKDIR=$BACKUPDIR/$dat

Log into Kafka

Get topics from Kafka Broker

kinit -kt /etc/security/keytabs/kafka.service.keytab kafka/node1.localdomaino@domain.co
cd /usr/hdp/current/kafka-broker/bin/
export KAFKA_CLIENT_KERBEROS_PARAMS="-Djava.security.auth.login.config=/etc/kafka/conf/kafka_client_jaas.conf"
./kafka-topics.sh --zookeeper adminnode.localdomain:2181 --list > $BKDIR/listtopics.txt

Remove any topics that are marked for deletion

sed -i.bak '/deletion/d' $BKDIR/listtopics.txt

Starting kill script in parallel

bash checkandkill.sh "$BKDIR/listtopics.txt" &

Reading the file contents for topics

for line in $(cat "$BKDIR/listtopics.txt")
do
    echo "$line"
    ./test.sh --bootstrap-server node1.localdomain:6668 --topic "$line" --consumer.config /home/kafka/conf.properties --from-beginning --security-protocol SASL_SSL > "$BKDIR/$line"
done

Delete empty files

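# Search the backup directory explicitly; the working directory at this point is the Kafka bin directory
/usr/bin/find "$BKDIR" -type f -size 0 -delete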

Killing checkandkill daemon and exit

pkill -f checkandkill.sh
exit

A running consumer keeps waiting for new messages and never exits on its own, so the process has to be killed.

checkandkill.sh

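# Optional first argument: path to the topic list written by kafkabackup.sh
topics_file=${1:-/root/backup/listtopics.txt}
sleep 30
for line in $(cat "$topics_file")
do
    echo "$line"
    sleep 60
    ps -ef | grep -i "$line" | grep -v grep | awk '{print $2}' | xargs kill
done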

I need help completing the restoration script.
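For reference, the restore side could be sketched roughly as follows: each backup file is fed line by line into a console producer for the topic of the same name. This is an untested sketch under several assumptions. `restore_topics`, `PRODUCER_CMD`, `BROKER_LIST` and `PRODUCER_CONFIG` are hypothetical names. The default producer path assumes the stock `kafka-console-producer.sh` sits in the same bin directory used above, and the properties file is assumed to carry the SASL_SSL settings the producer needs. Message keys, partitions and offsets are not preserved by a plain-text dump like this.

```shell
#!/usr/bin/env bash
# Sketch only: replay backup files through a console producer.
# All three variables are hypothetical knobs with assumed defaults.
PRODUCER_CMD="${PRODUCER_CMD:-/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh}"
BROKER_LIST="${BROKER_LIST:-node1.localdomain:6668}"
PRODUCER_CONFIG="${PRODUCER_CONFIG:-/home/kafka/conf.properties}"

# Usage: restore_topics <backup_dir> [topic ...]
# With no topics given, every backup file in the directory is restored.
restore_topics() {
    local backup_dir=$1
    shift
    local topics=("$@")
    local file topic

    if [ "${#topics[@]}" -eq 0 ]; then
        for file in "$backup_dir"/*; do
            if [ -f "$file" ]; then
                topics+=("$(basename "$file")")
            fi
        done
    fi

    for topic in "${topics[@]}"; do
        # Skip the helper files the backup script leaves next to the dumps.
        case "$topic" in
            listtopics.txt|listtopics.txt.bak) continue ;;
        esac
        file="$backup_dir/$topic"
        if [ ! -f "$file" ]; then
            echo "No backup file for topic $topic, skipping" >&2
            continue
        fi
        echo "Restoring $topic" >&2
        # Each line of the backup file becomes one message on the topic.
        "$PRODUCER_CMD" --broker-list "$BROKER_LIST" --topic "$topic" \
            --producer.config "$PRODUCER_CONFIG" < "$file"
    done
}
```

Calling `restore_topics "$BKDIR"` would replay every dump in that directory, while `restore_topics "$BKDIR" mytopic` would restore only the topic the user asked for (`mytopic` being a placeholder name).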

1 REPLY

Hi @Krishnaraj V!

I'm not sure I follow, but are you trying to kill the consumer that you're creating?

If so, I'd grep for both kafka.tools.ConsoleConsumer and the topic name.
It would look something like this:

sleep 30
for line in $(cat /root/backup/listtopics.txt)
do
    echo "$line"
    sleep 60
    # Match only console consumer processes for this topic; note there is no space in -9
    ps -ef | grep -i kafka.tools.ConsoleConsumer | grep -i "$line" | grep -v grep | awk '{print $2}' | xargs kill -9
done
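
Another option that avoids matching process names altogether is to bound each consumer run with `timeout` from GNU coreutils, which sends SIGTERM once the limit is reached and exits with status 124. The following is a minimal sketch, assuming `timeout` is installed; `run_bounded` and `CONSUMER_TIMEOUT` are hypothetical names, and the 60 second default simply mirrors the one minute sleep above.

```shell
#!/usr/bin/env bash
# Hypothetical knob: maximum number of seconds one consumer may run.
CONSUMER_TIMEOUT="${CONSUMER_TIMEOUT:-60}"

# Run the given command, stopping it after CONSUMER_TIMEOUT seconds.
run_bounded() {
    local status=0
    timeout "$CONSUMER_TIMEOUT" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # 124 means the limit was hit, which is the expected way
        # for a consumer that waits forever to stop.
        return 0
    fi
    return "$status"
}
```

In the backup loop, the consumer command would then be prefixed with `run_bounded`, and the separate kill script would no longer be needed.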

Hope this helps!