
Prevent Kafka disks from reaching 100% usage with a cron job

We would like to propose the following, based on the problems we have had with Kafka disks.

We have many HDP clusters (managed by Ambari; all machines run Red Hat 7.2).

Each cluster includes 3 Kafka machines, and each Kafka machine has a disk of ~15 TB.

We have had many incidents in which the disk reached 100% used capacity (for some reason Kafka retention does not work as it should).

So we are considering a cron job that runs on the Kafka machines every minute.

When Kafka disk usage reaches a threshold, for example ~90%,

the cron job will stop all Kafka brokers (the Kafka service).

This way we prevent the Kafka disk from reaching 100% (as everyone knows, once the disk is 100% full the purging process no longer works).
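
The check described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the data directory path, and the stop command are hypothetical placeholders and are not taken from this cluster's configuration. The threshold of 90% is the example value from the post.

```python
import shutil
import subprocess


def used_percent(path):
    """Return the used share (in percent) of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def should_stop(percent_used, threshold=90.0):
    """True once disk usage has reached the threshold."""
    return percent_used >= threshold


def check_and_stop(path, stop_cmd, threshold=90.0, run=subprocess.run):
    """Run `stop_cmd` if the filesystem holding `path` is at or above `threshold`.

    `stop_cmd` is whatever command stops the broker on your hosts
    (hypothetical here). `run` is injectable so the logic can be
    exercised without actually stopping anything.
    """
    if should_stop(used_percent(path), threshold):
        run(stop_cmd, check=True)
        return True
    return False
```

A script scheduled every minute would then call something like `check_and_stop("/path/to/kafka-logs", ["<your-broker-stop-command>"])`, with both arguments replaced by the real values for the host. Note that stopping a broker only halts further writes; it does not free any space by itself.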

Please share your opinion



Hi @Michael Bronson,

Any insight into why you think "retention does not work as it should"?

It would also be helpful if you could provide more details about how your Kafka cluster is used. Is data flooding in steadily, or are there heavy spikes that lead to _partition full_? How many producers run in parallel, how many topics are there and with what replication, etc.?

How did you configure the retention?

Regards, Gerd

Here are the details:

Kafka retention hours - 7 days

Kafka retention bytes - 130G (I converted the value to 130G)
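
One thing that may be worth checking with these values: Kafka enforces the size-based retention limit (`log.retention.bytes` on the broker, `retention.bytes` on a topic) per partition, not per broker or per disk. Assuming the 130G value above is that setting, the rough arithmetic below shows how many partitions on one broker could fill a ~15 TB disk before size-based retention removes anything. The partition count of 200 used in the example is purely hypothetical, and decimal units (GB = 10^9 bytes) are assumed.

```python
GB = 10**9   # decimal gigabyte, an assumption about how "130G" is meant
TB = 10**12  # decimal terabyte


def worst_case_bytes(retention_bytes, partitions_on_broker):
    """Upper bound on retained data if every partition reaches its size limit."""
    return retention_bytes * partitions_on_broker


def partitions_that_fit(disk_bytes, retention_bytes):
    """How many full partitions fit on the disk before it runs out of space."""
    return disk_bytes // retention_bytes


fit = partitions_that_fit(15 * TB, 130 * GB)
print("partitions that fit on a 15 TB disk:", fit)

# Hypothetical broker hosting 200 partition replicas:
print("worst case with 200 partitions (TB):", worst_case_bytes(130 * GB, 200) / TB)
```

If a broker hosts more partition replicas than the first number, the size limit alone cannot keep the disk below 100%, and only the 7-day time limit would bound the data.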
