Kafka does not delete expired data automatically.


Hi,

The HDP version is HDP-2.4.2.0.

Below is the list of Kafka settings:

log.roll.hours:96

log.retention.hours:48

log.cleanup.interval.mins:10

log.retention.bytes:2199023255552

offsets.retention.check.interval.ms:600000

offsets.retention.minutes:86400000

log.retention.check.interval.ms:300000

But I can still find data that was generated more than two days ago, and no data is deleted automatically.

Can anyone help me?

1 ACCEPTED SOLUTION


I finally found the cause by checking the timestamps of the log files. When the Kafka service is restarted, the log file's create_time is reset to the current time, and the expiry time is calculated from that timestamp. So the retention period effectively starts over at every restart.

For example: if I restart all Kafka services at 5 PM and the retention time is set to 2 hours, the data will start being deleted at 7 PM.
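
To make the arithmetic in that example concrete, here is a minimal Python sketch. It is not part of Kafka: the function names are made up for illustration, the date is hypothetical, and reading the file's last-modified time is only an assumption about which timestamp the broker consults. The log directory you pass in would be whatever `log.dirs` points to on your broker.

```python
from datetime import datetime, timedelta
from pathlib import Path


def eligible_for_deletion_at(segment_time: datetime, retention_hours: float) -> datetime:
    """Return the earliest time at which a segment has outlived the retention period."""
    return segment_time + timedelta(hours=retention_hours)


def report_segments(log_dir: str, retention_hours: float) -> list:
    """List (file name, file timestamp, eligible-at time) for every *.log segment.

    Assumption: the file's last-modified time stands in for the timestamp
    the broker uses when it decides whether a segment has expired.
    """
    rows = []
    for segment in sorted(Path(log_dir).rglob("*.log")):
        stamp = datetime.fromtimestamp(segment.stat().st_mtime)
        rows.append((segment.name, stamp, eligible_for_deletion_at(stamp, retention_hours)))
    return rows


if __name__ == "__main__":
    # Hypothetical date; restart at 5 PM with a 2-hour retention setting.
    restart = datetime(2016, 1, 1, 17, 0)
    print(eligible_for_deletion_at(restart, 2))  # prints 2016-01-01 19:00:00
```

Running `report_segments` against a broker's data directory right after a restart should show whether the file timestamps were reset, which is the check that uncovered the problem here.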


2 REPLIES


The Kafka cluster has three nodes, and the data size on one of them is more than 3.9 TB. I set a 2 TB limit (log.retention.bytes:2199023255552) for all Kafka log data, but this setting does not take effect.
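
One thing that may be worth checking here: as far as I understand the Kafka documentation, log.retention.bytes is applied to each partition's log, not to the total disk usage of a broker. The sketch below only does the arithmetic under that assumption. The partition count is hypothetical, and the result is a rough bound that ignores segment granularity.

```python
RETENTION_BYTES = 2199023255552  # log.retention.bytes from the settings above


def to_tib(num_bytes: int) -> float:
    """Convert a byte count to tebibytes (1 TiB = 2**40 bytes)."""
    return num_bytes / 2**40


def rough_node_limit(retention_bytes: int, partitions_on_node: int) -> int:
    """Rough upper bound on one node's log size if the limit applies per partition."""
    return retention_bytes * partitions_on_node


if __name__ == "__main__":
    print(to_tib(RETENTION_BYTES))  # prints 2.0
    # Hypothetical: two partitions hosted on the node.
    print(to_tib(rough_node_limit(RETENTION_BYTES, 2)))  # prints 4.0
```

Under that reading, a node holding two or more partitions could exceed 3.9 TB without the size limit ever being reached.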
