Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Kafka does not delete expired data automatically.

Hi,

The version of HDP is HDP-2.4.2.0.

Below is the list of Kafka settings:

log.roll.hours:96

log.retention.hours:48

log.cleanup.interval.mins:10

log.retention.bytes:2199023255552

offsets.retention.check.interval.ms:600000

offsets.retention.minutes:86400000

log.retention.check.interval.ms:300000

However, I can still find data that was generated more than two days ago, and no data is being deleted automatically.

Can anyone help me?

1 ACCEPTED SOLUTION

I finally found the cause by checking the timestamps of the log files. When we restarted the Kafka service, each log file's create_time was reset to the current time, and the expiry time is calculated from that create time plus the configured retention period.

For example, if I restart all Kafka services at 5 PM and the retention time is set to 2 hours, data will start being deleted at 7 PM.
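
The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not Kafka's actual cleaner code; the function names and the calendar date are hypothetical, and the sketch assumes expiry is simply the file's timestamp plus the retention period.

```python
from datetime import datetime, timedelta


def earliest_deletion_time(file_time: datetime, retention_hours: float) -> datetime:
    """Return the moment a log file becomes eligible for time-based deletion.

    Assumption: expiry = file timestamp + retention period.
    """
    return file_time + timedelta(hours=retention_hours)


def is_expired(file_time: datetime, retention_hours: float, now: datetime) -> bool:
    """True once 'now' has reached the computed expiry time."""
    return now >= earliest_deletion_time(file_time, retention_hours)


# Hypothetical date: services restarted at 5 PM, so the file timestamp is reset to 5 PM.
restart = datetime(2016, 1, 1, 17, 0)
print(earliest_deletion_time(restart, 2))  # 2016-01-01 19:00:00
```

With the settings from the question (log.retention.hours:48), the same calculation suggests that data would not be removed until 48 hours after the most recent restart, which would match what was observed.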


2 REPLIES

The Kafka cluster has three nodes, and the data on one of them exceeds 3.9 TB. I set the limit to 2 TB (log.retention.bytes:2199023255552) for all Kafka log data, but this setting does not take effect.
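
One thing that may be worth checking here: as far as I know, log.retention.bytes is applied to each partition's log rather than to the total data on a node, so a node holding many partitions can exceed the configured value without any single partition doing so. A minimal sketch for comparing per-partition directory sizes against the limit is below. The helper names are hypothetical, and it assumes the usual layout of one subdirectory per partition under the Kafka log directory.

```python
import os


def partition_sizes(log_dir: str) -> dict:
    """Map each partition subdirectory of log_dir to the total bytes of its files."""
    sizes = {}
    for entry in os.scandir(log_dir):
        if entry.is_dir():
            total = 0
            for f in os.scandir(entry.path):
                if f.is_file():
                    total += f.stat().st_size
            sizes[entry.name] = total
    return sizes


def over_limit(sizes: dict, retention_bytes: int) -> dict:
    """Return only the partitions whose size exceeds retention_bytes."""
    return {name: size for name, size in sizes.items() if size > retention_bytes}


# Usage (the path is a placeholder; use the directory from your log.dirs setting):
# print(over_limit(partition_sizes("/kafka-logs"), 2199023255552))
```

If this returns an empty result while the node as a whole holds more than 2 TB, that would be consistent with the limit being evaluated per partition.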
