Kafka does not delete expired data automatically.

Explorer

Hi,

The HDP version is HDP-2.4.2.0.

Below is the list of Kafka settings:

log.roll.hours:96

log.retention.hours:48

log.cleanup.interval.mins:10

log.retention.bytes:2199023255552

offsets.retention.check.interval.ms:600000

offsets.retention.minutes:86400000

log.retention.check.interval.ms:300000
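To make the values above easier to compare, here is a minimal Python sketch that converts a few of them into friendlier units. The `describe` helper is hypothetical and simply reads the unit from the suffix of each setting name; it is not part of Kafka.

```python
# Values copied from the settings listed in this post.
settings = {
    "log.roll.hours": 96,
    "log.retention.hours": 48,
    "log.retention.bytes": 2199023255552,
    "log.retention.check.interval.ms": 300000,
}


def describe(name, value):
    """Render one setting in a friendlier unit, chosen from the name suffix."""
    if name.endswith(".hours"):
        return f"{value} h = {value / 24:g} days"
    if name.endswith(".ms"):
        return f"{value} ms = {value / 60000:g} min"
    if name.endswith(".bytes"):
        return f"{value} B = {value / 1024**4:g} TiB"
    return str(value)


for name, value in settings.items():
    print(f"{name}: {describe(name, value)}")
```

So the intended retention is 2 days or 2 TiB, with a retention check every 5 minutes.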

But I can still find data that was generated more than two days ago, and no data is deleted automatically.

Could anyone help me?

1 ACCEPTED SOLUTION

Explorer

I finally found the cause by checking the timestamps of the log files. When the Kafka service is restarted, the log files' create_time is reset to the current time, and the expiry time is calculated from that create time.

For example: if I restart all Kafka services at 5 PM and the retention time is set to 2 hours, the data will start to be deleted at 7 PM.
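The arithmetic in that example can be sketched in Python as follows. This is only an illustration of the behaviour described in this answer, assuming the retention clock restarts with the service; the function name `deletion_window` and the date are hypothetical. The upper end of the window assumes deletion happens at the first retention check after expiry, using the `log.retention.check.interval.ms` value from the question.

```python
from datetime import datetime, timedelta


def deletion_window(restart_time, retention_hours, check_interval_ms):
    """Return (earliest, latest) times at which old data should start being deleted.

    Assumes the retention clock restarts at `restart_time`, as observed in this
    thread, and that deletion runs at the first periodic check after expiry.
    """
    earliest = restart_time + timedelta(hours=retention_hours)
    latest = earliest + timedelta(milliseconds=check_interval_ms)
    return earliest, latest


# Restart at 5 PM (the date is an arbitrary placeholder), 2-hour retention,
# 5-minute check interval.
restart = datetime(2016, 1, 1, 17, 0)
earliest, latest = deletion_window(restart, 2, 300000)
print(earliest.strftime("%H:%M"), latest.strftime("%H:%M"))  # prints "19:00 19:05"
```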


2 REPLIES

Explorer

The Kafka cluster has three nodes, and the data on one of them is larger than 3.9 TB. I set the limit to 2 TB (log.retention.bytes:2199023255552) for all Kafka log data, but this setting does not take effect.
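One possible explanation, not confirmed in this thread: Kafka applies `log.retention.bytes` to each partition log rather than to the broker as a whole, so the total on one node can legitimately exceed the configured value. The sketch below is a rough ceiling only, ignoring the extra space taken by active segments; the function name and the partition count of 10 are hypothetical, since the thread does not say how many partitions each node hosts.

```python
RETENTION_BYTES = 2199023255552  # log.retention.bytes from this thread (2 TiB)


def max_retained_bytes(partitions_on_broker, retention_bytes=RETENTION_BYTES):
    """Rough upper bound on retained data for one broker.

    Assumes the size limit is enforced per partition log, so the bound
    scales with the number of partition replicas hosted on the broker.
    """
    return partitions_on_broker * retention_bytes


# Hypothetical broker hosting 10 partition replicas.
bound = max_retained_bytes(10)
print(f"{bound / 1024**4:g} TiB")  # prints "20 TiB"
```

Under that assumption, a node holding 3.9 TB would still be well under its size limit, and only the time-based retention would be expected to remove data.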
