Kafka does not delete expired data automatically.
Labels: Apache Kafka
Created ‎03-01-2017 03:19 AM
Hi,
The HDP version is HDP-2.4.2.0.
Below is the list of Kafka settings:
log.roll.hours:96
log.retention.hours:48
log.cleanup.interval.mins:10
log.retention.bytes:2199023255552
offsets.retention.check.interval.ms:600000
offsets.retention.minutes:86400000
log.retention.check.interval.ms:300000
But I can still find data that was generated more than two days ago, and no data is being deleted automatically.
Can anyone help me?
Created ‎03-01-2017 08:27 AM
I finally found the cause by checking the timestamps of the log files. When the Kafka service is restarted, each log file's create_time is reset to the current time, and the expiry time is calculated from that create time plus the retention setting.
For example: if I restart all Kafka services at 5 PM and the retention time is set to 2 hours, the data will start to be deleted at 7 PM.
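The arithmetic in the example above can be sketched as follows. This is only a minimal illustration of the behaviour I observed, not Kafka's actual code: the function name and the calendar date are made up for the example, and it ignores the delay added by the periodic retention check (log.retention.check.interval.ms).

```python
from datetime import datetime, timedelta


def earliest_deletion_time(file_timestamp, retention):
    """Return the earliest moment a log file becomes eligible for deletion.

    Hypothetical helper: models the observed rule
    "expiry = file timestamp + retention setting".
    """
    return file_timestamp + retention


# Assumed date for illustration; the restart at 5 PM resets the file timestamp.
restart_time = datetime(2017, 3, 1, 17, 0)
retention = timedelta(hours=2)

print(earliest_deletion_time(restart_time, retention))  # 2017-03-01 19:00:00
```

So after every restart the retention clock effectively starts again, which is why data older than log.retention.hours can still be on disk.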
Created ‎03-01-2017 03:25 AM
The Kafka cluster has three nodes, and the data on one of the nodes is more than 3.9T. I set a limit of 2T (log.retention.bytes:2199023255552) for all Kafka log data, but that setting does not take effect.
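One thing that may be worth checking here: as far as I know, log.retention.bytes is applied to each partition's log separately, not to the total data on a broker. Under that assumption, a node can hold far more than 2T without the size limit ever triggering. A rough sketch of the arithmetic, where the partition count is a hypothetical number and not taken from this cluster:

```python
# Value of log.retention.bytes from the settings in the question.
RETENTION_BYTES = 2199023255552

TIB = 1024 ** 4  # bytes in one tebibyte


def worst_case_broker_bytes(retention_bytes, partitions_on_broker):
    """Upper bound on retained data if the limit applies per partition.

    Hypothetical helper for illustration; it ignores the active segment,
    which is not deleted until it has been rolled.
    """
    return retention_bytes * partitions_on_broker


# Assumed: 4 partitions hosted on the node (made-up figure).
partitions = 4

print(RETENTION_BYTES / TIB)                                       # 2.0
print(worst_case_broker_bytes(RETENTION_BYTES, partitions) / TIB)  # 8.0
```

With these assumed numbers the node could grow to 8T before size-based retention removes anything, so 3.9T on one node would not by itself show that the setting is being ignored.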
