Created on 07-18-2017 11:20 PM - edited 09-16-2022 04:57 AM
Hi,
We are using a Flume file channel, and it runs out of disk space within a few days. How do we manage the size of a file channel's data directory? Is there a property we can use to purge old data?
I don't want to perform a manual clean-up of the file channel directories and would prefer something that will be part of the Flume Agent configuration itself.
Please advise.
Thanks,
Mari
Created 09-15-2017 06:02 AM
Created 09-15-2017 09:00 AM
Created 09-15-2017 09:16 AM
Created 09-19-2017 02:33 AM
Hi,
We only need to keep files from the last month. Older data files should be deleted on a daily basis.
What is the proper solution?
Created 09-19-2017 01:41 PM
Created on 09-20-2017 04:06 AM - edited 09-20-2017 04:08 AM
I'm not sure you understood me correctly.
Let me describe it one more time:
1. We configured the Flume agents (Cloudera -> Flume01 -> instance -> configuration):
source.type=avro
source.bind=<our_hostname>
source.port=<our_port>
source.interceptor=timestamp_interceptor
source.interceptor.timestamp_interceptor=timestamp
channels.type=memory
channels.capacity=10000
channels.transactionCapacity=10000
sinks.type=hdfs
hdfs.path=/client/project/log/%Y-%m-%d
hdfs.fileType=DataStream
hdfs.rollSize=0
hdfs.rollCount=0
hdfs.rollInterval=0
hdfs.batchSize=100
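
For context, the %Y-%m-%d escapes in hdfs.path are filled in from each event's timestamp header (epoch milliseconds, set here by the timestamp interceptor), which is why the sink writes one directory per day. Below is a rough Python illustration of that mapping, not Flume's actual code. The function name `resolve_hdfs_path` is hypothetical, only the %Y, %m and %d escapes are covered, and UTC is assumed to keep the example reproducible (Flume resolves the path in the agent's local time zone unless hdfs.timeZone is set).

```python
from datetime import datetime, timezone


def resolve_hdfs_path(template, timestamp_ms, tz=timezone.utc):
    """Expand the %Y/%m/%d escapes in `template` from an epoch-millisecond timestamp."""
    # The timestamp header is in milliseconds; datetime expects seconds.
    when = datetime.fromtimestamp(timestamp_ms / 1000, tz=tz)
    return when.strftime(template)


# 1505908800000 ms is 2017-09-20 12:00:00 UTC.
print(resolve_hdfs_path("/client/project/log/%Y-%m-%d", 1505908800000))
# prints /client/project/log/2017-09-20
```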
So our application logs look like this:
>hdfs dfs -ls /client/project/log/
/client/project/log/2017-07-06
/client/project/log/2017-07-07
/client/project/log/2017-07-08
/client/project/log/2017-07-09
/client/project/log/2017-07-10
/client/project/log/2017-07-11
/client/project/log/2017-07-12
/client/project/log/2017-07-13
/client/project/log/2017-07-14
/client/project/log/2017-07-15
/client/project/log/2017-07-16
...
/client/project/log/2017-09-16
/client/project/log/2017-09-17
/client/project/log/2017-09-18
/client/project/log/2017-09-19
/client/project/log/2017-09-20
We want to keep only the folders from the last month.
So the question is: how do we configure cleanup for the sink's output?
Created 09-20-2017 12:24 PM
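As far as I know, the Flume HDFS sink has no retention setting of its own, so dated directories like the ones listed above are usually pruned by a separate scheduled job (cron, Oozie, or similar). Here is a minimal Python sketch of the selection step, assuming the directory names always follow the %Y-%m-%d pattern from hdfs.path. The function name `expired_dirs` and the 30-day window are illustrative choices, not anything provided by Flume or Hadoop.

```python
from datetime import date, datetime, timedelta


def expired_dirs(paths, today, keep_days=30):
    """Return the paths whose last component is a YYYY-MM-DD date older than the cutoff."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for path in paths:
        name = path.rstrip("/").rsplit("/", 1)[-1]
        try:
            day = datetime.strptime(name, "%Y-%m-%d").date()
        except ValueError:
            continue  # not a dated directory, so leave it alone
        if day < cutoff:
            expired.append(path)
    return expired


if __name__ == "__main__":
    # Sample paths taken from the listing in the post; a real job would
    # read them from the output of `hdfs dfs -ls /client/project/log/`.
    sample = [
        "/client/project/log/2017-07-06",
        "/client/project/log/2017-09-19",
        "/client/project/log/2017-09-20",
    ]
    for path in expired_dirs(sample, date.today()):
        # Print the command instead of running it, so it can be reviewed first.
        print("hdfs dfs -rm -r " + path)
```

The script only prints the `hdfs dfs -rm -r` commands, so the list can be checked before anything is actually deleted. Run once a day, it would keep roughly one month of folders.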