Explorer
Posts: 16
Registered: ‎05-12-2016
Accepted Solution

How to put files in flume spooldir one by one

I am using the Flume spooldir source to put files into HDFS, but I am ending up with many small files in HDFS. I thought of using batch size and roll interval, but I don't want to depend on size and interval. So I decided to push files into the Flume spooldir one at a time. How can I do this? Please help.

Champion
Posts: 776
Registered: ‎05-16-2016

Re: How to put files in flume spooldir one by one

Try using the timestamp interceptor. 

Explorer
Posts: 16
Registered: ‎05-12-2016

Re: How to put files in flume spooldir one by one

Do you have any examples?
Champion
Posts: 776
Registered: ‎05-16-2016

Re: How to put files in flume spooldir one by one

You can refer to the HDFS sink's timestamp escape sequences; there are a lot of them, and you can use whichever ones fit your case.

Example:

You can use HDFS bucketing, for example one bucket per hour.

agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%{aa}/%y/%m/%d/%H/%M
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
agent1.sinks.hdfsSinks.hdfs.roundValue = 1
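
For context, here is a rough sketch of how the timestamp interceptor and the hourly bucketing might fit together in one agent. The time escape sequences in hdfs.path need a timestamp header on each event, which is what the interceptor adds. The component names (spoolSrc, memCh), the spool directory, and the HDFS path are placeholders I made up for illustration, not values from this thread; adjust them to your setup.

```properties
# Minimal sketch, assuming a single agent named agent1.
# spoolSrc, memCh and the directory paths are hypothetical placeholders.
agent1.sources = spoolSrc
agent1.channels = memCh
agent1.sinks = hdfsSinks

# Spooling directory source: picks up files dropped into spoolDir.
agent1.sources.spoolSrc.type = spooldir
agent1.sources.spoolSrc.spoolDir = /var/flume/spool
agent1.sources.spoolSrc.channels = memCh

# Timestamp interceptor: adds the timestamp header that %y/%m/%d/%H rely on.
agent1.sources.spoolSrc.interceptors = ts
agent1.sources.spoolSrc.interceptors.ts.type = timestamp

agent1.channels.memCh.type = memory

# HDFS sink with one bucket (directory) per hour.
agent1.sinks.hdfsSinks.type = hdfs
agent1.sinks.hdfsSinks.channel = memCh
agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%y/%m/%d/%H
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundValue = 1
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
```

With rounding set to one hour, events are grouped into the directory for the hour they arrived in, so the path changes once per hour rather than per file.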