
How to put files in flume spooldir one by one

Explorer

I am using a Flume spooldir source to put files into HDFS, but I am ending up with a lot of small files in HDFS. I thought of using batch size and roll interval, but I don't want to depend on size and interval. So I decided to push files into the Flume spooldir one at a time. How can I do this? Please help.
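(For context, the size- and interval-based knobs mentioned above are presumably the HDFS sink roll settings; a minimal sketch follows, with agent1/hdfsSinks as placeholder names, not the poster's actual config:)

# Roll a new HDFS file every 10 minutes; setting size- and count-based
# rolling to 0 disables them.
agent1.sinks.hdfsSinks.hdfs.rollInterval = 600
agent1.sinks.hdfsSinks.hdfs.rollSize = 0
agent1.sinks.hdfsSinks.hdfs.rollCount = 0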

1 ACCEPTED SOLUTION

Champion

You can refer to the HDFS sink timestamp escape sequences; there are a lot of them you can use as needed.

 

For example, you can use HDFS bucketing with one bucket per hour:

agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%{aa}/%y/%m/%d/%H/%M
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
agent1.sinks.hdfsSinks.hdfs.roundValue = 1
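For instance (assuming an event carrying a header aa=logs and a timestamp of 2017-06-01 11:54, both hypothetical), the escaped path above would resolve to something like the following, so every event from that hour lands in the same directory:

/data/flume/logs/17/06/01/11/00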


3 REPLIES

Champion

Try using the timestamp interceptor. 
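(A minimal sketch of what that could look like on a spooldir source; agent1, spoolSrc, ts, and the spool path are placeholder names:)

# Attach a timestamp interceptor to the spooling-directory source so the
# HDFS sink can use time-based escape sequences in hdfs.path.
agent1.sources.spoolSrc.type = spooldir
agent1.sources.spoolSrc.spoolDir = /var/flume/spool
agent1.sources.spoolSrc.interceptors = ts
agent1.sources.spoolSrc.interceptors.ts.type = timestamp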

Explorer
Do you have any examples?

Champion

You can refer to the HDFS sink timestamp escape sequences; there are a lot of them you can use as needed.

 

For example, you can use HDFS bucketing with one bucket per hour:

agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%{aa}/%y/%m/%d/%H/%M
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
agent1.sinks.hdfsSinks.hdfs.roundValue = 1