How to put files in Flume spooldir one by one

Explorer

I am using the Flume spooling directory source to put files into HDFS, but I am getting a lot of small files in HDFS. I thought of tuning the batch size and roll interval, but I don't want to depend on size and interval. So I decided to push files into the Flume spool directory one at a time. How can I do this? Please help.


3 REPLIES

Champion

Try using the timestamp interceptor. 
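In Flume configuration terms, that means attaching an interceptor of type timestamp to the spooling directory source. A minimal sketch, assuming an agent named agent1 and a source named spoolSrc (placeholder names and spool path, not from this thread):

# Spooling directory source (placeholder names and path)
agent1.sources.spoolSrc.type = spooldir
agent1.sources.spoolSrc.spoolDir = /var/spool/flume

# Timestamp interceptor: stamps each event with a timestamp header
# that the HDFS sink's time-based escape sequences can resolve
agent1.sources.spoolSrc.interceptors = ts
agent1.sources.spoolSrc.interceptors.ts.type = timestamp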

Explorer
Do you have any examples?

Champion (Accepted Solution)

You can refer to the HDFS sink's timestamp escape sequences; there are a lot of them, and you can pick the ones that match the layout you need.

For example, you can use HDFS path bucketing so events are written into one directory per hour:

agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%{aa}/%y/%m/%d/%H/%M
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
agent1.sinks.hdfsSinks.hdfs.roundValue = 1
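
Putting the pieces together, here is a minimal end-to-end sketch under the same assumptions (agent1, spoolSrc, ch1, and hdfsSinks are placeholder names; the %{aa} header from the path above is omitted because it only resolves if you populate that header yourself, for example with another interceptor):

# Name the components (placeholder names)
agent1.sources = spoolSrc
agent1.channels = ch1
agent1.sinks = hdfsSinks

# Spooling directory source with a timestamp interceptor
agent1.sources.spoolSrc.type = spooldir
agent1.sources.spoolSrc.spoolDir = /var/spool/flume
agent1.sources.spoolSrc.channels = ch1
agent1.sources.spoolSrc.interceptors = ts
agent1.sources.spoolSrc.interceptors.ts.type = timestamp

# Durable file channel between source and sink
agent1.channels.ch1.type = file

# HDFS sink bucketed into one directory per hour
agent1.sinks.hdfsSinks.type = hdfs
agent1.sinks.hdfsSinks.channel = ch1
agent1.sinks.hdfsSinks.hdfs.path = /data/flume/%y/%m/%d/%H
agent1.sinks.hdfsSinks.hdfs.round = true
agent1.sinks.hdfsSinks.hdfs.roundUnit = hour
agent1.sinks.hdfsSinks.hdfs.roundValue = 1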