
Create new file after reaching a particular size in Flume


I am using flume-ng to transfer data from an application server to HDFS in a cluster. My requirement is: when the file currently being written in HDFS reaches 2 KB, close it and start ingesting data into a new file. The configuration below doesn't do this; it creates a new file whenever a new event is generated. Please tell me how to change the configuration so that a file in HDFS is closed when it reaches 2 KB and ingestion continues in a new one.


# define serverAgent-agent

serverAgent.sources = src
serverAgent.channels = ch
serverAgent.sinks = hdfsOut

# configure source
serverAgent.sources.src.type = avro
serverAgent.sources.src.bind = 192.168.3.23
serverAgent.sources.src.port = 54249
serverAgent.sources.src.channels = ch


# configure channel

serverAgent.channels.ch.type = memory
serverAgent.channels.ch.capacity = 100

# configure sinks

serverAgent.sinks.hdfsOut.type = hdfs
serverAgent.sinks.hdfsOut.channel = ch
serverAgent.sinks.hdfsOut.hdfs.path = hdfs://192.168.3.23:8020/data

serverAgent.sinks.hdfsOut.hdfs.fileType = DataStream
serverAgent.sinks.hdfsOut.hdfs.writeFormat = Text
serverAgent.sinks.hdfsOut.hdfs.rollSize = 2048
serverAgent.sinks.hdfsOut.hdfs.rollCount = 0
serverAgent.sinks.hdfsOut.hdfs.rollInterval = 300
#serverAgent.sinks.hdfsOut.hdfs.idleTimeout = 3600

2 REPLIES

Re: Create new file after reaching a particular size in Flume

What is the average size of each event from the source? And what are the sizes of the files being created on HDFS?

Re: Create new file after reaching a particular size in Flume

Please set "serverAgent.sinks.hdfsOut.hdfs.rollInterval = 0" and give it a try.

 

serverAgent.sinks.hdfsOut.hdfs.rollSize = 2048
serverAgent.sinks.hdfsOut.hdfs.rollCount = 0
serverAgent.sinks.hdfsOut.hdfs.rollInterval = 0
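
For context, here is a sketch of the full sink section with these three values applied and each roll trigger commented. The paths and values are copied from the question. The notes on hdfs.idleTimeout and hdfs.minBlockReplicas are assumptions about what may also be relevant, not something confirmed in this thread, so both lines are left commented out.

```properties
# HDFS sink rolling by size only (sketch based on the config in the question)
serverAgent.sinks.hdfsOut.type = hdfs
serverAgent.sinks.hdfsOut.channel = ch
serverAgent.sinks.hdfsOut.hdfs.path = hdfs://192.168.3.23:8020/data
serverAgent.sinks.hdfsOut.hdfs.fileType = DataStream
serverAgent.sinks.hdfsOut.hdfs.writeFormat = Text

# Roll once the open file has reached 2048 bytes
serverAgent.sinks.hdfsOut.hdfs.rollSize = 2048
# 0 = never roll based on the number of events written
serverAgent.sinks.hdfsOut.hdfs.rollCount = 0
# 0 = never roll based on elapsed time (the question used 300 seconds)
serverAgent.sinks.hdfsOut.hdfs.rollInterval = 0

# Assumption: with time-based rolling off, a file that stays below rollSize
# remains open until more data arrives; idleTimeout closes inactive files.
#serverAgent.sinks.hdfsOut.hdfs.idleTimeout = 3600

# Assumption: if files still roll on every event, under-replicated blocks
# may be triggering the roll; some setups pin this to 1 as a workaround.
#serverAgent.sinks.hdfsOut.hdfs.minBlockReplicas = 1
```

Note that rollSize is only a threshold: if a single event is already larger than 2048 bytes, each file will likely still hold just one event, which is why the average event size asked about above matters.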

 

Best Regards,
Bommuraj
