Flume Spooling Directory Is Not Working Correctly!

New Contributor

Hi everyone,

I configured a Flume spooling directory source to write XML files from NFS to HDFS. Everything works fine while there are XML files in the NFS directory. But once all the XML files have been imported to HDFS, Flume hangs and does not import new XML files. Does anyone have an idea?

Note: the NFS directory is mounted on Linux, and I'm using this mounted directory.

Here is my flume configuration:

# Define a source, a channel, and a sink

agent.sources = src1

agent.channels = chan1

agent.sinks = sink1

# Set the source type to Spooling Directory and set the directory

# location to /home/mountfolder/

agent.sources.src1.type = spooldir

agent.sources.src1.spoolDir = /home/mountfolder/

agent.sources.src1.basenameHeader = true

agent.sources.src1.deletePolicy = immediate

# Configure the channel as simple in-memory queue

agent.channels.chan1.type = memory

agent.channels.chan1.capacity = 10000

# Define the HDFS sink and set its path to your target HDFS directory

agent.sinks.sink1.type = hdfs

agent.sinks.sink1.hdfs.path = hdfs://myserver:8020/flume_sink/

agent.sinks.sink1.hdfs.fileType = DataStream

agent.sinks.sink1.hdfs.rollCount = 100

agent.sinks.sink1.hdfs.rollSize = 0

agent.sinks.sink1.hdfs.idleTimeout = 60

# Disable rollover functionality as we want to keep the original files

agent.sinks.sink1.rollInterval = 0

# Set the files to their original name

agent.sinks.sink1.hdfs.filePrefix = %{basename}

agent.sinks.sink1.hdfs.fileSuffix = .xml

# Connect source and sink

agent.sources.src1.channels = chan1

agent.sinks.sink1.channel = chan1

Re: Flume Spooling Directory Is Not Working Correctly!

Expert Contributor

Hi,

Can you please increase the following attribute values?

agent.sinks.sink1.hdfs.rollCount = 100 (number of events): please increase it to 1000.

agent.sinks.sink1.hdfs.rollSize = 0 (size in bytes): set it to 5000.

agent.sinks.sink1.rollInterval = 0 (interval in seconds): set it to 500.

Please try the values above; I hope they work.
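
As a rough sketch, the three roll settings with the values suggested above would look like the fragment below. One assumption to flag: the Flume HDFS sink reads its roll settings under the hdfs. prefix, so the rollInterval line is written here as hdfs.rollInterval. Without the prefix, the sink most likely ignores the line and keeps its default roll interval. The values are only a starting point to try, not a confirmed fix for the hang.

```
# Roll the HDFS file after 1000 events
agent.sinks.sink1.hdfs.rollCount = 1000
# Roll the HDFS file after roughly 5000 bytes
agent.sinks.sink1.hdfs.rollSize = 5000
# Roll the HDFS file after 500 seconds (assumed key: note the hdfs. prefix)
agent.sinks.sink1.hdfs.rollInterval = 500
```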

Thanks,

Mahesh
