Flume error while testing spooldir source

Explorer

While testing the Flume spooldir source, I am getting this error in the Flume logs:

28 Jun 2017 15:06:44,560 ERROR [pool-7-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:280) - FATAL: 
Spool Directory source spooldir-source: { spoolDir: /data/src/input }: Uncaught exception in SpoolDirectorySource thread. 
Restart or reconfigure Flume to continue processing. 
java.lang.IllegalStateException: File has changed size since being read: /home/userapp/test/newfile_5_1.csv.gz 
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.retireCurrentFile(ReliableSpoolingFileEventReader.java:410) 
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:326) 
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)

Any pointers?

1 ACCEPTED SOLUTION

Rising Star

@rpulluru

This error occurs in either of the following cases:

  • A file is written to after being placed into the spooling directory.
  • A file name is reused at a later time.

In both cases, Flume prints an error to its log file and stops processing.

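If you are copying files into your /data/src/input directory, change the operation to 'mv'. A copy writes the file in place over time, so Flume can pick it up before the copy finishes, whereas a 'mv' within the same filesystem is a single rename. Alternatively, copy each file into the spooling directory with a .tmp extension and then 'mv' the .tmp file to its actual name within the same directory. Add the following line to flume.conf so that Flume ignores .tmp files in the spooling directory: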

Agent1.sources.spooldir-source.ignorePattern=^.*\.tmp$
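
If the files are delivered by a script, the copy-then-rename approach could look roughly like the sketch below. This is only an illustration, not part of Flume: the function name deliver_to_spool_dir and the tmp_suffix parameter are made up for the example. It assumes the .tmp suffix matches the ignorePattern above, and that the temporary and final names are in the same directory so the rename is a single operation.

```python
import os
import shutil
import tempfile


def deliver_to_spool_dir(src_path, spool_dir, tmp_suffix=".tmp"):
    """Copy src_path into spool_dir under a temporary name, then rename it.

    Hypothetical helper: Flume is assumed to ignore names ending in
    tmp_suffix (see ignorePattern), so it only sees the file once the
    copy has finished and the rename has happened.
    """
    final_path = os.path.join(spool_dir, os.path.basename(src_path))
    tmp_path = final_path + tmp_suffix

    # Refuse to overwrite a file that is still waiting in the directory.
    # This does not detect names Flume has already processed and renamed.
    if os.path.exists(final_path):
        raise FileExistsError(final_path)

    shutil.copyfile(src_path, tmp_path)  # slow part, hidden behind .tmp
    os.replace(tmp_path, final_path)     # rename within the same directory
    return final_path


if __name__ == "__main__":
    # Demo with throwaway directories instead of a real spooling directory.
    with tempfile.TemporaryDirectory() as work:
        spool = os.path.join(work, "input")
        os.mkdir(spool)
        src = os.path.join(work, "sample.csv")
        with open(src, "w") as fh:
            fh.write("a,b\n1,2\n")
        print(deliver_to_spool_dir(src, spool))
```

To avoid the second cause (file name reuse), the delivering script would also need to generate unique file names, for example by including a timestamp; the sketch does not do that.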

Explorer

@Shashank Chandhok Thanks. I was copying the files. Moving them helped.