Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Flume error while testing spooldir source

New Member

While testing the Flume spooldir source, I am getting this error in the Flume logs.

28 Jun 2017 15:06:44,560 ERROR [pool-7-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:280) - FATAL: 
Spool Directory source spooldir-source: { spoolDir: /data/src/input }: Uncaught exception in SpoolDirectorySource thread. 
Restart or reconfigure Flume to continue processing. 
java.lang.IllegalStateException: File has changed size since being read: /home/userapp/test/newfile_5_1.csv.gz 
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.retireCurrentFile(ReliableSpoolingFileEventReader.java:410) 
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:326) 
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)

Any pointers?

1 ACCEPTED SOLUTION

Rising Star

@rpulluru

This error occurs in either of the following cases:

  • A file is written to after being placed into the spooling directory. Flume prints an error to its log file and stops processing.
  • A file name is reused at a later time. Flume prints an error to its log file and stops processing.

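If you are copying files into your /data/src/input directory, change the operation to 'mv'. A move within the same filesystem is a rename, so the file shows up in the directory already complete instead of growing while Flume reads it. Alternatively, copy the file into the spooling directory with a .tmp suffix and then 'mv' the .tmp file to its actual name in the same directory. Add the following line to flume.conf so that Flume ignores .tmp files in spoolDir: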

Agent1.sources.spooldir-source.ignorePattern=^.*\.tmp$
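
If the files are delivered by a script rather than by hand, the copy-as-.tmp-then-rename step could look roughly like the sketch below. This is only an illustration: the function name deliver_to_spool_dir is made up for this example, and it assumes the ignorePattern line above is already in place so that Flume skips the .tmp name while the copy is running.

```python
import os
import shutil


def deliver_to_spool_dir(src_path, spool_dir):
    """Copy src_path into spool_dir under a .tmp name, then rename it into place.

    Hypothetical helper for illustration; it is not part of Flume.
    """
    final_path = os.path.join(spool_dir, os.path.basename(src_path))
    tmp_path = final_path + ".tmp"

    # Refuse to overwrite a file that is still waiting in the spool directory.
    if os.path.exists(final_path):
        raise FileExistsError(final_path)

    # Slow part: while the copy runs, only the .tmp name exists,
    # and that name is the one the ignorePattern excludes.
    shutil.copyfile(src_path, tmp_path)

    # Fast part: a rename inside one directory is atomic on POSIX filesystems,
    # so the final name only ever appears with its complete contents.
    os.rename(tmp_path, final_path)
    return final_path
```

Calling deliver_to_spool_dir("/some/staging/newfile_5_1.csv.gz", "/data/src/input") (the staging path is a placeholder) would leave only the finished file under its real name. The existence check covers only files still sitting in the directory, so file names should stay unique over time as well.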


2 REPLIES

New Member

@Shashank Chandhok Thanks. I was copying the files. Moving them helped.