<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Some questions with Flume in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/87768#M45405</link>
    <description>Please create a new thread for distinct questions, instead of bumping an&lt;BR /&gt;older, resolved thread.&lt;BR /&gt;&lt;BR /&gt;As to your question, the error is clear as is the documentation, quoted&lt;BR /&gt;below:&lt;BR /&gt;&lt;BR /&gt;"""&lt;BR /&gt;Spooling Directory Source&lt;BR /&gt;&lt;BR /&gt;This source lets you ingest data by placing files to be ingested into a&lt;BR /&gt;“spooling” directory on disk. This source will watch the specified&lt;BR /&gt;directory for new files, and will parse events out of new files as they&lt;BR /&gt;appear. The event parsing logic is pluggable. After a given file has been&lt;BR /&gt;fully read into the channel, it is renamed to indicate completion (or&lt;BR /&gt;optionally deleted).&lt;BR /&gt;&lt;BR /&gt;Unlike the Exec source, this source is reliable and will not miss data,&lt;BR /&gt;even if Flume is restarted or killed. In exchange for this reliability,&lt;BR /&gt;only immutable, uniquely-named files must be dropped into the spooling&lt;BR /&gt;directory. 
Flume tries to detect these problem conditions and will fail&lt;BR /&gt;loudly if they are violated:&lt;BR /&gt;&lt;BR /&gt;If a file is written to after being placed into the spooling directory,&lt;BR /&gt;Flume will print an error to its log file and stop processing.&lt;BR /&gt;If a file name is reused at a later time, Flume will print an error to its&lt;BR /&gt;log file and stop processing.&lt;BR /&gt;""" -&lt;BR /&gt;&lt;A href="https://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#spooling-directory-source" target="_blank"&gt;https://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#spooling-directory-source&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;It appears that you can get around this by using ExecSource with a script&lt;BR /&gt;or command that reads the files, but you'll have to sacrifice reliability.&lt;BR /&gt;It may instead be worth investing in an approach that makes filenames&lt;BR /&gt;unique (`uuidgen` named softlinks in another folder, etc.)&lt;BR /&gt;</description>
    <pubDate>Thu, 14 Mar 2019 01:52:51 GMT</pubDate>
    <dc:creator>Harsh J</dc:creator>
    <dc:date>2019-03-14T01:52:51Z</dc:date>
    <item>
      <title>Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84221#M45398</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I want to use Flume to send a large number of files to hadoop, and I had the idea of using the spooling directory source, but I have some questions:&lt;BR /&gt;1. When sending files to hadoop, the files in the spool are not moved anywhere, which makes me wonder: if a new file arrives in the spool, how does Flume recognize the old and new files?&lt;BR /&gt;2. After Flume uploads a file to hadoop, will the files in the spool be moved to another folder? Or does Flume have a mechanism to back up files?&lt;BR /&gt;3. I know that Flume has some properties that help work with regexes, but I don't know whether Flume supports sending files to hadoop and sorting those files into regex-based directories. If so, how do I do it?&lt;BR /&gt;4. Does Flume support sending files to hadoop and categorizing them into directories based on the date sent? (I have read that part of the HDFS Sink documentation, but when I tried it, it failed.)&lt;BR /&gt;5. While using Flume to send files to hadoop, can I modify the file contents, such as adding the file name into the data stream, or changing ";" into "|"?&lt;BR /&gt;6. Can I use any API or tool to monitor Flume's file transfers to hadoop? For example, during transfer, see how many files have been transferred to hadoop, how many were submitted successfully, and how many failed.&lt;BR /&gt;7. Does Flume record transaction logs with hadoop? For example, how many files have been uploaded to hadoop, ...&lt;BR /&gt;I know I am asking a lot, but I am really confused by Flume and I really need your help. Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 14:00:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84221#M45398</guid>
      <dc:creator>AlohaDecember</dc:creator>
      <dc:date>2022-09-16T14:00:28Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84222#M45399</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please find the answers below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. When sending files to hadoop, the files in the spool are not moved anywhere, which makes me wonder if there is a new file in the spool, how does Flume recognize the old and new files?&lt;/P&gt;&lt;P&gt;Ans: Each fully ingested file is renamed in place with a suffix; see the following spooldir source property:&lt;/P&gt;&lt;P&gt;fileSuffix (default: .COMPLETED): suffix to append to completely ingested files&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. After Flume uploads a file to hadoop, will the files in the spool be moved to another folder? Or does Flume have a mechanism to back up files?&lt;/P&gt;&lt;P&gt;Ans: Same as above: the file is renamed with the suffix.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3. I know that Flume has some properties that help work with regex, but I don't know if Flume supports sending files to hadoop and sorting those files into regex-based directories? If so, how do I do it?&lt;BR /&gt;Ans: You can use an HDFS directory path containing formatting escape sequences that the HDFS sink will replace to generate the directory/file name used to store the events.&lt;/P&gt;&lt;P&gt;For example, to store files in different directories based on date:&lt;/P&gt;&lt;P&gt;hdfs.path = /flume/%Y-%m-%d&lt;/P&gt;&lt;P&gt;For more detail on the escape sequences, see the following link:&lt;/P&gt;&lt;P&gt;&lt;A href="https://flume.apache.org/FlumeUserGuide.html#hdfs-sink" target="_blank"&gt;https://flume.apache.org/FlumeUserGuide.html#hdfs-sink&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;4. Does Flume support sending files to hadoop and categorizing them into directories based on the date sent? (I have read that part in HDFS Sink but when I tried it failed)&lt;BR /&gt;Ans: If you share the configuration you are using, I can try to fix the issues with it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;5. While using Flume to send files to hadoop, can I fix the file contents such as adding file names into the data stream, or changing the ";" into "|"?&lt;BR /&gt;Ans:&lt;/P&gt;&lt;P&gt;If you just want to add the file name to the data, try the following spooldir source properties:&lt;/P&gt;&lt;P&gt;basenameHeader (default: false): whether to add a header storing the basename of the file.&lt;BR /&gt;basenameHeaderKey (default: basename): header key to use when appending the basename of the file to the event header.&lt;/P&gt;&lt;P&gt;If you want to do a regex replace, you will have to use the Search and Replace Interceptor, where you can specify the search regex and replacement string. See the following link:&lt;/P&gt;&lt;P&gt;&lt;A href="https://flume.apache.org/FlumeUserGuide.html#search-and-replace-interceptor" target="_blank"&gt;https://flume.apache.org/FlumeUserGuide.html#search-and-replace-interceptor&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;6. Can I use any API, or any tool to monitor Flume file transfer to hadoop? For example, during file transfer, see how many files have been transferred to hadoop or how many files have been successfully submitted and how many files sent to hadoop failed.&lt;BR /&gt;Ans: I am not sure anything specific is available for spooldir, but see the Monitoring section for what you can use:&lt;BR /&gt;&lt;A href="https://flume.apache.org/FlumeUserGuide.html#monitoring" target="_blank"&gt;https://flume.apache.org/FlumeUserGuide.html#monitoring&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;7. Does Flume record transaction logs with hadoop? 
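(An aside on answer 5: the Search and Replace Interceptor performs a plain regex substitution on each event body. Its effect on a ";"-delimited record can be sketched outside Flume; Python is used purely for illustration, and the sample record is made up:)

```python
import re

# A hypothetical ";"-delimited record as it might appear in a Flume event body.
record = "2018-12-24;sensor-01;42"

# Equivalent of searchPattern = ; and replaceString = | in the interceptor config.
converted = re.sub(";", "|", record)
print(converted)  # 2018-12-24|sensor-01|42
```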
For example, how many files have been uploaded to hadoop, ...&lt;BR /&gt;Ans: I don't think so, but it would need more research to see whether you can track which files have been written. You can check your spool dir for the files already sent.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Bimal&lt;/P&gt;</description>
      <pubDate>Sun, 23 Dec 2018 21:06:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84222#M45399</guid>
      <dc:creator>Bimalc</dc:creator>
      <dc:date>2018-12-23T21:06:00Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84247#M45400</link>
      <description>&lt;P&gt;Thank you very much for helping me, but I have a few follow-up questions:&lt;BR /&gt;1. If the files are not moved to another folder (as in questions 1 and 2 I mentioned), then when the folder holds too many files, for example a billion files, and the server is full, what do I have to do? Maybe I have to reconfigure with another spool folder?&lt;BR /&gt;2. This is the configuration file I wanted to mention in question 5:&lt;/P&gt;&lt;PRE&gt;# Sources, channels, and sinks are defined per
# agent name, in this case 'tier1'.
tier1.sources  = source1
tier1.channels = channel1
tier1.sinks    = sink1

# For each source, channel, and sink, set
# standard properties.
# source details
tier1.sources.source1.type     = spooldir
tier1.sources.source1.spoolDir = /data/diem
tier1.sources.source1.fileHeader = false
tier1.sources.source1.fileSuffix  = .COMPLETED
tier1.sources.source1.channels = channel1
tier1.sources.source1.interceptors = i1
tier1.sources.source1.interceptors.i1.type = regex_extractor
tier1.sources.source1.interceptors.i1.regex = \\[(.*?)\\]
tier1.sources.source1.interceptors.i1.serializers = s1
tier1.sources.source1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
tier1.sources.source1.interceptors.i1.serializers.s1.name = timestamp
tier1.sources.source1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm:ss

# channel details
tier1.channels.channel1.type   = file
tier1.channels.channel1.capacity = 200000
tier1.channels.channel1.transactionCapacity = 1000

# sink details
tier1.sinks.sink1.type         = HDFS
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.writeFormat  = Text
tier1.sinks.sink1.channel      = channel1
tier1.sinks.sink1.hdfs.path = hdfs://localhost:8020/user/cloudera/testFolder/%y-%m-%d/%H%M/%S
tier1.sinks.sink1.hdfs.round = true
tier1.sinks.sink1.hdfs.roundValue = 10
tier1.sinks.sink1.hdfs.roundUnit = minute
tier1.sinks.sink1.hdfs.rollSize = 268435456
tier1.sinks.sink1.hdfs.rollInterval = 0
tier1.sinks.sink1.hdfs.batchSize = 10000

&lt;/PRE&gt;&lt;P&gt;And this is an error in the log file&lt;/P&gt;&lt;PRE&gt;2018-12-24 11:56:03,065 ERROR org.apache.flume.sink.hdfs.HDFSEventSink: process failed
java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:251)
    at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:460)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:368)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:745)
2018-12-24 11:56:03,069 ERROR org.apache.flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:451)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:251)
    at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:460)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:368)
    ... 3 more&lt;/PRE&gt;&lt;P&gt;And once again thank you for helping me answer these questions&lt;/P&gt;</description>
      <pubDate>Mon, 24 Dec 2018 20:00:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84247#M45400</guid>
      <dc:creator>AlohaDecember</dc:creator>
      <dc:date>2018-12-24T20:00:48Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84273#M45401</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1.&amp;nbsp;&lt;SPAN&gt;If the files are not moved to another folder (like questions 1 and 2 I mentioned), when the folder has too many files, for example a billion files, and the server is full, what do I have to do? Maybe I have to reconfigure with another spool folder?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ans: You can configure Flume to delete completed files so that they do not keep accumulating in your directory:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;deletePolicy (default: never): when to delete completed files: never or immediate&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. The error you are getting is due to the regex and pattern being incorrect.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This combination works:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;tier1.sources.source1.interceptors.i1.regex = ^(?:\\[)(\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d)&lt;BR /&gt;tier1.sources.source1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm:ss&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;With the above regex we match anything starting with [dddd-dd-dd dd:dd:dd, discarding the leading [ and capturing the rest of the pattern. 
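The regex/pattern pair can be sanity-checked outside Flume with a short script (Python is used only for illustration, the sample log line is made up, and note that in the .properties file every backslash is doubled):

```python
import re
from datetime import datetime

# Single-backslash form of the Flume regex (the .properties file doubles each backslash).
pattern = re.compile(r"^(?:\[)(\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d)")

# A made-up log line of the kind the spooldir source would read.
line = "[2012-10-18 18:47:57] sample log entry"
captured = pattern.match(line).group(1)

# The serializer pattern yyyy-MM-dd HH:mm:ss corresponds to this strptime format.
parsed = datetime.strptime(captured, "%Y-%m-%d %H:%M:%S")
print(captured, parsed.year)
```

If the match came back empty here, Flume would likewise leave the timestamp header unset and the HDFS sink would fail with the NullPointerException shown earlier in the thread.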
That captured group matches the pattern&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;yyyy-MM-dd HH:mm:ss and is correctly translated to a timestamp.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So [&lt;SPAN class="pre"&gt;2012-10-18&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="pre"&gt;18:47:57] ...&amp;nbsp; will be interpreted properly and converted into a timestamp.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pre"&gt;If the regex and pattern do not map to each other, you will not get a timestamp in the header.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="pre"&gt;With your regex the captured group does not match the pattern&amp;nbsp;&lt;SPAN&gt;yyyy-MM-dd HH:mm:ss, hence the timestamp header comes out null and you get the exception.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pre"&gt;&lt;SPAN&gt;Please let me know if you have any questions.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="pre"&gt;&lt;SPAN&gt;Regards&lt;BR /&gt;Bimal&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Dec 2018 21:36:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84273#M45401</guid>
      <dc:creator>Bimalc</dc:creator>
      <dc:date>2018-12-26T21:36:39Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84280#M45402</link>
      <description>A word of caution:&lt;BR /&gt;Flume isn't really designed for transferring large files. It would be better to use oozie or an NFS gateway with cron to transfer files on a regular basis, especially if you want each file preserved in its entirety. One thing you will observe is that if Flume hits any temporary transmission errors, it will attempt to resend parts of those files, which results in duplicates (a standard and expected scenario when using Flume), so your resulting files in hdfs would contain those duplicates. Additionally, when you do have interruptions, existing hdfs files are closed and new ones are opened.&lt;BR /&gt;&lt;BR /&gt;-pd</description>
      <pubDate>Wed, 26 Dec 2018 23:50:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84280#M45402</guid>
      <dc:creator>pdvorak</dc:creator>
      <dc:date>2018-12-26T23:50:08Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84287#M45403</link>
      <description>I got it, thank you very much &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;</description>
      <pubDate>Thu, 27 Dec 2018 03:11:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/84287#M45403</guid>
      <dc:creator>AlohaDecember</dc:creator>
      <dc:date>2018-12-27T03:11:52Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/87680#M45404</link>
      <description>&lt;P&gt;Hi, I have files moving from source to destination.&lt;/P&gt;&lt;P&gt;After sending the files from the source directory, Flume marks them as completed in the source directory, e.g.:&lt;/P&gt;&lt;P&gt;test1.completed&lt;BR /&gt;test2.completed&lt;BR /&gt;test3.completed&lt;/P&gt;&lt;P&gt;Now my question is: if one more file named test1 arrives in the source folder, it throws the error below (how can I overwrite it?) and processing halts.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2019-03-13 06:46:42,620 (pool-13-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:280)] FATAL: Spool Directory source flow: { spoolDir: /opt/mount1/FlowTest/factoryFlowPath }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.&lt;BR /&gt;&lt;STRONG&gt;java.lang.IllegalStateException: File name has been re-used with different files. 
Spooling assumptions violated for&lt;/STRONG&gt; /opt/mount1/FlowTest/factoryFlowPath/IIPGSITIWS.InventecINVSH_TXT_FLOW_NOV_21_2018_14H_26m_32s.txt.COMPLETED&lt;BR /&gt;at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:463)&lt;BR /&gt;at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.retireCurrentFile(ReliableSpoolingFileEventReader.java:414)&lt;BR /&gt;at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:326)&lt;BR /&gt;at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250)&lt;BR /&gt;at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)&lt;BR /&gt;at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)&lt;BR /&gt;at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)&lt;BR /&gt;at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there any way to overwrite an existing &lt;STRONG&gt;.COMPLETED&lt;/STRONG&gt; file?&lt;/P&gt;&lt;P&gt;Please help with this.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Mar 2019 07:03:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/87680#M45404</guid>
      <dc:creator>dinakaran</dc:creator>
      <dc:date>2019-03-13T07:03:01Z</dc:date>
    </item>
    <item>
      <title>Re: Some questions with Flume</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/87768#M45405</link>
      <description>Please create a new thread for distinct questions, instead of bumping an&lt;BR /&gt;older, resolved thread.&lt;BR /&gt;&lt;BR /&gt;As to your question, the error is clear as is the documentation, quoted&lt;BR /&gt;below:&lt;BR /&gt;&lt;BR /&gt;"""&lt;BR /&gt;Spooling Directory Source&lt;BR /&gt;&lt;BR /&gt;This source lets you ingest data by placing files to be ingested into a&lt;BR /&gt;“spooling” directory on disk. This source will watch the specified&lt;BR /&gt;directory for new files, and will parse events out of new files as they&lt;BR /&gt;appear. The event parsing logic is pluggable. After a given file has been&lt;BR /&gt;fully read into the channel, it is renamed to indicate completion (or&lt;BR /&gt;optionally deleted).&lt;BR /&gt;&lt;BR /&gt;Unlike the Exec source, this source is reliable and will not miss data,&lt;BR /&gt;even if Flume is restarted or killed. In exchange for this reliability,&lt;BR /&gt;only immutable, uniquely-named files must be dropped into the spooling&lt;BR /&gt;directory. 
Flume tries to detect these problem conditions and will fail&lt;BR /&gt;loudly if they are violated:&lt;BR /&gt;&lt;BR /&gt;If a file is written to after being placed into the spooling directory,&lt;BR /&gt;Flume will print an error to its log file and stop processing.&lt;BR /&gt;If a file name is reused at a later time, Flume will print an error to its&lt;BR /&gt;log file and stop processing.&lt;BR /&gt;""" -&lt;BR /&gt;&lt;A href="https://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#spooling-directory-source" target="_blank"&gt;https://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#spooling-directory-source&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;It appears that you can get around this by using ExecSource with a script&lt;BR /&gt;or command that reads the files, but you'll have to sacrifice reliability.&lt;BR /&gt;It may instead be worth investing in an approach that makes filenames&lt;BR /&gt;unique (`uuidgen` named softlinks in another folder, etc.)&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Mar 2019 01:52:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Some-questions-with-Flume/m-p/87768#M45405</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2019-03-14T01:52:51Z</dc:date>
    </item>
  </channel>
</rss>

