<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFi ListSFTP and timestamps in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232780#M70193</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/46218/kfredrickson.html" nodeid="46218"&gt;@Karl Fredrickson&lt;/A&gt;&lt;P&gt;The normal behavior of the ListSFTP processor is as follows. On its first listing, when it has no state yet, it lists all files currently in the remote directory. Each subsequent listing picks up all new files written since the last timestamp stored in the processor's state, except for the most recent one or two files. Those one or two files are picked up in the next listing, along with any additional new files based on the updated state, again excluding the latest one or two files, and so on as the processor runs.&lt;/P&gt;&lt;P&gt;The ListSFTP processor doesn't use the file name in any way.&lt;/P&gt;&lt;P&gt;If you want files to be listed closer to the time they are written to the directory, set the processor to run more often than every 10 minutes.&lt;/P&gt;</description>
    <pubDate>Wed, 25 Oct 2017 03:33:16 GMT</pubDate>
    <dc:creator>Wynner</dc:creator>
    <dc:date>2017-10-25T03:33:16Z</dc:date>
    <item>
      <title>NiFi ListSFTP and timestamps</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232777#M70190</link>
      <description>&lt;P&gt;We have been having a problem where a ListSFTP processor in NiFi isn't producing a flowfile for every new file as expected. The ListSFTP processor is configured to run every 10 minutes and is pointed at an external SFTP server where new files are dropped daily. Everything had been working as expected until recently, when a few of the file types that we regularly download were not picked up by the ListSFTP processor. I have a couple of questions to help me understand what may be going on here:&lt;/P&gt;&lt;P&gt;1. Does NiFi only look at the "Last Modified" timestamp on the remote file and compare it to the timestamp in the processor's view state to determine if a file is "new"? (In other words, it doesn't have anything to do with whether the filename has been seen before.)&lt;/P&gt;&lt;P&gt;2. Could this situation be caused by a difference between the "Last Modified" date on the new files and the time they actually show up in the SFTP listing? I believe there are cases where a file doesn't show up until a few minutes after its Last Modified date. For example, the Last Modified date is 4:18 but the file doesn't show up in the listing until 4:20.&lt;/P&gt;&lt;P&gt;3. If this is actually what is happening, could it be fixed by changing settings on the ListSFTP processor?&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 00:16:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232777#M70190</guid>
      <dc:creator>KFredrickson</dc:creator>
      <dc:date>2017-10-25T00:16:45Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi ListSFTP and timestamps</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232778#M70191</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/46218/kfredrickson.html" nodeid="46218"&gt;@Karl Fredrickson&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;Yes, the ListSFTP processor only looks for new files that were created after the timestamp stored in the processor's state.&lt;/P&gt;&lt;P&gt;The state value is the maximum timestamp among the files created in that directory.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Example:&lt;/STRONG&gt; &lt;/P&gt;&lt;P&gt;Let's assume that the &lt;STRONG&gt;ListSFTP processor &lt;/STRONG&gt;has&lt;STRONG&gt; listed all the files&lt;/STRONG&gt; in the directory up to &lt;STRONG&gt;4:10&lt;/STRONG&gt; and is scheduled to run every &lt;STRONG&gt;10 minutes&lt;/STRONG&gt;, so the next run is at &lt;STRONG&gt;4:20&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;If &lt;STRONG&gt;new files (test1.txt, test2.txt)&lt;/STRONG&gt; are created at &lt;STRONG&gt;4:11&lt;/STRONG&gt;, then these &lt;STRONG&gt;new files (test1.txt, test2.txt)&lt;/STRONG&gt; will &lt;STRONG&gt;only be listed&lt;/STRONG&gt; in the &lt;STRONG&gt;4:20&lt;/STRONG&gt; run (because the processor runs every 10 minutes), and the processor then updates the state with the&lt;STRONG&gt; 4:11&lt;/STRONG&gt; timestamp. (You can view it by right-clicking on the processor and clicking View State.)&lt;/P&gt;&lt;P&gt;Although the files &lt;STRONG&gt;were created at 4:11&lt;/STRONG&gt;, they will be &lt;STRONG&gt;listed only in the 4:20&lt;/STRONG&gt; run, because in that run the processor checks for new files that were created after the state value.&lt;/P&gt;&lt;P&gt;If you configure this processor to run &lt;STRONG&gt;more frequently&lt;/STRONG&gt;, i.e. with an interval of less than 10 minutes, then the processor will look for newly created files more often.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 00:57:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232778#M70191</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2017-10-25T00:57:51Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi ListSFTP and timestamps</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232779#M70192</link>
      <description>&lt;P&gt;Thank you, can you say a little more about what "got created at 4:11" means for the ListSFTP processor?  If someone put the files on the FTP server at 4:11, but the last modified date of the files is earlier than that (say 4:00), would ListSFTP never create flowfiles for them?&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 03:06:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232779#M70192</guid>
      <dc:creator>KFredrickson</dc:creator>
      <dc:date>2017-10-25T03:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi ListSFTP and timestamps</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232780#M70193</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/46218/kfredrickson.html" nodeid="46218"&gt;@Karl Fredrickson&lt;/A&gt;&lt;P&gt;The normal behavior of the ListSFTP processor is as follows. On its first listing, when it has no state yet, it lists all files currently in the remote directory. Each subsequent listing picks up all new files written since the last timestamp stored in the processor's state, except for the most recent one or two files. Those one or two files are picked up in the next listing, along with any additional new files based on the updated state, again excluding the latest one or two files, and so on as the processor runs.&lt;/P&gt;&lt;P&gt;The ListSFTP processor doesn't use the file name in any way.&lt;/P&gt;&lt;P&gt;If you want files to be listed closer to the time they are written to the directory, set the processor to run more often than every 10 minutes.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 03:33:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232780#M70193</guid>
      <dc:creator>Wynner</dc:creator>
      <dc:date>2017-10-25T03:33:16Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi ListSFTP and timestamps</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232781#M70194</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/46218/kfredrickson.html" nodeid="46218"&gt;@Karl Fredrickson&lt;/A&gt;, what I mean by 4:11 is the &lt;STRONG&gt;file creation timestamp&lt;/STRONG&gt; in the directory.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;For example:&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;bash# hdfs dfs -ls /user/yashu/test_fac/
Found 1 items
-rwxr-xr-x   3 hdfs hdfs          8&lt;STRONG&gt; 2017-10-24 04:11&lt;/STRONG&gt; /user/yashu/test_fac/000000_0&lt;/PRE&gt;&lt;P&gt;In this example, the file 000000_0 was created at &lt;B&gt;2017-10-24 04:11&lt;/B&gt; (timestamp).&lt;/P&gt;&lt;P&gt;But the processor runs at &lt;STRONG&gt;4:20&lt;/STRONG&gt;, which means the &lt;STRONG&gt;000000_0&lt;/STRONG&gt; file above is going to be&lt;STRONG&gt; listed in the 4:20 run&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;What if the last modified date is earlier (say 4:00), but someone put the files there at 4:11?&lt;/P&gt;&lt;P&gt;Then ListSFTP &lt;STRONG&gt;won't create flow files&lt;/STRONG&gt;, because it only pulls new files that &lt;STRONG&gt;were created after the state value.&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 03:34:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFi-ListSFTP-and-timestamps/m-p/232781#M70194</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2017-10-25T03:34:03Z</dc:date>
    </item>
  </channel>
</rss>

