NiFi - How to use the GetSFTP processor without repeating files

Expert Contributor

Is there a way to configure NiFi so that it does not pull the same files again? It appears to pull the same files more than once if some files in the directory were modified.

In other words, could you explain how GetSFTP tracks which files it has already downloaded?

And how does this behave if the NiFi process or the server restarts?

1 ACCEPTED SOLUTION

Wes,

The next version of NiFi will have new components like FetchSFTP and ListSFTP [1]:

  1. FetchSFTP will take information from an incoming message and download the requested file.
  2. ListSFTP will list the contents of a remote directory and send flow files downstream to be processed or downloaded next. I think it will also be able to use a distributed cache to maintain state on the NiFi side and avoid re-processing (e.g. we don't always have the option of deleting files on a remote server).

I have already used FetchSFTP this way, telling it which files to get when triggered by an external notification.
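
To make the list-then-fetch idea concrete, here is a minimal Python sketch of how a listing step could remember what it has already emitted. It is not NiFi's actual implementation: `RemoteFile`, `ListingState` and `list_new_files` are hypothetical names, and the tracking rule (remember the newest modification time plus the files seen at that time) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RemoteFile:
    """One entry of a remote directory listing (hypothetical type)."""
    path: str
    filename: str
    mtime: int  # last-modified time, epoch seconds


@dataclass
class ListingState:
    """State kept between listings (hypothetical, not NiFi's format)."""
    latest_mtime: int = -1
    seen_at_latest: set = field(default_factory=set)


def list_new_files(entries, state):
    """Return entries not emitted before and advance the state in place."""
    fresh = []
    # Process oldest first so the state only ever moves forward.
    for entry in sorted(entries, key=lambda e: (e.mtime, e.path, e.filename)):
        key = f"{entry.path}/{entry.filename}"
        if entry.mtime < state.latest_mtime:
            continue  # older than anything already emitted
        if entry.mtime == state.latest_mtime and key in state.seen_at_latest:
            continue  # same timestamp and already emitted
        if entry.mtime > state.latest_mtime:
            state.latest_mtime = entry.mtime
            state.seen_at_latest = set()
        state.seen_at_latest.add(key)
        fresh.append(entry)
    return fresh


state = ListingState()
first = [RemoteFile("/in", "a.csv", 100), RemoteFile("/in", "b.csv", 200)]
run1 = list_new_files(first, state)   # both files are new
run2 = list_new_files(first, state)   # nothing changed, nothing emitted
changed = [RemoteFile("/in", "a.csv", 100), RemoteFile("/in", "b.csv", 300)]
run3 = list_new_files(changed, state)  # only the modified file comes back
```

With this rule an unchanged file is never listed twice, but a file whose modification time moves forward is listed again. To survive a restart, the state would have to be stored somewhere durable, such as the distributed cache mentioned above.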

[1] https://issues.apache.org/jira/browse/NIFI-673

5 REPLIES

Contributor

There is a "Delete Original" property, which is set to 'true' by default. Is that what you are looking for? With that in mind, I don't understand what you mean about files being modified, since one would expect them to be gone. Can you clarify?

New Contributor

Hi Andrew, can you please share the template with us? I'm not sure how to configure FetchSFTP to read the output from ListSFTP. It keeps asking for the remote file location. Thank you!

Try the attached template (list-and-fetch-sftp-templatexml.zip). Also check out the repo our team is populating here: https://community.hortonworks.com/repos/6119/apache-nifi-template-rpo.html
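
On the question above about FetchSFTP asking for the remote file location: as far as I know, ListSFTP writes `path` and `filename` attributes on each flow file, and the remote file setting in FetchSFTP can be an expression such as `${path}/${filename}` that is resolved per flow file. Treat those attribute and property names as assumptions to check against the processor documentation. The Python sketch below only imitates that substitution with `string.Template`; `resolve_remote_file` is a hypothetical helper, not part of NiFi.

```python
from string import Template

# Expression assumed to be configured as the remote file location.
REMOTE_FILE_EXPRESSION = Template("${path}/${filename}")


def resolve_remote_file(attributes):
    """Fill the expression from one flow file's attributes (a plain dict here)."""
    return REMOTE_FILE_EXPRESSION.substitute(attributes)


# Attributes a listing step might attach to a flow file (illustrative values).
flow_file_attributes = {"path": "/data/incoming", "filename": "report.csv"}
remote_file = resolve_remote_file(flow_file_attributes)
print(remote_file)
```

The point is that the fetch step needs no fixed file name of its own: each listed flow file carries the location to download.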

New Contributor

Hi,

Is there any way to trigger ListSFTP based on some status condition? That is, ListSFTP should pick up or transfer files depending on a status: for example, if the status is "start" it should start fetching files, and if it is anything else it should stop.