Created 04-19-2017 10:36 AM
We are getting files from an FTP server, where we could use ListSFTP followed by FetchSFTP instead of the GetSFTP processor. What is the advantage of ListSFTP + FetchSFTP over GetSFTP?
Created 04-19-2017 01:30 PM
FetchSFTP is cluster-friendly whereas GetSFTP is not: with FetchSFTP, multiple nodes in a cluster can pull data from the same source, but with GetSFTP they cannot.
If you have a NiFi cluster and you are using the GetSFTP processor, you have to configure that processor to run on the primary node only, so that the other nodes in the cluster don't try to pull the same files. As a result, the files are pulled by a single node in the cluster instead of by all nodes, which is what FetchSFTP would do.
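As a rough illustration of why splitting listing from fetching helps in a cluster, here is a minimal Python sketch, not NiFi code. It assumes a single listing (made once, for example on the primary node) is spread round-robin across nodes so each node fetches a different subset. The function name `distribute`, the node names, and the file paths are all hypothetical.

```python
from itertools import cycle


def distribute(listing, nodes):
    """Assign each listed path to exactly one node, round-robin."""
    assignments = {node: [] for node in nodes}
    for path, node in zip(listing, cycle(nodes)):
        assignments[node].append(path)
    return assignments


# Hypothetical listing produced once by the listing step.
listing = ["/data/a.csv", "/data/b.csv", "/data/c.csv", "/data/d.csv"]
nodes = ["node-1", "node-2", "node-3"]

assignments = distribute(listing, nodes)
for node, paths in assignments.items():
    print(node, "fetches", paths)
```

Every path ends up on exactly one node, so no file is pulled twice while the fetch work is still shared by the whole cluster.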
Do you plan to leave the files on the remote system or pull them and remove them?
Do you have a cluster or standalone instance of NiFi?
Created 04-19-2017 01:39 PM
We want to leave the files on the remote system. I am running NiFi in a cluster.
Created 04-19-2017 05:44 PM
In that case, ListSFTP retains state and only adds new files to the listing each time it runs, so FetchSFTP pulls each file only once. With GetSFTP, since the files stay on the remote system, you would pull the same files over and over again.
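To make the idea of a stateful listing concrete, here is a simplified Python sketch. It is not how ListSFTP is actually implemented; it assumes the only state kept is the newest modification timestamp seen so far, and it ignores edge cases such as several files sharing that timestamp. `RemoteFile` and `StatefulLister` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteFile:
    path: str
    modified: int  # modification time, epoch seconds


@dataclass
class StatefulLister:
    last_seen: int = -1  # newest timestamp already listed

    def list_new(self, remote_files):
        """Return only files modified after the stored timestamp."""
        new = [f for f in remote_files if f.modified > self.last_seen]
        if new:
            self.last_seen = max(f.modified for f in new)
        return sorted(new, key=lambda f: f.modified)


lister = StatefulLister()
remote = [RemoteFile("/data/a.csv", 100), RemoteFile("/data/b.csv", 200)]

first = lister.list_new(remote)    # both files are new
second = lister.list_new(remote)   # nothing changed, nothing listed
remote.append(RemoteFile("/data/c.csv", 300))
third = lister.list_new(remote)    # only the newly added file

print([f.path for f in first])
print([f.path for f in second])
print([f.path for f in third])
```

Because files left on the server are not listed again, the fetch step downstream sees each file only once.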
Created 04-20-2017 06:53 AM
@Wynner Thank you for the answer.
I also need help with one more thing. Do you have any documents or references on best practices for NiFi dataflow development?
Created 04-20-2017 01:07 PM
Here are a couple of links to articles that are a good starting point for best practices and dataflow design.
Created 04-20-2017 01:22 PM
In addition to the excellent resources and information provided by @Wynner, I just wanted to add:
https://pierrevillard.com/2017/02/23/listfetch-pattern-and-remote-process-group-in-apache-nifi