ListSFTP + FetchSFTP vs GetSFTP?

When getting files from an SFTP server, we can use ListSFTP followed by FetchSFTP instead of the GetSFTP processor. What is the advantage of ListSFTP + FetchSFTP over GetSFTP?

1 ACCEPTED SOLUTION

@Yogesh Sharma

FetchSFTP is cluster-friendly, whereas GetSFTP is not: with FetchSFTP, multiple nodes in a cluster can pull data from the same source, but with GetSFTP they cannot.

If you have a NiFi cluster and use the GetSFTP processor, you have to configure that processor to run on the primary node only, so that the other nodes in the cluster don't try to pull the same files. As a result, the files are pulled by a single node instead of by all nodes in the cluster, which is what FetchSFTP would do.

Do you plan to leave the files on the remote system or pull them and remove them?

Do you have a cluster or standalone instance of NiFi?
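
To make the difference concrete, here is a toy Python model (not NiFi code) of the two approaches. The function names, node names, and the round-robin split are assumptions made purely for illustration; in a real flow the listing would be distributed by whatever mechanism the flow is configured with.

```python
from itertools import cycle


def list_then_fetch(remote_files, nodes):
    """Toy model of List + Fetch: one node lists the files, the listing
    is spread round-robin (an assumption), and each node fetches only
    the files assigned to it."""
    assignments = {node: [] for node in nodes}
    for name, node in zip(remote_files, cycle(nodes)):
        assignments[node].append(name)
    return assignments


def get_on_every_node(remote_files, nodes):
    """Toy model of a Get-style processor running on all nodes:
    every node pulls every file, so work is duplicated."""
    return {node: list(remote_files) for node in nodes}


files = ["a.csv", "b.csv", "c.csv", "d.csv"]  # hypothetical remote listing
nodes = ["node1", "node2"]                    # hypothetical cluster nodes

split = list_then_fetch(files, nodes)
duplicated = get_on_every_node(files, nodes)

print(split)       # each file appears on exactly one node
print(duplicated)  # each file appears on every node
```

In this model the List + Fetch split performs 4 fetches in total across the cluster, while the Get-on-every-node variant performs 8, which is why GetSFTP has to be restricted to the primary node.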

6 REPLIES

@Wynner

We want to leave the files on the remote system. I am running NiFi in a cluster.

@Yogesh Sharma

In that case, ListSFTP retains state and only adds new files to the listing each time it runs, so FetchSFTP pulls each file only once. With GetSFTP, since you are leaving the files on the remote system, you would pull the same files over and over again.
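
As a rough illustration of why retained state matters, the following Python sketch (not NiFi code) compares a lister that remembers the newest modification time it has seen with a stateless pull. The class name, function name, file names, and timestamps are all hypothetical, and the real processor's state handling is more involved than this.

```python
class StatefulLister:
    """Toy model of a listing step that retains state between runs."""

    def __init__(self):
        # Newest modification time seen so far (nothing seen yet).
        self.last_seen = float("-inf")

    def list_new(self, remote):
        """Return only files modified after the last run, then update state."""
        new = sorted(name for name, mtime in remote.items()
                     if mtime > self.last_seen)
        if new:
            self.last_seen = max(remote[name] for name in new)
        return new


def stateless_get(remote):
    """Toy model of a stateless pull: returns every file on every run."""
    return sorted(remote)


# Hypothetical remote directory: file name -> modification time.
remote = {"a.csv": 100, "b.csv": 200}
lister = StatefulLister()

first_run = lister.list_new(remote)    # both files are new
second_run = lister.list_new(remote)   # nothing has changed
remote["c.csv"] = 300                  # a new file arrives
third_run = lister.list_new(remote)    # only the new file is listed

print(first_run, second_run, third_run)
print(stateless_get(remote))           # all three files, every time
```

Because the files stay on the remote system, the stateless version keeps returning files that were already processed, while the stateful lister hands each file downstream once.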

@Wynner Thank you for the answer.

I also need help with one more thing. Do you have any documents or references on best practices for NiFi dataflow development?

@Yogesh Sharma

Here are a couple of links to articles that are a good starting point for best practices and dataflow design.

Dataflow Optimization

NiFi Best Practices for high performance

In addition to the excellent resources and information provided by @Wynner, I just wanted to add:

https://pierrevillard.com/2017/02/23/listfetch-pattern-and-remote-process-group-in-apache-nifi