NiFi - ListSFTP / FetchSFTP / PutHDFS

Hi all,

I'm running a NiFi cluster with 4 nodes.

I would like to set up a dataflow with the SFTP processors.

Is it necessary to have an RPG between ListSFTP and FetchSFTP?

Or can I simply build:

ListSFTP (primary node) --> FetchSFTP (all nodes) --> PutHDFS (all nodes)

Regards

1 ACCEPTED SOLUTION

@mayki wogno If you want to use the Site-to-Site (S2S) protocol to distribute the SFTP fetches over the 4 NiFi nodes, then you will need a Remote Process Group (RPG). ListSFTP would be configured to run only on the primary node and would connect to the RPG, which would point back to the same NiFi cluster.

8654-screen-shot-2016-10-18-at-102755-am.png

You would then connect the associated input port with the process group containing the FetchSFTP and PutHDFS processors.

8655-screen-shot-2016-10-18-at-103033-am.png

In a NiFi cluster, every node runs the same dataflow (with the exception of isolated processors such as ListSFTP, which run only on the primary node). Without a distribution mechanism such as the S2S protocol, there is no way to partition the file listing metadata so that each node fetches a distinct subset of the files on the SFTP server.
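To make the partitioning point concrete, here is a toy Python model. It is not NiFi code and does not use any NiFi API: it only illustrates the idea that each listed file should end up on exactly one node. Plain round-robin stands in for whatever balancing S2S actually applies, and the node names other than nifi001 and all file paths are made up for illustration.

```python
from itertools import cycle


def distribute(listing, nodes):
    """Assign each listed file to exactly one node, round-robin.

    Toy stand-in for the distribution step that the RPG/S2S hop provides;
    the real protocol balances differently, this only shows the partitioning.
    """
    assignment = {node: [] for node in nodes}
    for path, node in zip(listing, cycle(nodes)):
        assignment[node].append(path)
    return assignment


# Hypothetical node names and file listing, for illustration only.
nodes = ["nifi001", "nifi002", "nifi003", "nifi004"]
listing = [f"/data/file_{i}.csv" for i in range(10)]

assignment = distribute(listing, nodes)
for node, files in assignment.items():
    print(node, files)
```

Without this step, the whole listing stays on the primary node, since that is the only node where ListSFTP runs, and the other three nodes have nothing to fetch.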

4 REPLIES

Thanks, I'll try it and let you know whether it works.

@Slachterman Thanks. For the RPG, with a secured NiFi cluster, is this the URL to use?

https://nifi001:9443/

That's right, the URL would specify HTTPS and the port on which NiFi is running on that host. With the new masterless architecture in HDF 2.0, the URL specified in the RPG can be that of any cluster node (in previous versions it had to be the NiFi Cluster Manager, or NCM).
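If you want to sanity-check the URL before typing it into the RPG, something like the following Python sketch would do. The helper name check_rpg_url is hypothetical, it uses only the Python standard library, and it checks only the form of the URL (scheme, host, explicit port). It does not contact NiFi or verify certificates.

```python
from urllib.parse import urlsplit


def check_rpg_url(url, secured=True):
    """Return a list of problems found in an RPG URL (empty list if none).

    Hypothetical helper: checks only the URL's shape, nothing NiFi-specific.
    """
    parts = urlsplit(url)
    problems = []
    expected_scheme = "https" if secured else "http"
    # urlsplit lowercases the scheme, so "Https://" is accepted as https.
    if parts.scheme != expected_scheme:
        problems.append(f"scheme should be {expected_scheme}")
    if not parts.hostname:
        problems.append("missing host name")
    if parts.port is None:
        problems.append("missing explicit port")
    return problems


print(check_rpg_url("https://nifi001:9443/"))
print(check_rpg_url("http://nifi001:9443/"))
```

An empty list means the URL has the expected form for a secured cluster. Any other node of the cluster could be substituted for nifi001.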

Please accept the above answer if it was helpful to you.