NIFI - ListSFTP / FETCHSFTP / PUTHDFS

Hi all,

I'm running a NiFi cluster with 4 nodes.

I would like to set up a dataflow with the SFTP processors.

Is it necessary to have an RPG (Remote Process Group) between ListSFTP and FetchSFTP?

Or can I simply build:

ListSFTP (primary node) --> FetchSFTP (all nodes) --> PutHDFS (all nodes)

Regards

Accepted Solutions

Re: NIFI - ListSFTP / FETCHSFTP / PUTHDFS

@mayki wogno If you want to use the Site-to-Site (S2S) protocol to distribute the SFTP fetches across the 4 NiFi nodes, then you will need a Remote Process Group (RPG). Configure ListSFTP to run only on the primary node and connect it to the RPG, which points back to the same NiFi cluster.

8654-screen-shot-2016-10-18-at-102755-am.png

You would then connect the associated input port with the process group containing the FetchSFTP and PutHDFS processors.

8655-screen-shot-2016-10-18-at-103033-am.png

In a NiFi cluster, every node runs the same dataflow (the exception is isolated processors such as ListSFTP, which run only on the primary node). Without a distribution mechanism such as the S2S protocol, there is no way to partition the file listing metadata so that each node fetches a distinct subset of the files on the SFTP server.
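To make the partitioning point concrete, here is a toy Python model, not NiFi code. The node names, file names, and the round-robin spreading are all assumptions made for illustration; real Site-to-Site balancing need not split the listing evenly. It only shows that the listing entries all start on the primary node and that something has to spread them before the other nodes have anything to fetch.

```python
from itertools import cycle

# Hypothetical 4-node cluster; the names are made up for this sketch.
NODES = ["node1", "node2", "node3", "node4"]
PRIMARY = "node1"


def list_on_primary(remote_files):
    """Model ListSFTP on the primary node: every listing entry starts there."""
    return {node: (list(remote_files) if node == PRIMARY else []) for node in NODES}


def distribute(queues):
    """Stand-in for the RPG/S2S hop: spread queued entries across all nodes.

    Round-robin is a simplification chosen for this sketch.
    """
    spread = {node: [] for node in NODES}
    targets = cycle(NODES)
    for node in NODES:
        for entry in queues[node]:
            spread[next(targets)].append(entry)
    return spread


files = [f"file{i:02d}.csv" for i in range(8)]  # made-up listing
direct = list_on_primary(files)   # ListSFTP --> FetchSFTP with no RPG
via_rpg = distribute(direct)      # ListSFTP --> RPG --> FetchSFTP

for node in NODES:
    print(node, "direct:", len(direct[node]), "via RPG:", len(via_rpg[node]))
```

In the direct case only the primary node holds entries to fetch; after the distribution step each node holds a distinct subset, which is the behavior the RPG is there to provide.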

Re: NIFI - ListSFTP / FETCHSFTP / PUTHDFS

Thanks, I'll try it and let you know whether it works.

Re: NIFI - ListSFTP / FETCHSFTP / PUTHDFS

@Slachterman thanks. For the RPG, with a secured NiFi cluster, is this the URL to use?

https://nifi001:9443/

Re: NIFI - ListSFTP / FETCHSFTP / PUTHDFS

That's right, the URL would specify HTTPS and the port on which NiFi is running on that host. With the new masterless architecture in HDF 2.0, the URL specified in the RPG can be that of any cluster node (in previous versions it had to be the NCM, the NiFi Cluster Manager).
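As a small sanity check before pasting a URL into the RPG, something like the following Python sketch could be used. It is not part of NiFi; `check_rpg_url` is a hypothetical helper that only verifies the two things mentioned above: the scheme matches whether the cluster is secured, and a port is given explicitly.

```python
from typing import List
from urllib.parse import urlsplit


def check_rpg_url(url: str, secured: bool = True) -> List[str]:
    """Return a list of problems found in a candidate RPG URL (empty if none).

    Hypothetical helper: it inspects the URL text only and never contacts NiFi.
    """
    problems = []
    parts = urlsplit(url)
    expected_scheme = "https" if secured else "http"
    if parts.scheme != expected_scheme:
        problems.append(f"expected scheme {expected_scheme!r}, got {parts.scheme!r}")
    if not parts.hostname:
        problems.append("no host name in URL")
    if parts.port is None:
        problems.append("no explicit port in URL")
    return problems


# The URL from the question; any cluster node's host name could be used instead.
print(check_rpg_url("https://nifi001:9443/"))
# The same host with plain HTTP, which would not fit a secured cluster.
print(check_rpg_url("http://nifi001:9443/"))
```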

Please accept the above answer if it was helpful to you.