"Remote File" in "FetchFTP"

Contributor

I have a new problem: I can't work out what to enter, and in what format, in the "Remote File" property of FetchFTP.

1 ACCEPTED SOLUTION

Master Mentor

@Vladislav Shcherbakov

-

The FetchFTP processor is designed to retrieve a single file per execution. The data returned is added as content to the FlowFile that triggered the execution. You cannot use a regular expression that would match more than one target file, so wildcards are not supported here.

-

The FetchFTP processor is most commonly used in conjunction with the ListFTP processor. The ListFTP processor connects to the target FTP server to retrieve a listing of all files (based on the processor's filter configuration). For each file listed, it outputs a unique 0 byte FlowFile carrying a number of FlowFile attributes, which the FetchFTP processor can then use to retrieve the actual content.
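
As a rough illustration of that hand-off, here is a minimal Python sketch. It is not NiFi code: the `FlowFile` class, the `fetch` function, and the in-memory `remote_files` dictionary standing in for the FTP server are all hypothetical, chosen only to show a listing with attributes and no content being turned into a FlowFile with content.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class FlowFile:
    """Hypothetical stand-in for a NiFi FlowFile: attributes plus content."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""


def fetch(flowfile, remote_files):
    """Fetch exactly one file, named by the attributes, and attach its bytes."""
    remote_path = f'{flowfile.attributes["path"]}/{flowfile.attributes["filename"]}'
    # The attributes are kept; only the content is filled in.
    return replace(flowfile, content=remote_files[remote_path])


# Assumed sample data: one file on the (simulated) server.
remote_files = {"/data/a.csv": b"id,value\n1,10\n"}

listed = FlowFile(attributes={"path": "/data", "filename": "a.csv"})  # 0 byte listing
fetched = fetch(listed, remote_files)
```

The listing step never touches file content, which is what keeps it cheap enough to run on a single node.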

-

This two-phase approach has two purposes:

1. The ListFTP processor maintains state so the same files are not listed more than once. This allows you to leave the target files in the source directory without them being consumed by the FetchFTP processor more than once.

2. This setup allows you to spread the load of pulling large amounts of data from FTP across multiple nodes in a NiFi cluster.

----- The ListFTP processor would be configured to execute on the Primary Node only.

----- ListFTP feeds 0 byte FlowFiles to a Remote Process Group (RPG) that is used to redistribute those 0 byte FlowFiles to all the nodes in your NiFi cluster.

----- The FetchFTP processor would then be configured to execute on all nodes in your cluster, so every node does work retrieving different files and performing follow-on NiFi processor work on them.
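
The two purposes above can be sketched in a few lines of Python. This is an illustration only, not NiFi code: `ListingState`, `list_new`, and `distribute` are hypothetical names, remembering (path, name, modified-time) keys is a simplification of how ListFTP actually tracks state, and the round-robin split is only a stand-in for what the RPG does.

```python
from itertools import cycle


class ListingState:
    """Remembers which remote files were already listed (simplified stand-in)."""

    def __init__(self):
        self._seen = set()

    def list_new(self, remote_listing):
        """Return only entries not seen before, as attribute-only records."""
        new_entries = []
        for entry in remote_listing:
            key = (entry["path"], entry["filename"], entry["modified"])
            if key not in self._seen:
                self._seen.add(key)
                new_entries.append(
                    {"path": entry["path"], "filename": entry["filename"]}
                )
        return new_entries


def distribute(flowfiles, nodes):
    """Round-robin the 0 byte listings across the cluster nodes."""
    assignment = {node: [] for node in nodes}
    for node, flowfile in zip(cycle(nodes), flowfiles):
        assignment[node].append(flowfile)
    return assignment


# Assumed sample listing of the remote directory.
listing = [
    {"path": "/data", "filename": "a.csv", "modified": 1},
    {"path": "/data", "filename": "b.csv", "modified": 1},
    {"path": "/data", "filename": "c.csv", "modified": 2},
]

state = ListingState()
first_run = state.list_new(listing)   # all three files are new
second_run = state.list_new(listing)  # nothing new, so nothing is listed again
plan = distribute(first_run, ["node1", "node2"])
```

Because the files stay on the server and only the state changes, a second listing of the same directory produces no work for the fetch step.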

-

Thank you,

Matt

-

When an "Answer" addresses/solves your question, please select "Accept" beneath that answer. This encourages user participation in this forum.

Contributor

Thanks, but that's not really what I wanted to know.

What do I need to specify in the "Remote File" field?

(I already use ListFTP before FetchFTP.)

Master Mentor

@Vladislav Shcherbakov

-

The ListFTP processor writes the following FlowFile attributes to each 0 byte FlowFile it generates:

[Screenshot: FlowFile attributes written by ListFTP]

You would typically use NiFi's Expression Language (EL) to set property values dynamically from those attributes for each FlowFile processed.

For example:

Set the "Remote File" property on the FetchFTP processor to "${filename}". For each FlowFile it receives, the processor evaluates the expression against that FlowFile's attributes and uses the result to retrieve the matching file's content from the target FTP server.

-

Thank you,

Matt

Contributor

Setting "Remote File" to "${path}/${filename}" solved my problem.
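
For anyone comparing the two expressions, the following Python sketch mimics plain attribute substitution. It is not NiFi's Expression Language implementation: `resolve` is a hypothetical helper that handles only simple `${name}` references, treats a missing attribute as an empty string (an assumption made for this sketch), and the attribute values are made up.

```python
import re


def resolve(template, attributes):
    """Replace each ${name} with the matching attribute value."""
    def lookup(match):
        # Assumption for this sketch: a missing attribute becomes "".
        return attributes.get(match.group(1), "")
    return re.sub(r"\$\{([^}]+)\}", lookup, template)


# Hypothetical attributes, as ListFTP might set them for one listed file.
attributes = {"path": "incoming/2018", "filename": "report.csv"}

name_only = resolve("${filename}", attributes)          # "report.csv"
full_path = resolve("${path}/${filename}", attributes)  # "incoming/2018/report.csv"
```

With `${filename}` alone the directory is dropped, so the fetch would presumably only succeed for files sitting in the directory the FTP session starts in; including `${path}` points it at the directory that was actually listed.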