Created on 06-18-2018 01:11 PM - edited 08-17-2019 06:52 PM
I've got a new problem: I can't figure out what to enter, and in what format, in the "Remote File" property of "FetchFTP".
Created 06-18-2018 01:32 PM
-
The FetchFTP processor is designed to retrieve a single file per execution. The data returned is added as content to the FlowFile that triggered the execution. You cannot use a regular expression that would match more than one target file, so wildcards are not supported here.
-
The FetchFTP processor is most commonly used in conjunction with the ListFTP processor. The ListFTP processor connects to the target FTP server and retrieves a listing of all files (subject to the processor's filter configuration). For each file listed, it outputs a unique 0-byte FlowFile carrying a number of FlowFile attributes, which the FetchFTP processor can then use to retrieve the actual content.
-
This two-phase approach has two purposes:
1. The ListFTP processor maintains state so the same files are not listed more than once. This allows you to leave the target files in the source directory without them being consumed by the FetchFTP processor more than once.
2. This setup allows you to spread the load of pulling large amounts of data from FTP across multiple nodes in a NiFi cluster.
----- The ListFTP processor would be configured to execute on the Primary Node only.
----- ListFTP feeds 0-byte FlowFiles to a Remote Process Group (RPG) that is used to redistribute those 0-byte FlowFiles to all the nodes in your NiFi cluster.
----- The FetchFTP processor would then be configured to execute on all nodes in your cluster, so every node does work retrieving different files and performing follow-on NiFi processor work on them.
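To make the two-phase idea concrete, here is a rough Python sketch of the pattern. It is not NiFi code: the names `Lister`, `distribute`, and `fetch` are made up for illustration, the "server" is just a dict, and the real ListFTP tracks its state differently (a plain set of names is used here only to keep the example short).

```python
from dataclasses import dataclass, field


@dataclass
class Lister:
    """Hypothetical stand-in for ListFTP: remembers what it already listed."""
    seen: set = field(default_factory=set)

    def list_new(self, remote_files):
        # Emit one attribute-only record (no content) per file not listed before.
        new = [
            {"path": path, "filename": name}
            for (path, name) in sorted(remote_files)
            if (path, name) not in self.seen
        ]
        self.seen.update((rec["path"], rec["filename"]) for rec in new)
        return new


def distribute(records, nodes):
    """Hypothetical stand-in for the RPG: spread records round-robin."""
    assignment = {node: [] for node in nodes}
    for i, rec in enumerate(records):
        assignment[nodes[i % len(nodes)]].append(rec)
    return assignment


def fetch(record, server):
    """Hypothetical stand-in for FetchFTP: exactly one file per record."""
    return server[(record["path"], record["filename"])]


# Simulated remote directory: (path, filename) -> content.
server = {
    ("/in", "a.csv"): b"A",
    ("/in", "b.csv"): b"B",
    ("/in", "c.csv"): b"C",
}

lister = Lister()
first_pass = lister.list_new(server.keys())    # all three files are new
second_pass = lister.list_new(server.keys())   # nothing new, files stay in place
work = distribute(first_pass, ["node1", "node2"])
```

The second listing pass returns nothing even though the files are still on the server, and each node ends up fetching a different subset of files.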
-
Thank you,
Matt
-
When an "Answer" addresses/solves your question, please select "Accept" beneath that answer. This encourages user participation in this forum.
Created on 06-18-2018 01:45 PM - edited 08-17-2019 06:52 PM
Thanks, but that's not really what I wanted to know...
What do I need to specify in the "Remote File" field?
*and I use ListFTP before FetchFTP
Created on 06-18-2018 03:10 PM - edited 08-17-2019 06:51 PM
-
The ListFTP processor writes a set of FlowFile attributes (including "filename" and "path") onto each 0-byte FlowFile it generates, so you would typically use NiFi's Expression Language (EL) to define the values of the FetchFTP properties dynamically for each processed FlowFile.
For example:
Set the "Remote File" property on the FetchFTP processor to "${filename}". For each FlowFile it receives, the processor resolves this expression to the value of that FlowFile's "filename" attribute and uses it to retrieve the correct file content from the target FTP server.
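As a rough illustration of what that substitution does, here is a small Python sketch. It is not NiFi's Expression Language implementation (real EL also supports functions and much more); the `resolve` helper and the sample attribute values are invented for this example.

```python
import re


def resolve(template, attributes):
    """Replace each ${name} in template with the matching attribute value.

    In this sketch, unknown attribute names resolve to an empty string.
    """
    return re.sub(
        r"\$\{([\w.]+)\}",
        lambda match: attributes.get(match.group(1), ""),
        template,
    )


# Hypothetical attributes on one 0-byte FlowFile coming from ListFTP.
attrs = {"path": "incoming/reports", "filename": "sales.csv"}

print(resolve("${filename}", attrs))
print(resolve("${path}/${filename}", attrs))
```

If the listed files do not sit in the directory the FTP session starts in, the bare file name may not be enough to locate the file, which is where a path-qualified value becomes useful.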
-
Thank you,
Matt
Created 06-18-2018 03:09 PM
Setting "Remote File" to the string "${path}/${filename}" solved my problem.