Using GetFTP with correct credentials does not pull files

New Contributor

I have an FTP server that I can connect to manually. Inside my FTP server, I have to navigate to the directory /amazon/jungle/ where there are a bunch of tree manifest files that end in .dat.

Param              Value
Password           *******
Username           password
Hostname           hostname.site.com
Transfer Mode      Binary
Remote Path        /amazon/jungle/
File Filter Regex  .*
Delete original    False

I have tried linking this component to PutFile, configured with the local directory "/tmp/", and also to PutS3. When I run my pipeline, none of the files on the FTP server are downloaded locally or pushed to S3.

Any idea what I may be doing wrong? Is there a way to see what is causing the issue?

5 REPLIES

@Rooster Raul

Can you please post a screenshot of how your properties look? Against which property was the directory "/amazon/jungle/" configured? Was it the 'Remote Path' property?

Anyway, here is a quick checklist of properties:

1. Remote Path: /amazon/jungle/

2. File Filter Regex: Here, you need to put in a RegEx that matches your file name. You can make use of this link to check if your RegEx exactly matches the file name (http://regexr.com/)
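As an alternative to an online tester, the filter can be checked locally. The sketch below is only an illustration: it uses Python's `re` module, whereas NiFi uses Java regular expressions, so only simple patterns can be expected to behave the same. It also assumes the filter has to match the whole file name, and the file names in it are made up.

```python
import re


def matching_files(pattern, filenames):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]


# Hypothetical file names, for illustration only.
names = ["trees_001.dat", "trees_002.dat", "readme.txt"]

# A catch-all filter keeps every name.
print(matching_files(r".*", names))

# A narrower filter keeps only names ending in .dat.
print(matching_files(r".*\.dat", names))
```

If the catch-all pattern already matches every name, as it does here, the filter is unlikely to be the reason no files are pulled.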

New Contributor

Hi @Balakrishnan Ramasamy, thank you for the quick response. I updated my question to make the parameters clearer. I'm using a catch-all regular expression, .*, to pull all files in my remote path.

Hi @Rooster Raul. Your properties look good. Can you please check whether the folder you are trying to access has the necessary permissions for the user credentials? Without them, you won't see any error during connection, and everything stays idle as if there were no files to list.

New Contributor

The local directory is /tmp/ and NiFi has permission to access it. What I'm seeing is that no data comes in from GetFTP even though the number of tasks increases. I've checked the logs and cannot clearly see why this is happening. This is why nothing is created even if I change the output of GetFTP from PutFile to PutS3 or anything else. The question is: why is my GetFTP component unable to retrieve files from the FTP server, and how can I dig into why?

Master Guru

@Rooster Raul

When you manually log in to your ftp server and run the "pwd" command, where are you sitting?

Are your files really located at /amazon/jungle/* or are they sitting at <ftp home>/amazon/jungle/* ?

-- Try removing the leading "/" from your "Remote path" property value.
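One way to check this outside NiFi is a small script that logs in, prints the login directory, and lists the path both with and without the leading "/". This is a minimal sketch using Python's standard `ftplib`. The helper names `probe_remote_paths` and `run_against_server` are made up for illustration, and the host and credentials must be replaced with real values.

```python
from ftplib import FTP, error_perm


def probe_remote_paths(ftp, remote_path):
    """Report the login directory and the listing for both forms of remote_path."""
    report = {"pwd": ftp.pwd()}
    trimmed = remote_path.strip("/")
    candidates = {
        "absolute": "/" + trimmed,  # resolved from the server root
        "relative": trimmed,        # resolved from the login directory
    }
    for label, path in candidates.items():
        try:
            report[label] = ftp.nlst(path)
        except error_perm as exc:
            # Many servers answer 550 for a missing or unreadable directory.
            report[label] = f"error: {exc}"
    return report


def run_against_server(host, user, password, remote_path):
    """Connect with the given (placeholder) credentials and print the report."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        for label, value in probe_remote_paths(ftp, remote_path).items():
            print(label, value)
```

Calling `run_against_server("hostname.site.com", "<user>", "<password>", "/amazon/jungle/")` prints three entries. If only the "relative" entry lists the .dat files, the files live under the FTP home directory, and dropping the leading "/" from Remote Path is worth trying. A permission problem on the folder would typically show up here as an error entry.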

When you start the NiFi GetFTP processor, what do you see in your ftp server logs? Do you see the connection from NiFi?

What is your GetFTP processor's "Scheduling" configuration?

Thanks,

Matt