Trigger GetSFTP processor

New Contributor

I am using a GetSFTP processor to get files from Filezilla and a PutFile processor to move the files to a directory on my local machine.

Is there a way to trigger the GetSFTP processor every time a new file is uploaded?

I am trying to create a flow where, every time a new file is uploaded to the server, it is automatically downloaded to my local machine. Can this be done using NiFi?

Thanks.

 

1 ACCEPTED SOLUTION

Re: Trigger GetSFTP processor

Super Guru

@Shailuk

Schedule the GetSFTP processor to run on the Primary node with a Run Schedule of 0 sec. The processor will then run as often as possible and pull the files from the configured directory.

**NOTE**

If we don't delete the file from the remote path, the GetSFTP processor will pull the same file again and again, because GetSFTP doesn't store state.

109956-screen-shot-2019-07-22-at-103139-pm.png
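
To illustrate the note above, here is a minimal Python sketch of a poller that keeps no state between runs. It is only a simulation and not NiFi code: the function `poll_stateless`, the `delete_original` flag, and the in-memory `remote` dict are hypothetical stand-ins for the processor and the SFTP directory.

```python
def poll_stateless(remote, delete_original):
    """Return every file currently on the 'server'; remember nothing between runs.

    `remote` is a plain dict {filename: content} standing in for the remote
    directory (an assumption made purely for illustration).
    """
    fetched = dict(remote)      # copy everything that is there right now
    if delete_original:
        remote.clear()          # remove the source files after pulling them
    return fetched


remote = {"a.csv": "1,2,3"}

# Without deletion, two consecutive runs both return the same file.
first = poll_stateless(remote, delete_original=False)
second = poll_stateless(remote, delete_original=False)
print(sorted(first), sorted(second))

# With deletion, the second run finds nothing left to pull.
first_del = poll_stateless(remote, delete_original=True)
second_del = poll_stateless(remote, delete_original=True)
print(sorted(first_del), sorted(second_del))
```

Because the poller has no memory, the only thing that stops it from re-reading a file is the file no longer being there.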

Correct Approach:

Use the ListSFTP + FetchSFTP processors. Configure the ListSFTP processor to run on the Primary node with a Run Schedule of 0 sec. This processor stores state and runs incrementally, listing only the files newly added to the directory.

The FetchSFTP processor then fetches the listed files from the directory. After it, use a PutFile processor to store the files on the local machine.
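
The list-then-fetch idea can be sketched in plain Python. This is a simplified simulation under stated assumptions, not the NiFi implementation: `Listed`, `IncrementalLister`, `fetch`, and the `storage` dict are hypothetical names, and the real processor's state tracking is more involved than a single timestamp. The point is that the listing step remembers what it has already seen and hands each new file's path and name to the fetch step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listed:
    """One listing result: enough information for a later fetch."""
    path: str
    filename: str
    modified: int  # modification time as an integer timestamp


class IncrementalLister:
    """Remembers the newest modification time it has already listed."""

    def __init__(self):
        self.last_seen = -1

    def list_new(self, entries):
        # Only entries newer than the stored state count as new.
        new = sorted(
            (e for e in entries if e.modified > self.last_seen),
            key=lambda e: e.modified,
        )
        if new:
            self.last_seen = new[-1].modified  # update the stored state
        return new


def fetch(listed, storage):
    # The fetch step needs no state of its own; it uses what the listing gave it.
    return storage[f"{listed.path}/{listed.filename}"]


# `storage` stands in for the remote server's file contents.
storage = {"/in/a.csv": "A", "/in/b.csv": "B"}
entries = [Listed("/in", "a.csv", 100)]

lister = IncrementalLister()
run1 = [fetch(e, storage) for e in lister.list_new(entries)]

entries.append(Listed("/in", "b.csv", 200))   # a new file is uploaded
run2 = [fetch(e, storage) for e in lister.list_new(entries)]
run3 = lister.list_new(entries)               # nothing new this time
```

In this sketch the first run fetches only `a.csv`, the second only `b.csv`, and the third lists nothing, even though no file was deleted from the source.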

5 REPLIES

Re: Trigger GetSFTP processor

Super Guru

@Shailuk

Could you set the password in the ListSFTP processor and then try running the processor again?

Re: Trigger GetSFTP processor

New Contributor

@Shu_ashu 

This is my dataflow - 

https://drive.google.com/file/d/1SWtSAPKxRcgAWT7ca0dytePjfwMZIpgR/view?usp=sharing

 

The ListSFTP processor works fine. It lists all the files on the server. But FetchSFTP doesn't work as expected. I get comms.failure. My ListSFTP and FetchSFTP configurations are the same.

Also, I had one more question: how does FetchSFTP get its state from ListSFTP? Is there any additional configuration that has to be done? I am just connecting ListSFTP to FetchSFTP with the success relationship, and not doing anything else.

 

Please help. 

 

 

Re: Trigger GetSFTP processor

New Contributor

@Shu

I am able to connect to the server using Filezilla but unable to connect using ListSFTP processor. I get this error -


110034-ss-1.jpg


Here are my configurations -

110026-ss-2.jpg


110027-ss-3.jpg


110051-ss-4.jpg

Is it because the Private Key property is blank? If so, can you please tell me what has to be done? The pem file is on my desktop (local machine). I gave the path to it, but it says the path is invalid.

Thanks.

Re: Trigger GetSFTP processor

New Contributor

@Shu

Actually, there is no password. The username is ec2-user. I am able to connect to the server using Filezilla. I did not enter any password when I connected using Filezilla, just the username, and it worked. But I did import the pem file.

I also created a password for my EC2 instance and tried that. Still no luck.