How to fetch files from an SFTP location when the file name has today's date appended? Facing a problem with the regular expression when using the GetSFTP processor in NiFi

I am very new to NiFi. I need to pull specific CSV files from an SFTP location. When I pass the full file name it works perfectly, but when I try to use a regular expression to fetch the file dynamically, I run into a problem. I tried the expression ${now():format('yyyy')} and renamed the file to just 2017, but even then the file is not fetched. Can somebody help me with this?

12 REPLIES

The actual regex as written in the configuration is in the attached screenshot (15781-capture.png).

Master Mentor
@Anoop Shet

That particular property (File Filter Regex) does not support NiFi Expression Language (EL).

(Screenshot: 15782-screen-shot-2017-05-25-at-125012-pm.png)

If you float your cursor over the "?" displayed next to a processor property, you will see a line that says whether EL is supported or not.

This particular property only accepts Java regular expressions as input.

Thanks,

Matt

Master Mentor

@Anoop Shet

The following Java regular expression matches names that end in any year from 2017 to 2099:

^.*20([1-9]{1}[7-9]{1}|[2-9]{1}[0-9]{1})$
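
As a quick sanity check, the pattern can be exercised outside NiFi. The sketch below uses Python's `re` module, which treats this particular pattern the same way Java's regex engine does. The helper name `matches_year_suffix` and the sample names are made up for illustration, and the sketch assumes the filter is applied to the whole file name.

```python
import re

# Pattern quoted from the reply above, unchanged.
YEAR_SUFFIX = re.compile(r"^.*20([1-9]{1}[7-9]{1}|[2-9]{1}[0-9]{1})$")


def matches_year_suffix(name: str) -> bool:
    """Return True if the whole name matches the pattern (hypothetical helper)."""
    # Assumption: the filter must match the entire file name, not a substring.
    return YEAR_SUFFIX.fullmatch(name) is not None


if __name__ == "__main__":
    for sample in ["2017", "report_2099", "2016", "filename25052017.csv"]:
        print(sample, matches_year_suffix(sample))
```

Under that assumption, a name such as filename25052017.csv would not match, because it ends in `.csv` rather than in the year; a pattern for such names would also need to account for the extension.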

Thanks,

Matt

Master Mentor

@Anoop Shet

I am confused... The "File Filter Regex" property works the same way in both the GetSFTP and ListSFTP processors and does not support NiFi Expression Language.

The List type processors are used in conjunction with the corresponding Fetch processor to pull data.

The FetchSFTP processor is designed to fetch the content of one file at a time and insert that data into the FlowFile that triggered the Fetch processor to run.

While you can certainly use ListSFTP to get a listing of all files on your SFTP server and then use a RouteOnAttribute processor to pick out only those with the current year in the name, the Java regular expression I provided should work as well.

Here is an article on GetSFTP vs list/fetchSFTP processors:

https://community.hortonworks.com/articles/97773/how-to-retrieve-files-from-a-sftp-server-using-nif....

Thanks,

Matt

Thanks for your prompt reply. I am very new to NiFi and still have some questions. Can you please suggest a processor to use when I want to read a file dynamically from an SFTP location on a daily basis and put it on another SFTP location, based on the date (today filename25052017.csv, tomorrow filename26052017.csv)? Is there any alternative to Expression Language that provides the date?

Master Mentor

@Anoop Shet

The ListSFTP and FetchSFTP processors will be your best choice here. You can still use the Java regular expression I provided to filter only on files ending in the years 2017 to 2099. This is your best option because the ListSFTP processor retains state. When it runs today, it will list all files ending in 2017. It will then record the state of the most recent file listed, in the form of the lastModified timestamp on that file. When the processor runs again, it will only list files with a newer timestamp than what was previously recorded in state management (while still applying the regex).

The ListSFTP processor produces one 0-byte FlowFile for each file listed from the SFTP server. These 0-byte FlowFiles carry numerous FlowFile attributes, some of which are used by default by the FetchSFTP processor to actually retrieve the content from the SFTP server and insert it into that FlowFile.
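
The state handling described above can be pictured with a small toy model. This is only a sketch of the idea (filter by regex, keep what is newer than the stored timestamp, then advance the timestamp); it is not NiFi's implementation, and `RemoteFile`, `ListingState`, and `list_new_files` are hypothetical names invented for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class RemoteFile:
    name: str
    last_modified: int  # epoch milliseconds


class ListingState:
    """Toy stand-in for the processor's stored state (hypothetical)."""

    def __init__(self) -> None:
        self.latest_timestamp = -1  # nothing listed yet


def list_new_files(remote_files, pattern, state):
    """Return names that match the pattern and are newer than the stored timestamp."""
    regex = re.compile(pattern)
    listed = [
        f for f in remote_files
        if regex.fullmatch(f.name) and f.last_modified > state.latest_timestamp
    ]
    if listed:
        # Remember the newest timestamp seen so the next run skips these files.
        state.latest_timestamp = max(f.last_modified for f in listed)
    return [f.name for f in listed]


if __name__ == "__main__":
    state = ListingState()
    files = [RemoteFile("a_2017", 1000), RemoteFile("b_2017", 2000)]
    print(list_new_files(files, r".*2017", state))  # first run lists both files
    print(list_new_files(files, r".*2017", state))  # second run finds nothing newer
```

In this model, a file with an older timestamp than the stored one is skipped even if a changed filter would now match it, until the stored state is reset.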

As far as an alternative to EL goes, does the Java regular expression I provided not work for you? There are plenty of resources on the web for writing and even testing Java regular expressions. Once data is ingested by NiFi as FlowFiles, you can use NiFi's EL to evaluate and route FlowFiles.

Thank you,

Matt

Hi Matt,

Thanks again for your detailed response. It was very informative.

I have one more question. I am able to fetch only the previous day's files. How can I get files that are 4 or 5 days old? My expression looks like this:

${absolute.path}/Bell_LTECells_${now():toNumber():minus(86400000):format('ddMMyyyy')}.csv ( this works without any error)

But if I want to get files that are backdated by 3 to 4 days, I tried this:

${absolute.path}/Bell_LTECells_${now():toNumber():minus(345600000):format('ddMMyyyy')}.csv

and this doesn't seem to work.
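
For reference, the arithmetic in the two expressions can be checked outside NiFi. The sketch below mirrors the subtraction and the `ddMMyyyy` formatting in Python; `backdated_name` is a hypothetical helper, and the fixed date passed in is an arbitrary example rather than anything from the thread.

```python
from datetime import datetime, timedelta

MILLIS_PER_DAY = 24 * 60 * 60 * 1000  # 86400000, the value used for "yesterday"


def backdated_name(now: datetime, millis_back: int) -> str:
    """Build the file name for the date `millis_back` milliseconds before `now`."""
    target = now - timedelta(milliseconds=millis_back)
    # %d%m%Y is the strftime equivalent of the ddMMyyyy format in the expression.
    return f"Bell_LTECells_{target.strftime('%d%m%Y')}.csv"


if __name__ == "__main__":
    now = datetime(2017, 5, 29, 12, 0)  # arbitrary example date
    print(345600000 // MILLIS_PER_DAY)    # number of whole days subtracted
    print(backdated_name(now, 86400000))
    print(backdated_name(now, 345600000))
```

345600000 ms is exactly four days, so the second expression targets the file dated four days back (not three), and the date arithmetic itself is consistent with the one-day version.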

Master Mentor

@Anoop Shet

Sorry for the late response, but I don't get pinged unless you tag me in your response.

The ListSFTP processor retains state on files that have been listed. My guess here is that the state is preventing this new filter from returning anything. Try clearing the state and see if it then lists the files based on your new filter, or add a new ListSFTP processor that uses the different file filter.

You can right click on the processor and select "View state". In the state UI for this processor you will see a link to "Clear state".

If you found that my answer addressed your original question, please mark it as accepted to close out this thread.

Thanks,

Matt

Explorer

@MattWho Hey Matt, I have a related question, since I am very new to NiFi. I have two different sets of files in the same SFTP location, and I need a different regex filter for each set. Is it possible to do this in one ListSFTP processor, or should I use two ListSFTP processors? If it is possible, could you please explain how?
Another question: if I connect two ListSFTP processors, each pointing to a different remote host, to an output port and then to a remote process group, what should I enter in the remote host field of the FetchSFTP processor in the output process group? (That is, two ListSFTP processors with two different source servers are connected to a single FetchSFTP.) Is this possible?