Reset ListSFTP state for entity on FetchSFTP failure
Labels: Apache NiFi
Created on 03-03-2020 02:24 PM, last edited on 03-03-2020 04:02 PM by ask_bill_brooks
I am looking for a feedback mechanism to tell the ListSFTP processor that a transfer has failed, so that it lists the file again. The goal is automatic recovery when network issues prevent successful transfers. I was hoping to use a combination of FetchDistributedMapCache, PutDistributedMapCache, and UpdateAttribute to clear the file.lastModifiedTimestamp attribute for FlowFiles that fail in FetchSFTP, but it seems that the distributed cache is meant only for migration from old NiFi releases.
Is there any flow that would accomplish what we are looking for?
Created ‎03-04-2020 08:08 AM
Could we use an ExecuteProcessor to access the StateManager and remove the state for a particular file that has failed? That is, upon FetchSFTP failure, send a FlowFile to an ExecuteProcessor that reaches into the state and removes the entry for that file?
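To make the idea concrete, here is a minimal sketch in plain Python of the behavior I am asking about. It is a toy model, not NiFi code: `ListingState`, `list_new`, and `forget` are hypothetical names made up for illustration, and it assumes the listing state is kept per file, which may not match how ListSFTP actually stores its state.

```python
from dataclasses import dataclass, field


@dataclass
class ListingState:
    """Toy stand-in for a lister's per-file state (NOT NiFi's StateManager API)."""
    seen: dict = field(default_factory=dict)  # filename -> last-modified timestamp

    def list_new(self, remote_files: dict) -> list:
        """Return files whose timestamp differs from the one recorded in state."""
        emitted = []
        for name, modified in sorted(remote_files.items()):
            if self.seen.get(name) != modified:
                self.seen[name] = modified
                emitted.append(name)
        return emitted

    def forget(self, name: str) -> None:
        """Drop the entry for one file so the next listing emits it again."""
        self.seen.pop(name, None)


remote = {"a.csv": 100, "b.csv": 200}  # hypothetical remote directory contents
state = ListingState()
first = state.list_new(remote)   # both files are new, so both are listed
second = state.list_new(remote)  # nothing changed, so nothing is listed
state.forget("b.csv")            # simulate the feedback after a failed fetch
third = state.list_new(remote)   # only b.csv is listed again
```

The `forget` call is the feedback step: only the failed file is re-listed, and the state for every other file stays intact.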
Created ‎03-04-2020 09:28 AM
If the ListSFTP processor fails during listing, no FlowFiles should have been output and the state should not have been updated. Are you seeing a failure during ListSFTP processor execution?
If you are seeing FlowFiles getting routed to one of the failure relationships from the FetchSFTP processor, you can always loop that connection back to the same FetchSFTP processor so another attempt is made to fetch the content for that FlowFile.
There is currently no way to clear just a single cached entry from the DistributedMapCacheServer controller service. I encourage you to open an Apache NiFi Jira for a new processor that can remove cache entries.
https://issues.apache.org/jira
You could try looking at this example for removing a cache entry via a script:
https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772
Hope this helps,
Matt
Created ‎03-04-2020 10:53 AM
Appreciate the feedback. Just a little more context if it helps...
It's not that ListSFTP fails. Rather, if FetchSFTP fails to fetch what ListSFTP provides, we have no way to tell ListSFTP that the file should be listed again. This may occur due to a network outage during the transfer.
The problem with the retry approach for FetchSFTP is that if the file is actually removed from the slave, we don't want to try it again.
What we're trying to accomplish is something similar to rsync for multiple slaves feeding into a master server. The files need to remain on the slaves and the master should always reflect what is on the slaves. If FetchSFTP fails, then we would have outdated files on the master server.
Created ‎03-04-2020 01:39 PM
The FetchSFTP processor has several relationships.
For your use case of the file really not being there when FetchSFTP tries to fetch the content, the expected outcome is that the FlowFile is routed to the "not.found" relationship, which you should auto-terminate.
If you encounter some sort of communications failure (a network issue during the fetch), the FlowFile should be routed to the "comms.failure" relationship, which should be looped back to the processor to try again.
The FetchSFTP also has a "permission.denied" relationship which you can perhaps handle via dataflow design as well. Perhaps sending an email alert?
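The routing described above can be sketched in plain Python. This is only a model of the decision logic, not NiFi code: the relationship names come from FetchSFTP, while `ACTIONS`, `handle`, the action labels, and the `max_attempts` cap are assumptions added for illustration (a looped-back connection in NiFi does not impose such a cap by itself).

```python
from typing import Callable

# FetchSFTP relationship -> what the dataflow does with the FlowFile.
# The action labels on the right are illustrative, not NiFi settings.
ACTIONS = {
    "success": "continue",         # content fetched, carry on downstream
    "not.found": "terminate",      # file is gone from the source, drop it
    "comms.failure": "retry",      # network problem, loop back and fetch again
    "permission.denied": "alert",  # e.g. send an email notification
}


def handle(fetch: Callable[[], str], max_attempts: int = 3) -> str:
    """Call fetch() until it stops reporting a comms failure or attempts run out."""
    for _ in range(max_attempts):
        action = ACTIONS[fetch()]
        if action != "retry":
            return action
    return "retry-exhausted"


# Simulated fetch: two communication failures, then a successful transfer.
outcomes = iter(["comms.failure", "comms.failure", "success"])
recovered = handle(lambda: next(outcomes))

# Simulated fetch of a file that was removed from the source.
removed = handle(lambda: "not.found")
```

The point of the split is that a removed file is never retried, while a network failure is retried without touching the ListSFTP state.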
Hope this helps,
Matt
