I have a scenario where I have to move a folder with a bunch of files from one server to another. How can I make sure that all files have been moved? The issue is that the files may be made visible only after all of them have been moved to the destination folder.
Your use case raises a lot of questions.
Is this a one-time move of a single folder? If so, NiFi may not be the fastest solution here. NiFi is designed primarily for efficient, continuously running dataflows. NiFi has guaranteed delivery mechanisms and data tracking through built-in provenance. NiFi is also data agnostic. It accomplishes this by wrapping the bits of content in a FlowFile. There will be some overhead associated with these things.
What does a bunch of files mean? Are you talking a few hundred, thousand, or million files?
It is good to start by asking yourself "how would you do this without using NiFi?".
- NiFi could be installed locally on the system where the files exist and use the ListFile and FetchFile processors to consume the files, then push those files to the target system.
- NiFi could reside on the target system and retrieve the files from the source system via the ListSFTP and FetchSFTP processors (just examples).
As far as making sure all files are moved, NiFi list-based processors maintain state on what files have been listed. As long as NiFi has access to all the files in the source folder, it will list all of them. The corresponding fetch processor will then pull the content for each of those files (it has a failure relationship that can be routed for retry in the event a file fails to be fetched).
Do you have a known count of source files?
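If a known count or list of the source files is available, completeness can be checked on the destination before anything is released. The following is only a minimal sketch in plain Python, outside of NiFi, under the assumption that a manifest (relative file name mapped to SHA-256 checksum) is built on the source side; the function names `build_manifest` and `missing_or_changed` are hypothetical and not part of NiFi.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def missing_or_changed(expected: dict[str, str], folder: Path) -> list[str]:
    """List manifest entries that are absent from folder or differ in content."""
    actual = build_manifest(folder)
    return sorted(
        name for name, digest in expected.items() if actual.get(name) != digest
    )
```

An empty list from `missing_or_changed` means every file named in the manifest arrived with identical content; comparing `len(expected)` with the number of files on the destination gives the simpler count-only check.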
What do you mean by visible? Visible to whom? This is where it becomes a little more tricky. Perhaps you could use NiFi to tar or zip all the files together, then move that archive to the new location and, after it has been written successfully, use NiFi to execute a script on the target to unpack it.
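The pack-then-unpack idea could look roughly like the sketch below. It is plain Python rather than a NiFi flow, and it rests on assumptions that are not from this thread: the function names, the `.part` suffix, and the `staging` and `visible` directories are all hypothetical, and `staging` and `visible` are assumed to be on the same filesystem so that the final rename is a single step.

```python
import os
import tarfile
from pathlib import Path


def pack_folder(source: Path, archive: Path) -> None:
    """Write source into a tar file, renaming it into place only when complete."""
    partial = archive.with_name(archive.name + ".part")
    with tarfile.open(partial, "w") as tar:
        tar.add(source, arcname=source.name)
    os.replace(partial, archive)


def unpack_and_publish(
    archive: Path, staging: Path, visible: Path, folder_name: str
) -> Path:
    """Unpack into staging, then rename the finished folder into visible."""
    staging.mkdir(parents=True, exist_ok=True)
    visible.mkdir(parents=True, exist_ok=True)
    # Use the "data" extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r") as tar:
        tar.extractall(staging, **kwargs)
    target = visible / folder_name
    # The folder appears under visible only after every file has been unpacked.
    os.replace(staging / folder_name, target)
    return target
```

Whoever watches `visible` never sees a partially unpacked folder, because all extraction happens in `staging` first.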
If you found this answer addressed your question, please take a moment to log in and click "accept" on the answer.
Thank you for your answer!
We have a digitization process. E.g., right after a book's digitization (OCR) is complete, it has about 100-1000 TIFF files which we need to move to another server.
Each book will have its own folder, and NiFi can start moving these files only when the OCR process has finished. We can probably solve this by moving the folder with its files once the OCR process is complete, to make it visible to NiFi.
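A rough sketch of that hand-off step, run by whatever finishes the OCR job: it assumes the OCR working directory and the directory NiFi lists are on the same filesystem, so the rename happens in one step. The function name `hand_off` and both directory roles are hypothetical.

```python
import os
from pathlib import Path


def hand_off(book_dir: Path, watched_dir: Path) -> Path:
    """Rename a finished book folder into the directory that NiFi lists."""
    watched_dir.mkdir(parents=True, exist_ok=True)
    target = watched_dir / book_dir.name
    if target.exists():
        # Refuse to mix a new book into a folder that was handed off earlier.
        raise FileExistsError(f"{target} already exists")
    os.rename(book_dir, target)
    return target
```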
On the destination server there is archiving software, and the archivist can start archiving only once all files have been copied (made visible to her/him through the program); otherwise the result will be incomplete if he/she starts archiving too early.
Of your proposed solutions, zipping the files might actually work.