Created on 10-23-2017 11:34 PM - last edited on 10-02-2019 10:35 AM by ask_bill_brooks
I would like to use NiFi to retrieve files from an external SFTP server, store them on local disk with RAID 10, and retrieve them whenever I need them. Is this possible? In other words, I want to replace SAN or Isilon-type storage and use NiFi as both a processing engine and a storage engine.
Created 10-24-2017 04:37 AM
NiFi was designed to move data from one place to another, not to store it. NiFi stores data in its content repository temporarily during processing, but it has routines that delete flow file content automatically. Data is deleted either right after the flow ends or after an archive retention period, which is 12 hours by default. This article explains the archiving process: https://community.hortonworks.com/articles/82308/understanding-how-nifis-content-repository-archivi....
As you can see, NiFi is designed to delete data that is no longer in use. The idea is that by that point NiFi has already moved the data to a storage location.
You should use a storage solution for storing data, not NiFi. For instance, why don't you use your FTP server for this?
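For reference, the archive retention mentioned above is set in nifi.properties. The fragment below is a sketch of the relevant entries, not a recommendation. The 12 hours value is the default cited in this thread; the other two values are what I understand the defaults to be for NiFi releases of that period, so check your own nifi.properties before relying on them.

```properties
# Content repository archiving (conf/nifi.properties)
# Keep content that is no longer referenced by any flow file in an archive
nifi.content.repository.archive.enabled=true
# Archived content older than this becomes eligible for deletion
nifi.content.repository.archive.max.retention.period=12 hours
# Assumed default: archived content is also purged once the content
# repository disk usage passes this threshold
nifi.content.repository.archive.max.usage.percentage=50%
```

Raising these values only keeps archived content around longer for replay and provenance. It does not turn the content repository into storage you can browse or retrieve files from by name.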
Created 10-02-2019 07:19 AM
Maybe you are looking for an SSoT (Single Source of Truth). Kafka may be the best option for implementing this concept. The link below may help you:
https://www.confluent.io/blog/messaging-single-source-truth/
Created 10-02-2019 07:52 AM
It is absolutely possible to do this. However, some things need to be considered:
In past projects I have used the primary node, with a separate partition, to store files locally on the NiFi primary node. These files are then used outside of NiFi for other purposes. In some projects these files are picked up in NiFi by separate flows and then redistributed into the cluster for processing across all nodes. The primary use case here was auditing: Team 1 writes received files directly to disk, and some time later Team 2 accesses the files for processing. In this example Team 1 and Team 2 are completely separate, with security-group-based access to NiFi (they cannot see each other's flows).