Member since
06-05-2019
07-04-2019
08:38 PM
We reproduced the problem. The issue is not with deleting files but with listing them. We use many folders, each watched by its own ListFile processor, and some files are never picked up. When we reset the state of each ListFile processor, all files on the shares (untouched) are identified again, but after a while the processors once more miss new files on the share. As a workaround, we scripted clearing the state of all processors every five minutes. As long as the pipeline is fast enough, this does not create backpressure from thousands of files. Next, we plan to try entity tracking with Redis cache integration for the ListFile processors so that we can remove this hack.
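To illustrate the behaviour described above, here is a minimal Python sketch. It assumes (this is not confirmed for our setup) that the processors track only the newest modification time they have seen, and it compares that with tracking each file individually. The function names and the sample share contents are hypothetical and are not part of NiFi.

```python
def list_by_timestamp(files, last_seen):
    """Return names with mtime newer than last_seen, plus the new high-water mark."""
    new = sorted(name for name, mtime in files.items() if mtime > last_seen)
    newest = max([last_seen] + [files[name] for name in new])
    return new, newest


def list_by_entity(files, seen):
    """Return names whose (name, mtime) pair has not been listed before."""
    new = sorted(name for name, mtime in files.items() if (name, mtime) not in seen)
    seen.update((name, files[name]) for name in new)
    return new


# Hypothetical share: file name -> modification time.
share = {"a.csv": 100, "b.csv": 200}

# First pass: both strategies list both files.
first, mark = list_by_timestamp(share, last_seen=0)
seen = set()
list_by_entity(share, seen)

# A file shows up later but carries an older modification time.
share["late.csv"] = 150

missed, mark = list_by_timestamp(share, mark)   # timestamp tracking skips it
found = list_by_entity(share, seen)             # entity tracking lists it

# Resetting the state corresponds to starting again from last_seen=0.
after_reset, _ = list_by_timestamp(share, last_seen=0)
```

In this sketch the late file is skipped by the timestamp-based listing and only shows up again after a reset, which matches what we observe. Whether that is the actual cause on our shares is still an assumption.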
06-05-2019
12:17 PM
I am using Apache NiFi to process a large number of CSV files. During processing, I determine whether each file is valid. In both cases I want to delete the file, either because it is not needed or because processing has finished. For this I use the ExecuteStreamCommand processor with the following configuration:

Command Arguments|-f;${absolute.path}${filename}
Command Path|/bin/rm
Ignore STDIN|true
Working directory|not set
Argument delimiter|;
Output destination attribute|not set
Max attribute length|256

The process does work and deletes files wherever this processor is integrated. But in the real system, with 1500 files per hour, only approx. 30% of the files get deleted. This fills up the file share, and the system stops working because no further data can arrive. The odd thing is that I don't get any exception in the logs. Does anybody know why this is not working properly?
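One thing that may explain the missing exceptions: with -f, rm does not print a diagnostic or change its exit status when the target file does not exist, so a path that resolves incorrectly would fail silently. Below is a minimal Python sketch of a delete step that reports a missing file instead of hiding it. The function name `delete_file` and the temporary directory are hypothetical and used only for illustration. This is not how ExecuteStreamCommand works internally.

```python
import os
import tempfile


def delete_file(directory, filename):
    """Delete directory/filename and report whether the file actually existed."""
    path = os.path.join(directory, filename)
    try:
        os.remove(path)
    except FileNotFoundError:
        # rm -f would stay silent here; this sketch makes the case visible.
        return "missing: " + path
    # Other errors, such as PermissionError, are deliberately left to propagate.
    return "deleted: " + path


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data.csv"), "w") as handle:
        handle.write("id,value\n")
    first = delete_file(tmp, "data.csv")    # the file exists and is removed
    second = delete_file(tmp, "data.csv")   # the file is already gone
```

Logging the "missing" case for the files that stay behind would show whether the path built from the attributes points at the file that is still on the share.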
Labels: Apache NiFi