I am using NiFi to read data from FTP and push it to HDFS. The downstream compute tasks are scheduled as workflows in DolphinScheduler. One detail: before the analysis runs, the workflow first executes an ALTER TABLE statement to add the corresponding partition, roughly as in the sketch below.
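This is a minimal sketch of that pre-run step, only to illustrate the setup. The database, table, and partition column names (ods.my_table, dt) are placeholders I made up, not from my actual job.

```python
# Hypothetical pre-task step run by DolphinScheduler before the Spark analysis job.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("register-partition")
    .enableHiveSupport()
    .getOrCreate()
)

dt = "2024-01-01"  # partition the downstream analysis will read

# Register the partition so the analysis job can see it. At this point NiFi may
# still be writing files into the partition directory, which is the race
# described below.
spark.sql(f"""
    ALTER TABLE ods.my_table
    ADD IF NOT EXISTS PARTITION (dt = '{dt}')
""")
```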
I have noticed that the Spark task occasionally fails at runtime with a FileNotFound error. My guess is that, because the partition has already been added, NiFi is still writing data into that partition directory while the Spark task is running, and the files that are still being written cause the error. How can I avoid this race between the NiFi writes and the Spark reads?
