I have a GetSFTP processor running on a 3-node cluster. GetSFTP was previously scheduled to run on the primary node only, but because that node was not working properly I changed the scheduling strategy to "All nodes". As a result, I am now receiving duplicate files.
Could you please let me know how to filter these duplicates so that only one copy of each file is loaded into HDFS? In other words, of the two identical files, only one should be put into the data lake.