Thank you for the reply, @Kezia. I was able to filter out the duplicates using the DetectDuplicate processor.
This is the error I'm getting when the GetSFTP processor is scheduled to run on the primary node:
GetFTP[id=xxxx] Unable to fetch listing from remote server due to java.net.ConnectException: Connection timed out (Connection timed out): Connection timed out (Connection timed out)