I'm trying to load a large dataset, about 225 GB (~175,000 files), from an SFTP server and copy it to HDFS.
To implement this, we've used two processors:
1. GetSFTP (To get the files from SFTP server)
Configured Processor -> Search Recursively = true; Use Natural Ordering = true; Remote Poll Batch Size = 20000; Concurrent Tasks = 3
2. PutHDFS (To push the data to HDFS)
Configured Processor -> Concurrent Tasks = 3; Conflict Resolution Strategy = replace; Hadoop Configuration Resources; Directory
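For reference, the settings above can be summarized as a properties-style listing (property names as they appear in the NiFi UI; `Concurrent Tasks` is set on each processor's Scheduling tab):

```
GetSFTP
  Search Recursively      = true
  Use Natural Ordering    = true
  Remote Poll Batch Size  = 20000
  Concurrent Tasks        = 3

PutHDFS
  Conflict Resolution Strategy  = replace
  Hadoop Configuration Resources = <core-site.xml, hdfs-site.xml>
  Directory                      = <target HDFS directory>
  Concurrent Tasks               = 3
```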
But after some time the copy stalls and the directory size in HDFS stops growing. I can't figure out what I'm doing wrong.
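To check whether the transfer has really stalled (rather than the UI lagging), I compare the target directory's size and file count over time with the standard HDFS shell (the path below is a placeholder for our actual target directory):

```shell
# Total size of the target directory, human-readable
hdfs dfs -du -s -h /data/sftp_ingest

# Directory count, file count, and content size
hdfs dfs -count /data/sftp_ingest
```

If these numbers stop changing between runs while the NiFi queues still show flowfiles, the flow itself is stuck rather than just slow to report.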