01-25-2017 07:32 PM
When I run DistCp to copy data between clusters, almost all of the mappers finish within minutes to an hour, but the last one takes more than 40 hours.
The listing includes files that were already copied to the other cluster as well as new ones that still need to be copied.
File sizes vary: some files are in the GB range, others are KB to MB.
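One possible explanation for this pattern can be sketched in a few lines of Python. The sketch assumes that the copy listing is cut into contiguous chunks of roughly equal listed bytes, and that files already present on the target are skipped only once a mapper reaches them. That is a simplification of DistCp's default uniform-size splitting, not its actual code, and the file sizes and the `already_on_target` flag below are made-up illustration data.

```python
from typing import List, Tuple

# (size, already_on_target): hypothetical listing entries, sizes in arbitrary units
FileEntry = Tuple[int, bool]


def split_by_listed_bytes(listing: List[FileEntry], num_maps: int) -> List[List[FileEntry]]:
    """Cut the listing, in order, into contiguous chunks of roughly equal listed bytes."""
    total = sum(size for size, _ in listing)
    target = total / num_maps
    chunks: List[List[FileEntry]] = [[]]
    acc = 0
    for entry in listing:
        # Start a new chunk once the current one has reached its share.
        if acc >= target and len(chunks) < num_maps:
            chunks.append([])
            acc = 0
        chunks[-1].append(entry)
        acc += entry[0]
    return chunks


def bytes_to_copy(chunk: List[FileEntry]) -> int:
    """Bytes a mapper really has to transfer: files already on the target are skipped."""
    return sum(size for size, copied in chunk if not copied)


# Six files that are already on the target, followed by two new ones.
listing = [(10, True)] * 6 + [(10, False)] * 2
work_per_mapper = [bytes_to_copy(c) for c in split_by_listed_bytes(listing, 4)]
print(work_per_mapper)
```

In this toy listing every mapper is handed the same number of listed bytes, yet only the last one has anything to transfer. If that matches what is happening here, DistCp's `-strategy dynamic` option, which lets faster mappers pick up more of the listing, may be worth trying.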
01-25-2017 08:07 PM
02-03-2017 12:23 PM
I'm trying to do the first DistCp run by creating a snapshot S0 on the source and copying S0 to the backup cluster with DistCp. Since the folder contains more than 3,000,000 files and 70 TB, the log of the running DistCp job is flooding the local file system of the application master. Is there a way to solve this? As a workaround, I'm thinking of running DistCp on the subfolders separately, then creating the S0 snapshot on the source and running DistCp on it. Any other smart ideas?
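The subfolder workaround described above could be scripted roughly as follows. This is only a sketch: the cluster URIs, the subfolder names, and the helper `build_distcp_commands` are hypothetical, and the only DistCp options used are `-update` and `-m`. It builds one command per subfolder without running anything, so the commands can be reviewed first.

```python
from typing import List


def build_distcp_commands(src_root: str, dst_root: str,
                          subfolders: List[str], num_maps: int = 20) -> List[List[str]]:
    """Build one `hadoop distcp` command per subfolder (nothing is executed here)."""
    commands = []
    for name in subfolders:
        src = f"{src_root.rstrip('/')}/{name}"
        dst = f"{dst_root.rstrip('/')}/{name}"
        # -update skips files that already exist unchanged on the target;
        # -m caps the number of simultaneous map tasks for this job.
        commands.append(["hadoop", "distcp", "-update", "-m", str(num_maps), src, dst])
    return commands


# Hypothetical cluster URIs and subfolder names, for illustration only.
for cmd in build_distcp_commands("hdfs://source-nn/data", "hdfs://backup-nn/data",
                                 ["2015", "2016", "2017"]):
    print(" ".join(cmd))
```

Each resulting list can be passed to `subprocess.run(cmd, check=True)` on a host with the Hadoop client installed. Running the jobs one after another keeps each job's listing, and so its log, smaller than a single 3,000,000-file run.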