02-06-2017 12:56 PM
I'm trying to do the first DistCp run by creating snapshot S0 on the source cluster and DistCp'ing S0 to the backup cluster. But since the copied folder contains more than 3,000,000 files and 70 TB, the running DistCp job's log is flooding the application master's local file system. Is there a way to solve this? As a workaround I'm thinking of DistCp'ing each subfolder separately, then creating the S0 snapshot on the source and DistCp'ing it. Any other smart ideas?
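For reference, the per-subfolder workaround could be sketched roughly like this (the `/data` path and cluster addresses are placeholders, not from the thread; `hdfs dfs -ls -C` prints paths only):

```shell
# Sketch: copy each top-level subfolder of the snapshot separately,
# so each DistCp job logs far fewer files.
for dir in $(hdfs dfs -ls -C /data/.snapshot/S0); do
  hadoop distcp "hdfs://source-cluster${dir}" \
                "hdfs://backup-cluster/data/$(basename "$dir")"
done
```

Note this loses the atomicity of a single snapshot-based copy, so the final S0-to-S0 DistCp mentioned above would still be needed to reconcile the target.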
02-12-2017 01:18 AM
So is there no way to save the log to HDFS instead of the application master's local file system?
Can you think of any other workaround for this?
02-24-2017 01:06 AM - edited 02-24-2017 01:07 AM
@Harsh J, if I understand correctly, setting the root logger to ERROR or FATAL will likely produce fewer local logs when performing a Hadoop DistCp, assuming everything goes nice and smooth.
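A rough sketch of how those levels could be applied to a DistCp invocation, assuming a log4j-based Hadoop client; the source and target paths are placeholders. `mapreduce.map.log.level` and `yarn.app.mapreduce.am.log.level` are standard MapReduce/YARN properties controlling the task and ApplicationMaster log levels:

```shell
# Quiet the client-side console logging
export HADOOP_ROOT_LOGGER="ERROR,console"

# Lower the log level of the MR ApplicationMaster and the map tasks,
# which is where the per-file copy logging accumulates
hadoop distcp \
  -Dyarn.app.mapreduce.am.log.level=ERROR \
  -Dmapreduce.map.log.level=ERROR \
  hdfs://source-cluster/data/.snapshot/S0 \
  hdfs://backup-cluster/data
```

With everything at ERROR, only failures get logged, which should keep the AM's local log from growing with one line per copied file.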