New Contributor
Posts: 5
Registered: ‎10-10-2017

DistCP Failures


I have some replication set up to copy the output of a daily process to another cluster. It uses:

hadoop distcp -update -delete $SOURCE $TARGET

However, it occasionally fails (after the map phase reaches 100%!) with this error:

19/01/10 16:13:44 INFO mapreduce.Job: map 100% reduce 0%
19/01/10 16:15:51 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
19/01/10 16:15:51 INFO mapreduce.Job: map 0% reduce NaN%
19/01/10 16:15:51 INFO mapreduce.Job: Job job_1546553376389_1538 failed with state FAILED due to:
19/01/10 16:15:51 ERROR tools.DistCp: Exception encountered
java.io.IOException: DistCp failure: Job job_1546553376389_1538 has failed:
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:195)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:493)

I'm looking for some advice on how to investigate this problem, as I'm not completely sure where to start. Has anyone encountered something similar? What logs might have useful information for failed tasks like this?

Cloudera Employee
Posts: 47
Registered: ‎08-16-2016

Re: DistCP Failures

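Without more context it is hard to say, but a good place to start is the YARN ResourceManager web UI: find the failed application corresponding to the distcp job and drill into it to find the failed task. DistCp is a map-only job, so look at the map task attempts rather than a reduce task, and since the failure came after the maps reached 100%, it is also worth checking the ApplicationMaster log. The logs there should tell you more.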
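
If the web UI is awkward to reach, the same logs can usually be pulled from the command line. The following is only a rough sketch, assuming log aggregation is enabled on the cluster, the yarn CLI is on the PATH, and you run it as the user who submitted the job. The helper name job_to_app_id and the output file name are made up for illustration; the job ID is the one from the error output in the question.

```shell
#!/bin/sh
# Job ID copied from the DistCp client output in the question.
JOB_ID="job_1546553376389_1538"

# Hypothetical helper: a MapReduce job ID and its YARN application ID
# share the same timestamp and sequence number; only the prefix differs.
job_to_app_id() {
  printf 'application_%s\n' "${1#job_}"
}

APP_ID="$(job_to_app_id "$JOB_ID")"
echo "$APP_ID"

# Only try to fetch logs where the yarn CLI is actually installed.
if command -v yarn >/dev/null 2>&1; then
  # Dump the aggregated container logs (ApplicationMaster and map tasks)
  # for the finished application into a local file.
  yarn logs -applicationId "$APP_ID" > "${APP_ID}.log"

  # Show the first lines that look like errors, with their line numbers.
  grep -n -E 'ERROR|FATAL|Exception' "${APP_ID}.log" | head -n 50
fi
```

The grep pattern is just a starting point. Once you have a line number, open the file there and read the surrounding stack trace, since the first exception in the log is often more informative than the final "DistCp failure" message on the client.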