DistCP Failures
I have some replication set up to copy the output of a daily process to another cluster. It uses:
hadoop distcp -update -delete $SOURCE $TARGET
However, it occasionally fails (after the map phase reaches 100%!) with this error:
19/01/10 16:13:44 INFO mapreduce.Job: map 100% reduce 0%
19/01/10 16:15:51 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
19/01/10 16:15:51 INFO mapreduce.Job: map 0% reduce NaN%
19/01/10 16:15:51 INFO mapreduce.Job: Job job_1546553376389_1538 failed with state FAILED due to:
19/01/10 16:15:51 ERROR tools.DistCp: Exception encountered
java.io.IOException: DistCp failure: Job job_1546553376389_1538 has failed:
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:195)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:493)
I'm looking for some advice on how to investigate this problem, as I'm not completely sure where to start. Has anyone encountered something similar? What logs might have useful information for failed tasks like this?
Created ‎01-10-2019 06:30 PM
Without more context, the best starting point is the YARN ResourceManager web UI: find the failed application corresponding to the DistCp job and drill into it to find the failed task attempts. DistCp runs as a map-only job, so look at the map tasks rather than a reduce task. The task logs there should tell you more.
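If the web UI is awkward to reach, the same logs can usually be pulled from the command line. The following is only a rough sketch: it assumes the yarn CLI is on the PATH of a cluster node, that log aggregation is enabled, and that the application has finished. The helper name job_to_app_id and the output file name are made up for illustration; the job ID is the one from the error message above.

```shell
# Hypothetical helper: a MapReduce job ID and its YARN application ID
# share the same numeric suffix, so only the prefix needs to change.
job_to_app_id() {
  printf 'application_%s\n' "${1#job_}"
}

# Job ID taken from the DistCp error message in the question.
job_id="job_1546553376389_1538"
app_id="$(job_to_app_id "$job_id")"

# Fetch the aggregated container logs, but only if the yarn CLI exists here.
# Assumes log aggregation is enabled; otherwise the logs stay on the
# NodeManager hosts and this command returns nothing useful.
if command -v yarn >/dev/null 2>&1; then
  yarn logs -applicationId "$app_id" > "${app_id}.log"
  # Show the first lines that mention an exception or error.
  grep -n -i -E 'exception|error' "${app_id}.log" | head -n 40
fi
```

Searching the saved file for the first exception in a failed map attempt is normally more informative than the client-side stack trace, which only reports that the job failed.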
