Support Questions

ahnfelt · 10-13-2015

Using CDH 5.4.7-1.cdh5.4.7.p0.3, when I run multiple MapReduce jobs one after another, one of the jobs eventually fails with this stack trace:

2015-10-13 14:22:28,187 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://hdfs-nameservice/user/hdfs/.staging/job_1444734646472_0003/libjars/htrace-core-3.1.0-incubating.jar, 1444738926587, FILE, null } failed: Rename cannot overwrite non empty destination directory /yarn/nm/usercache/hdfs/filecache/945
java.io.IOException: Rename cannot overwrite non empty destination directory /yarn/nm/usercache/hdfs/filecache/945
 at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
 at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
 at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
 at org.apache.hadoop.fs.FileContext.rename(FileContext.java:909)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 
2015-10-13 14:22:28,188 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://hdfs-nameservice/user/hdfs/.staging/job_1444734646472_0003/libjars/htrace-core-3.1.0-incubating.jar(->/yarn/nm/usercache/hdfs/filecache/945/htrace-core-3.1.0-incubating.jar) transitioned from DOWNLOADING to FAILED
2015-10-13 14:22:28,188 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e10_1444734646472_0003_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-10-13 14:22:28,188 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_e10_1444734646472_0003_01_000001 sent RELEASE event on a resource request { hdfs://hdfs-nameservice/user/hdfs/.staging/job_1444734646472_0003/libjars/htrace-core-3.1.0-incubating.jar, 1444738926587, FILE, null } not present in cache.
2015-10-13 14:22:28,188 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Unknown localizer with localizerId container_e10_1444734646472_0003_01_000001 is sending heartbeat. Ordering it to DIE

The number in the path varies, but restarting the failed job does not get rid of the error.

I have tried setting yarn.nodemanager.localizer.cache.target-size-mb to 0, restarting YARN, and waiting until after the cache cleanup, but it doesn't help.
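For reference, the setting looks roughly like this. This is only a sketch, assuming the property is placed directly in yarn-site.xml on each NodeManager host; on a cluster administered through a management tool it would be applied through that tool instead, and the NodeManagers need a restart either way:

```xml
<!-- yarn-site.xml (sketch; file location and deployment method are assumptions) -->
<property>
  <!-- Target size of the NodeManager's localized-resource cache, in MB -->
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>0</value>
</property>
```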

The directory /yarn/nm/usercache/hdfs/filecache/812 does not seem to exist either before or after running the job.

Has anybody experienced this, or have an explanation as to why it happens?

ahnfelt · 10-19-2015

The error was first encountered on an older version of CDH, and it disappeared once we also updated the client to the same version as the cluster.

cjervis · 10-19-2015

Congratulations on solving your issue. Feel free to mark your previous response as the solution to the issue in case it can help others in the future.

Cy Jervis, Manager, Community Program

Rename cannot overwrite non empty destination directory /yarn/nm/usercache/hdfs/filecache/812