Map Reduce Job fails (exited with exitCode: -1000)

Explorer

I cannot figure out which local path is out of space.

I have a 3-node cluster. Each node has a 1 TB boot disk plus 2 x 1 TB attached 7200 rpm disks.

The directory structure on each of the two disks on each of the three nodes is:
/media/sanjay/hdd03/yarn/nm
/media/sanjay/hdd03/dfs/dn
/media/sanjay/hdd03/dfs/snn
/media/sanjay/hdd02/yarn/nm
/media/sanjay/hdd02/dfs/dn
/media/sanjay/hdd02/dfs/snn

df -h /media/sanjay/hdd03
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 917G 228M 870G 1% /media/sanjay/hdd03

df -h /media/sanjay/hdd02
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 917G 1.2G 869G 1% /media/sanjay/hdd02

df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-root 915G 41G 828G 5% /
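
For reference, here is a quick per-node check of the free space on each of the paths above (a minimal sketch, assuming the directory layout listed earlier; adjust the paths per node):

# Report free space for each YARN/HDFS local directory on this node
for d in /media/sanjay/hdd03/yarn/nm /media/sanjay/hdd03/dfs/dn /media/sanjay/hdd03/dfs/snn \
         /media/sanjay/hdd02/yarn/nm /media/sanjay/hdd02/dfs/dn /media/sanjay/hdd02/dfs/snn /tmp; do
  df -h "$d"
done

Every path shows hundreds of gigabytes free, yet the NodeManager log below still reports "No space available in any of the local directories."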


 

2023-01-07 22:15:31,486 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Local path for public localization is not found.  May be disks failed.
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:400)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
	at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:589)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:883)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:781)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:723)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
	at java.lang.Thread.run(Thread.java:750)
2023-01-07 22:15:31,486 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1673158307170_0001_01_000001
2023-01-07 22:15:31,488 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed for container_1673158307170_0001_01_000001
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:400)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:133)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:117)
	at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:584)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1205)
2023-01-07 22:15:31,488 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1673158307170_0001_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2023-01-07 22:15:31,489 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1673158307170_0001_01_000001 sent RELEASE event on a resource request { hdfs://hp8300one:8020/user/yarn/mapreduce/mr-framework/3.0.0-cdh6.3.4-mr-framework.tar.gz, 1672446065301, ARCHIVE, null } not present in cache.
2023-01-07 22:15:31,489 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=sanjay	OPERATION=Container Finished - Failed	TARGET=ContainerImpl	RESULT=FAILURE	DESCRIPTION=Container failed with state: LOCALIZATION_FAILED	APPID=application_1673158307170_0001	CONTAINERID=container_1673158307170_0001_01_000001
2023-01-07 22:15:31,490 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Execution exception when running task in DeletionService #2
2023-01-07 22:15:31,490 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Caught exception in thread DeletionService #2: 
java.lang.NullPointerException: path cannot be null
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
	at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:270)
	at org.apache.hadoop.fs.FileContext.delete(FileContext.java:768)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.FileDeletionTask.run(FileDeletionTask.java:109)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
3 Replies

Explorer

The issue described here is pretty much the same as what I am facing:

https://stackoverflow.com/questions/71684661/no-disk-space-allocated-to-hdfs-filesystem

Expert Contributor

May I know what values you have set for the following properties:

yarn.nodemanager.local-dirs

yarn.nodemanager.log-dirs

 

Also, please make sure you don't have the noexec or nosuid flags set on the corresponding disks. You can check this with the "mount" command, as in the sketch below.
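For example, something along these lines on each node (a rough sketch; the /etc/hadoop/conf/yarn-site.xml path is an assumption based on a typical CDH layout and may differ on your cluster):

# Show the configured NodeManager local and log directories
grep -A1 -E "yarn.nodemanager.(local|log)-dirs" /etc/hadoop/conf/yarn-site.xml

# Check the mount flags on the data disks for noexec/nosuid
mount | grep -E "hdd02|hdd03"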

Community Manager

@sanjaysubs, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution; that will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager

