Created 01-10-2023 07:42 PM
I cannot figure out which local path is out of space.
I have a 3-node cluster. Each node has a 1 TB boot disk plus two attached 1 TB 7200 rpm disks.
The directory structure on the two attached disks on each of the three nodes is:
/media/sanjay/hdd03/yarn/nm
/media/sanjay/hdd03/dfs/dn
/media/sanjay/hdd03/dfs/snn
/media/sanjay/hdd02/yarn/nm
/media/sanjay/hdd02/dfs/dn
/media/sanjay/hdd02/dfs/snn
df -h /media/sanjay/hdd03
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 917G 228M 870G 1% /media/sanjay/hdd03
df -h /media/sanjay/hdd02
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 917G 1.2G 869G 1% /media/sanjay/hdd02
df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-root 915G 41G 828G 5% /
2023-01-07 22:15:31,486 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Local path for public localization is not found. May be disks failed.
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:400)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:883)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:781)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:723)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
    at java.lang.Thread.run(Thread.java:750)
2023-01-07 22:15:31,486 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1673158307170_0001_01_000001
2023-01-07 22:15:31,488 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed for container_1673158307170_0001_01_000001
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:400)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:152)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:133)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:117)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:584)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1205)
2023-01-07 22:15:31,488 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1673158307170_0001_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2023-01-07 22:15:31,489 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1673158307170_0001_01_000001 sent RELEASE event on a resource request { hdfs://hp8300one:8020/user/yarn/mapreduce/mr-framework/3.0.0-cdh6.3.4-mr-framework.tar.gz, 1672446065301, ARCHIVE, null } not present in cache.
2023-01-07 22:15:31,489 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=sanjay OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1673158307170_0001 CONTAINERID=container_1673158307170_0001_01_000001
2023-01-07 22:15:31,490 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Execution exception when running task in DeletionService #2
2023-01-07 22:15:31,490 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Caught exception in thread DeletionService #2:
java.lang.NullPointerException: path cannot be null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:270)
    at org.apache.hadoop.fs.FileContext.delete(FileContext.java:768)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.FileDeletionTask.run(FileDeletionTask.java:109)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Created 01-14-2023 04:58 PM
The issue described here is pretty much the same as what I am facing:
https://stackoverflow.com/questions/71684661/no-disk-space-allocated-to-hdfs-filesystem
Created 01-30-2023 01:21 AM
May I know what values you have set for the properties below?
yarn.nodemanager.local-dirs
yarn.nodemanager.log-dirs
Also, please make sure the noexec and nosuid flags are not set on the disks that hold those directories. You can check this with the "mount" command.
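If it helps, the checks above can be sketched as a small Python script to run on each node. This is only an illustration, not an official tool: the two yarn/nm paths are taken from the directory listing in the question and are assumed (not confirmed) to be the values of yarn.nodemanager.local-dirs, and the helper names (parse_mounts, mount_for, risky_flags, describe) are made up for this example. It reads /proc/mounts, which holds the same information the "mount" command prints on Linux.

```python
import os
import shutil

# Mount options mentioned in the reply above as potential trouble.
RISKY = {"noexec", "nosuid"}


def parse_mounts(text):
    """Parse /proc/mounts-style text into {mount_point: set_of_options}."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            mounts[fields[1]] = set(fields[3].split(","))
    return mounts


def mount_for(path, mounts):
    """Return the longest mount point containing path, or None."""
    path = os.path.abspath(path)
    best = None
    for mp in mounts:
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best


def risky_flags(path, mounts):
    """Return the sorted noexec/nosuid options set on path's mount."""
    mp = mount_for(path, mounts)
    return sorted(mounts.get(mp, set()) & RISKY)


def describe(path, mounts):
    """Summarise existence, usage, writability and mount flags for path."""
    if not os.path.isdir(path):
        return f"{path}: missing or not a directory"
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    # Checks access for the user running the script, not for the yarn user.
    writable = os.access(path, os.W_OK | os.X_OK)
    flags = ",".join(risky_flags(path, mounts)) or "none"
    return (f"{path}: {used_pct:.1f}% used, {usage.free // 2**20} MiB free, "
            f"writable={writable}, flags={flags}")


if __name__ == "__main__":
    # Assumption: replace with the actual yarn.nodemanager.local-dirs
    # and yarn.nodemanager.log-dirs values from your configuration.
    local_dirs = [
        "/media/sanjay/hdd02/yarn/nm",
        "/media/sanjay/hdd03/yarn/nm",
    ]
    mounts_text = ""
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            mounts_text = f.read()
    mounts = parse_mounts(mounts_text)
    for d in local_dirs:
        print(describe(d, mounts))
```

Run it as the same user the NodeManager runs as, since the writability check applies to whoever executes the script. A directory reported as missing, not writable, or mounted with one of the flags is worth looking at before the free-space numbers.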
Created 02-02-2023 12:08 AM
@sanjaysubs, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur,