Created 07-12-2018 08:04 AM
I am not able to restart the NodeManager. Please find the errors from the log below:
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/000002.dbtmp: No space left on device
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:973)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:953)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:200)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2018-07-12 09:52:18,186 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
However, there is space on the device. I moved yarn-nm-state to /tmp and restarted the NodeManagers. Two of them started, but one NodeManager is still not starting.
Created 07-12-2018 08:07 AM
The error suggests that your host is running out of disk space.
IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/000002.dbtmp: No space left on device
So you will need to make sure that the partition holding this path has enough free disk space.
Can you please share the output of the following commands?
# df -h
# du -csh /var
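It may also be worth checking inode usage on the same filesystem, since "No space left on device" can be reported when a filesystem has run out of inodes even though df -h still shows free blocks. That is only one possibility and is not confirmed for this host. A minimal sketch, where check_fs is a hypothetical helper name (not a Hadoop or Ambari command) and the default path is the one taken from the error message:

```shell
#!/bin/sh
# check_fs is a hypothetical helper, not part of Hadoop or Ambari.
# It reports block and inode usage for the filesystem that holds a given path.
check_fs() {
    target="$1"
    # Walk up to the nearest existing directory, in case the path
    # has been moved or deleted (for example, yarn-nm-state moved to /tmp).
    while [ ! -e "$target" ] && [ "$target" != "/" ]; do
        target=$(dirname "$target")
    done
    echo "Checking filesystem for: $target"
    df -hP "$target"   # block usage (Size / Used / Avail)
    df -iP "$target"   # inode usage; IUse% near 100% can also produce "No space left on device"
}

# Default to the directory named in the NodeManager error if no argument is given.
check_fs "${1:-/var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state}"
```

Run it on the node where the NodeManager fails to start. The -i option is available in GNU and BSD df; some minimal environments may not support it.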
Created 07-12-2018 12:07 PM
/dev/mapper/vg1-logvol 60G 37G 20G 66% /var/log
Created 07-12-2018 12:15 PM
If you are seeing "No space left on device" in a cloud environment (OpenStack, etc.) or on a VM, I suggest rebooting the host VM once.
Sometimes, even when the VM appears to have enough space (based on the df -h output), the VM still does not allocate that space to the applications.
Please try rebooting the host; after that you might no longer see the error.