YARN NodeManager fails to start: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/: Is a directory

Hello,

Here is my problem: I have 7 NodeManagers running on my Hortonworks cluster. Three of them sometimes go down (not simultaneously), and I don't know why. When I check the logs, I see the following message:

FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(549)) - Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/: Is a directory
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/: Is a directory
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:966)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:953)
at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:200)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 5 more

The workaround I have found so far is to delete the file so that YARN can restart, and this does work: the NodeManager restarts fine. My question is: what can I do to prevent the error from happening again? It is really annoying to have to do this a couple of times per day.
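
To see what is actually sitting in that path before removing anything, something like the following could be run on an affected node. It is only a sketch: the helper name `describe_entries` is made up for illustration, and the path is copied from the error message above, so it may need adjusting if the recovery directory is configured differently.

```python
import os
import stat
import time


def describe_entries(path):
    """Return (name, kind, size_bytes, mtime) for each entry directly under path."""
    rows = []
    for name in sorted(os.listdir(path)):
        # lstat so that symlinks are reported as symlinks, not as their targets
        st = os.lstat(os.path.join(path, name))
        if stat.S_ISLNK(st.st_mode):
            kind = "symlink"
        elif stat.S_ISDIR(st.st_mode):
            kind = "dir"
        elif stat.S_ISREG(st.st_mode):
            kind = "file"
        else:
            kind = "other"
        rows.append((name, kind, st.st_size, st.st_mtime))
    return rows


if __name__ == "__main__":
    # Assumption: path taken from the error message, not from any config file.
    state_dir = "/var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state"
    if not os.path.isdir(state_dir):
        print(f"{state_dir} is missing or not a directory")
    else:
        for name, kind, size, mtime in describe_entries(state_dir):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
            print(f"{kind:8} {size:>10} {stamp} {name}")
```

Comparing this listing between a healthy node and a failing one (entry types, sizes, modification times) might help narrow down what changes just before the failure.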

Thank you in advance for your help!