NodeManager fails to start - IO error: lock

Expert Contributor

One of our clients has asked us to move the log directory prefix for all service logs to a different mount point. For example, the log prefix for HDFS was moved from /var/log/hadoop to /hdp/logs/hadoop via API calls.

Everything restarted smoothly, but only one of the 5 NodeManagers (NM) comes up, and a manual restart only works on the first NM.

All the other NMs throw the same error, shown below:

STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r ef0582ca14b8177a3cbb6376807545272677d730; compiled by 'jenkins' on 2015-12-16T03:01Z
STARTUP_MSG: java = 1.7.0_67
************************************************************/
2016-01-26 15:01:25,155 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - registered UNIX signal handlers for [TERM, HUP, INT]
2016-01-26 15:01:26,283 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:initStorage(927)) - Using state database at /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery
2016-01-26 15:01:26,313 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService failed in state INITED; cause: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
2016-01-26 15:01:26,316 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2016-01-26 15:01:26,317 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(540)) - Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2016-01-26 15:01:26,319 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at bvluxhdpdn05.conocophillips.net/158.139.121.115
************************************************************/


As we can see, it is not complaining that the LOCK file is not present but that it is unavailable, because whichever NM starts first acquires this LOCK (remember that this is a single mount point shared by all nodes, not a local file system).

If I change the log location back to a local file system, even /tmp/yarnlogs for example, it works smoothly, since each NM gets access to its own LOCK file on the local file system of the node where it is installed.

Has anyone faced this issue, and can you please suggest a fix?

Thanks, Mayank

1 ACCEPTED SOLUTION

Guru

Hi @mkataria ,

Did I understand that correctly: do all the NodeManagers have some kind of network storage mounted, and do they all want to write to /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state?

This won't work, since every NM wants to keep its own state in the directory ..../yarn-nm-state, so only one NM can create and hold the LOCK file there (alongside the files keeping the state).
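To illustrate why the second NM is refused, here is a minimal sketch in Python. It assumes the LOCK file is protected by an exclusive, non-blocking POSIX fcntl lock, which is how LevelDB guards its database directory on Unix-like systems. The helper names (try_lock, second_process_result) and the temporary path are hypothetical, and the demo runs on a local directory rather than a shared mount.

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on path.

    Returns the open file descriptor on success, or None if another
    process already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:  # typically EAGAIN ("Resource temporarily unavailable")
        os.close(fd)
        return None
    return fd


# Program run in a separate process, standing in for a second NodeManager
# that opens the same state directory.
CHILD = (
    "import fcntl, os, sys\n"
    "fd = os.open(sys.argv[1], os.O_RDWR)\n"
    "try:\n"
    "    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
    "    print('acquired')\n"
    "except OSError:\n"
    "    print('refused')\n"
)


def second_process_result(path):
    """Run CHILD against path and return 'acquired' or 'refused'."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD, path],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "LOCK")
    first = try_lock(lock_path)  # the first "NM" wins the lock
    print("first process holds lock:", first is not None)
    print("second process:", second_process_result(lock_path))
    os.close(first)  # closing the descriptor releases the lock
    print("after release:", second_process_result(lock_path))
```

The LOCK file exists the whole time; the second process fails only because the first one still holds the lock, which matches the "Resource temporarily unavailable" message in the log.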

Logging to a central directory is difficult in such cases.

One solution could be to put each NM in a different config group and specify the log directory for each config group, e.g.

/hdp/logs/hadoop-yarn/nm1

/hdp/logs/hadoop-yarn/nm2

...
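As a rough sketch of the naming scheme above, the following Python helper assigns one directory per NodeManager host under a common base. The function name and the example hostnames are hypothetical; only the nm1, nm2, ... pattern comes from the suggestion above.

```python
import posixpath


def per_node_log_dirs(base, hosts):
    """Map each host to its own directory base/nm1, base/nm2, ...

    Hosts are sorted first so that the assignment is stable between runs.
    """
    return {
        host: posixpath.join(base, f"nm{index}")
        for index, host in enumerate(sorted(hosts), start=1)
    }


if __name__ == "__main__":
    # Hypothetical hostnames, used for illustration only.
    hosts = ["dn02.example.com", "dn01.example.com"]
    for host, path in per_node_log_dirs("/hdp/logs/hadoop-yarn", hosts).items():
        print(host, "->", path)
```

Each NM then gets its own recovery-state directory below its log directory, so no two NMs compete for the same LOCK file.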

Regards, Gerd

5 REPLIES 5

Expert Contributor

@Gerd Koenig I had the same doubts, thanks for confirming. Can you share something on putting the NMs into different config groups, at your leisure?

Guru

Hi @mkataria ,

Sure, I'll try my best.

First click on the 'HDFS' service in Ambari, then open the dialog for managing config groups:

[Screenshot: 1596-manage-config-groups.png]

In the next dialog, create one config group per NodeManager, give it a corresponding name, and assign that node to that config group:

[Screenshot: 1597-add-config-group.png]

Then go back to the "general" HDFS config page (picture 1), select a config group, and adjust the log destination for that particular NodeManager node (== config group).

...and restart HDFS 😉
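Since the original change was made via API calls, the same steps could probably be scripted. The sketch below only builds the JSON body for creating one config group. The field names follow my recollection of the Ambari REST API for config groups, and the service tag YARN, the config type yarn-env, and the property yarn_log_dir_prefix are assumptions for the NodeManager case. The cluster name, host, and tag value are hypothetical, so please verify everything against your Ambari version before using it.

```python
import json


def config_group_payload(cluster, service, host, group_name,
                         config_type, properties, tag):
    """Build the request body for creating one config group with one host.

    The structure is an assumption based on the Ambari config group API;
    nothing is sent to a server here.
    """
    return [{
        "ConfigGroup": {
            "cluster_name": cluster,
            "group_name": group_name,
            "tag": service,  # service the group belongs to
            "description": f"Log directory override for {host}",
            "hosts": [{"host_name": host}],
            "desired_configs": [{
                "type": config_type,
                "tag": tag,  # must be unique per config version
                "properties": properties,  # only the overridden values
            }],
        }
    }]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    body = config_group_payload(
        cluster="mycluster",
        service="YARN",
        host="dn01.example.com",
        group_name="nm1",
        config_type="yarn-env",
        properties={"yarn_log_dir_prefix": "/hdp/logs/hadoop-yarn/nm1"},
        tag="nm1-logdir-v1",
    )
    print(json.dumps(body, indent=2))
```

The printed body would be sent as a POST to the cluster's config_groups resource, followed by a restart of the affected service, just as in the UI steps above.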

Regards, Gerd

Master Mentor

@mkataria check the disk on the nodes, permissions, mount options, space, etc.
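A quick way to run these checks on each node could look like the Python sketch below. It is only an illustration: the function names are hypothetical, and the mount lookup assumes a Linux-style /proc/mounts file.

```python
import os
import shutil


def inspect_dir(path):
    """Collect ownership, permission bits, writability and free space."""
    st = os.stat(path)
    usage = shutil.disk_usage(path)
    return {
        "owner_uid": st.st_uid,
        "mode": oct(st.st_mode & 0o777),
        "writable": os.access(path, os.W_OK | os.X_OK),
        "free_bytes": usage.free,
    }


def mount_options(path, mounts_file="/proc/mounts"):
    """Return (mount_point, fs_type, options) for the mount containing path.

    The longest matching mount point wins; returns None if nothing matches.
    """
    real = os.path.realpath(path)
    best = None
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 4:
                continue
            mnt = fields[1]
            inside = real == mnt or real.startswith(mnt.rstrip("/") + "/")
            if inside and (best is None or len(mnt) >= len(best[0])):
                best = (mnt, fields[2], fields[3])
    return best


if __name__ == "__main__":
    target = "/hdp/logs/hadoop-yarn"  # path taken from the log above
    if os.path.isdir(target):
        print(inspect_dir(target))
        print(mount_options(target))
    else:
        print(target, "does not exist on this node")
```

Run it as the user that starts the NodeManager, since writability depends on the calling user.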

Expert Contributor

@Artem Ervits I checked all of those, and none of them seems to be an issue.