Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

NodeManager fails to start - IO error: lock

Expert Contributor

One of our clients has asked us to move the log location prefix to a different mount point for all service logs; for example, the log prefix for HDFS was moved from /var/log/hadoop to /hdp/logs/hadoop via API calls.

Everything restarted smoothly; however, only one of the 5 NMs comes up, and a manual restart only works on the first NM.

All the other NMs throw the same error, shown below:

STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r ef0582ca14b8177a3cbb6376807545272677d730; compiled by 'jenkins' on 2015-12-16T03:01Z
STARTUP_MSG: java = 1.7.0_67
************************************************************/
2016-01-26 15:01:25,155 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - registered UNIX signal handlers for [TERM, HUP, INT]
2016-01-26 15:01:26,283 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:initStorage(927)) - Using state database at /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery
2016-01-26 15:01:26,313 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService failed in state INITED; cause: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
2016-01-26 15:01:26,316 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2016-01-26 15:01:26,317 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(540)) - Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:178)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:220)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:537)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:585)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/LOCK: Resource temporarily unavailable
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:930)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:204)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 5 more
2016-01-26 15:01:26,319 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at bvluxhdpdn05.conocophillips.net/158.139.121.115
************************************************************/

As we can see, it is not complaining that the LOCK file is missing but that it is unavailable, because whichever NM starts first acquires this LOCK (remember, this is a single shared mount point and not a local file system).

If I change the log location back to a local file system, for example /tmp/yarnlogs, it works smoothly, since every NM gets access to its own LOCK file on the local file system of the host where it is installed.
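
The contention described above can be reproduced in miniature. The sketch below is only an illustration, not LevelDB's actual locking code: it assumes a POSIX system and uses flock() so that several "NodeManagers" can be simulated inside one process, whereas LevelDB itself takes its lock through its own native code. The names try_lock and simulate are hypothetical.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path):
    """Open `path` and try to take an exclusive, non-blocking lock.

    Returns the open file descriptor on success, or None when another
    holder already owns the lock (EAGAIN, which is reported as
    "Resource temporarily unavailable").
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return None
        raise
    return fd


def simulate(num_nodemanagers=5):
    """Let several simulated NodeManagers compete for one shared LOCK file."""
    with tempfile.TemporaryDirectory() as shared_dir:
        lock_path = os.path.join(shared_dir, "LOCK")
        held = []
        results = []
        for _ in range(num_nodemanagers):
            fd = try_lock(lock_path)
            results.append(fd is not None)
            if fd is not None:
                held.append(fd)  # keep the winner's lock held
        for fd in held:
            os.close(fd)  # closing the descriptor releases the lock
        return results


if __name__ == "__main__":
    for number, started in enumerate(simulate(), start=1):
        print(f"NM {number}: {'started' if started else 'lock unavailable'}")
```

Only the first caller gets the lock; every later one is refused while the first still holds it, which matches the pattern of one NM starting and the other four failing.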

Has anyone faced this issue, and can you please suggest a fix?

Thanks, Mayank

1 ACCEPTED SOLUTION

Guru

Hi @mkataria ,

Did I understand correctly that all the NodeManagers have some kind of network storage mounted and want to write to /hdp/logs/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state?

This won't work, since every NM wants to keep its own state in the directory ..../yarn-nm-state; therefore only one NM at a time can take the LOCK file there (alongside the files that hold the state).

Logging to a central directory is difficult in such cases.

One solution could be to put each NM in a different config group and specify the log directory for each config group, e.g.

/hdp/logs/hadoop-yarn/nm1

/hdp/logs/hadoop-yarn/nm2

...

Regards, Gerd
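
A minimal sketch of the per-NM layout suggested above, under stated assumptions: the host names are made up, and the recovery-state sub-path is taken from the log output in the question on the assumption that it keeps following the log directory. The helper names per_nm_log_dirs and state_db_path are hypothetical.

```python
import posixpath


def per_nm_log_dirs(base_dir, hosts):
    """Give each NodeManager host its own directory: base_dir/nm1, nm2, ..."""
    return {
        host: posixpath.join(base_dir, f"nm{index}")
        for index, host in enumerate(hosts, start=1)
    }


def state_db_path(log_dir):
    """Assumed location of the recovery state database below a log directory."""
    return posixpath.join(log_dir, "nodemanager", "recovery-state", "yarn-nm-state")


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    hosts = ["nm-host-1.example.com", "nm-host-2.example.com"]
    dirs = per_nm_log_dirs("/hdp/logs/hadoop-yarn", hosts)
    state_paths = [state_db_path(d) for d in dirs.values()]
    # Each NM must end up with a distinct state directory, so no LOCK is shared.
    assert len(set(state_paths)) == len(hosts)
    for host, log_dir in dirs.items():
        print(host, "->", state_db_path(log_dir))
```

With distinct directories, each NM opens its own yarn-nm-state database and LOCK file even though all of them live on the same mount.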


5 REPLIES 5

Expert Contributor

@Gerd Koenig I had the same suspicion, thanks for confirming. Could you share something on putting the NMs into different config groups, at your leisure?

Guru

Hi @mkataria ,

Sure, I'll try my best.

First click on service 'HDFS' in Ambari, then

1596-manage-config-groups.png

In the next dialog, create one config group per NodeManager, give it a corresponding name, and assign that node to the config group.

1597-add-config-group.png

Then go back to the "general" HDFS config page (picture 1), select a config group, and adjust the log destination for that particular NodeManager node (== config group).

...and restart HDFS 😉

Regards, Gerd
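
Since the original change was made via API calls, the same steps could in principle be scripted. The sketch below only builds the request body for one config group and sends nothing. The field layout follows the config-group resource of Ambari's REST API as commonly documented (a POST to /api/v1/clusters/<cluster>/config_groups) and should be verified against your Ambari version. The cluster name, host name, service tag, config type yarn-env, and property yarn_log_dir_prefix are all assumptions for illustration; build_config_group is a hypothetical helper.

```python
import json


def build_config_group(cluster, service, group_name, host, config_type,
                       properties, version_tag):
    """Build the JSON body for creating one config group with one host.

    Every value is supplied by the caller; nothing is read from a cluster.
    """
    return [{
        "ConfigGroup": {
            "cluster_name": cluster,
            "group_name": group_name,
            "tag": service,  # the service the group belongs to
            "description": f"Per-host overrides for {host}",
            "hosts": [{"host_name": host}],
            "desired_configs": [{
                "type": config_type,
                "tag": version_tag,  # must be unique per config version
                "properties": properties,
            }],
        }
    }]


if __name__ == "__main__":
    # All names below are placeholders, not values from a real cluster.
    body = build_config_group(
        cluster="mycluster",
        service="YARN",
        group_name="nm1",
        host="nm-host-1.example.com",
        config_type="yarn-env",
        properties={"yarn_log_dir_prefix": "/hdp/logs/hadoop-yarn/nm1"},
        version_tag="nm1-v1",
    )
    print(json.dumps(body, indent=2))
```

One such body per NodeManager mirrors the "one config group per node" approach from the dialog steps; note that in Ambari a config type carried by a group normally replaces the whole type for that group, so check which properties must be included before posting anything.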

Master Mentor

@mkataria check the disks on the nodes: permissions, mount options, free space, etc.
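
A rough sketch of how some of these checks could be scripted per node. It covers existence, write permission for the current user, and free space only; mount options are not inspected. The function name check_dir and the 1 GiB threshold are arbitrary choices for illustration.

```python
import os
import shutil


def check_dir(path, min_free_bytes=1 << 30):
    """Report basic health facts about a directory.

    Checks that the directory exists, that the current user may write
    into it, and that the file system holding it has enough free space.
    """
    if not os.path.isdir(path):
        return {"exists": False, "writable": False,
                "free_bytes": 0, "enough_space": False}
    usage = shutil.disk_usage(path)
    return {
        "exists": True,
        # W_OK | X_OK: permission to create entries inside the directory
        "writable": os.access(path, os.W_OK | os.X_OK),
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
    }


if __name__ == "__main__":
    print(check_dir("/hdp/logs/hadoop-yarn"))
```

Run it as the user that starts the NodeManager, since the permission check applies to whoever executes the script.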

Expert Contributor

@Artem Ervits I checked all of those, and none of them seems to be the issue.