IO error: lock /hadoop/yarn/local/registeredExecutors.ldb/LOCK: already held by process

Expert Contributor

In an HDP 2.6.1 cluster, the NodeManager is going down with the error below.

2017-11-09 19:55:06,044 WARN  containermanager.AuxServices (AuxServices.java:serviceInit(149)) - The Auxilurary Service named 'spark2_shuffle' in the configuration is for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader which has a name of 'org.apache.spark.network.yarn.YarnShuffleService with custom class loader'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
2017-11-09 19:55:06,044 INFO  containermanager.AuxServices (AuxServices.java:addService(72)) - Adding auxiliary service org.apache.spark.network.yarn.YarnShuffleService with custom class loader, "spark2_shuffle"
2017-11-09 19:55:06,046 ERROR shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:<init>(113)) - error opening leveldb file /hadoop/yarn/local/registeredExecutors.ldb.  Creating new file, will not be able to recover state for existing applications
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hadoop/yarn/local/registeredExecutors.ldb/LOCK: already held by process
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:100)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56)
        at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
2017-11-09 19:55:06,047 ERROR yarn.YarnShuffleService (YarnShuffleService.java:serviceInit(130)) - Failed to initialize external shuffle service
java.io.IOException: Unable to create state store
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:129)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56)
        at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hadoop/yarn/local/registeredExecutors.ldb/LOCK: already held by process
        at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
        at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
        at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
        at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:127)
        ... 16 more
2017-11-09 19:55:06,057 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service spark_shuffle failed in state INITED; cause: java.net.BindException: Address already in use
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,059 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.spark.network.yarn.YarnShuffleService with custom class loader failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,059 FATAL containermanager.AuxServices (AuxServices.java:serviceInit(164)) - Failed to initialize spark2_shuffle
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,060 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,060 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,061 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,061 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system...
2017-11-09 19:55:06,062 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-11-09 19:55:06,063 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped.
2017-11-09 19:55:06,063 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
2017-11-09 19:55:06,064 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(549)) - Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
        at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.Net.bind(Unknown Source)
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
        at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Unknown Source)
2017-11-09 19:55:06,065 INFO  timeline.HadoopTimelineMetricsSink (AbstractTimelineMetricsSink.java:getCurrentCollectorHost(232)) - No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times.
2017-11-09 19:55:06,065 INFO  nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at slave4.hdp.com/

I checked the NodeManager port 8042 on the machine (netstat -nap | grep 8042), and no process is using it.

Could you please help me with this?
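
A note on the port check: in the log above, the BindException is reported while the spark_shuffle / spark2_shuffle auxiliary service is initializing, so the port in conflict may be the shuffle service port rather than 8042. Below is a minimal Python sketch, under stated assumptions, that tests whether a TCP port can currently be bound on the node. The helper name port_is_free is hypothetical. The ports in the usage lines are assumptions too: 8042 is taken from the post, and 7337 is Spark's documented default for spark.shuffle.service.port, which this cluster may well override.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a new TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else already owns the port.
            return False
        return True


if __name__ == "__main__":
    # Assumed ports: 8042 (NodeManager web UI, from the post) and
    # 7337 (Spark's default shuffle service port; check your own config).
    for candidate in (8042, 7337):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

If a port is reported as in use, netstat -nap | grep <port> run as root should show the owning process.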




Please run

fuser /hadoop/yarn/local/registeredExecutors.ldb/LOCK

to find out which PID is holding the lock. Then

ps -eaf | grep <pid>

will show you the process that holds it.
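
The same check can be sketched in Python for nodes where fuser is not installed. This is only an illustration, assuming a Linux /proc filesystem. The helper name pids_with_file_open is hypothetical. Like fuser, it only sees processes the current user is allowed to inspect, so run it as root or as the user that owns the NodeManager process.

```python
import os
from typing import List


def pids_with_file_open(path: str) -> List[int]:
    """Return PIDs that have `path` open, by scanning /proc/<pid>/fd (Linux only)."""
    target = os.path.realpath(path)
    holders = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = f"/proc/{entry}/fd"
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # Process exited, or we lack permission to inspect it.
            continue
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.append(int(entry))
                    break
            except OSError:
                # Descriptor was closed while we were scanning.
                continue
    return sorted(holders)


if __name__ == "__main__":
    lock_file = "/hadoop/yarn/local/registeredExecutors.ldb/LOCK"
    print(pids_with_file_open(lock_file))
```

An empty list means that no process visible to the current user has the file open at that moment.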

Expert Contributor

@kgautam: Thanks for your input. Running fuser /hadoop/yarn/local/registeredExecutors.ldb/LOCK did not help; it does not show any PID associated with the LOCK file.

Explorer

Any other suggestions? I am getting the same error.

Community Manager

@P_ As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. A new thread also lets you provide details specific to your environment, which could help others give a more accurate answer to your question.


Cy Jervis, Manager, Community Program