Created 11-09-2017 11:51 AM
In HDP 2.6.1 cluster, node manager is going down with the below error.
2017-11-09 19:55:06,044 WARN containermanager.AuxServices (AuxServices.java:serviceInit(149)) - The Auxilurary Service named 'spark2_shuffle' in the configuration is for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader which has a name of 'org.apache.spark.network.yarn.YarnShuffleService with custom class loader'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config. 2017-11-09 19:55:06,044 INFO containermanager.AuxServices (AuxServices.java:addService(72)) - Adding auxiliary service org.apache.spark.network.yarn.YarnShuffleService with custom class loader, "spark2_shuffle" 2017-11-09 19:55:06,046 ERROR shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:<init>(113)) - error opening leveldb file /hadoop/yarn/local/registeredExecutors.ldb. Creating new file, will not be able to recover state for existing applications org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hadoop/yarn/local/registeredExecutors.ldb/LOCK: already held by process at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200) at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218) at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:100) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81) at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56) at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) 2017-11-09 19:55:06,047 ERROR yarn.YarnShuffleService (YarnShuffleService.java:serviceInit(130)) - Failed to initialize external shuffle service java.io.IOException: Unable to create state store at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:129) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81) at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56) at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: lock /hadoop/yarn/local/registeredExecutors.ldb/LOCK: already held by process at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200) at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218) at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:127) ... 16 more 2017-11-09 19:55:06,057 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service spark_shuffle failed in state INITED; cause: java.net.BindException: Address already in use java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,059 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.spark.network.yarn.YarnShuffleService with custom class loader failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,059 FATAL containermanager.AuxServices (AuxServices.java:serviceInit(164)) - Failed to initialize spark2_shuffle org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,060 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,060 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,061 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service NodeManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,061 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system... 2017-11-09 19:55:06,062 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted. 2017-11-09 19:55:06,063 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped. 2017-11-09 19:55:06,063 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete. 2017-11-09 19:55:06,064 FATAL nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(549)) - Error starting NodeManager org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:65) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:162) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:291) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:546) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.Net.bind(Unknown Source) at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source) at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source) at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475) at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021) at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455) at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440) at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844) at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194) at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Unknown Source) 2017-11-09 19:55:06,065 INFO timeline.HadoopTimelineMetricsSink (AbstractTimelineMetricsSink.java:getCurrentCollectorHost(232)) - No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times. 2017-11-09 19:55:06,065 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down NodeManager at slave4.hdp.com/
I have checked node manager port-8042(netstat -nap | grep 8042) in the machine and there is no process running for the same.
Could you please help me on this.
Created 11-09-2017 12:39 PM
Please run
fuser /hadoop/yarn/local/registeredExecutors.ldb/LOCK. to know which PID is holding to the lock.
ps -eaf | grep pid will give you the process holding to it
Created 11-10-2017 03:17 PM
@kgautam : Thanks for your inputs. fuser /hadoop/yarn/local/registeredExecutors.ldb/LOCK did not help. It is not showing any PID associated with LOCK.
Created 02-04-2020 02:40 AM
Any other suggestions? I am getting the same error.
Created 02-04-2020 06:05 AM
@P_ As this is an older post you would have a better chance of receiving a resolution by starting a new thread. This will also provide the opportunity to provide details specific to your environment that could aid others in providing a more accurate answer to your question.