Datanode Starts Running but Does Not Go Live.

Explorer

I have a 6-node CentOS 7 cluster with 4 datanodes. All four datanode processes are up and running, but the dashboard shows only 3/4 datanodes live. I looked at the log at /var/log/hadoop/hdfs/hadoop-hdfs-datanode-<data_node>.log and it says:

2017-09-14 11:57:04,794 INFO  web.DatanodeHttpServer (SimpleHttpProxyHandler.java:exceptionCaught(147)) - Proxy for / failed. cause:
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:447)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)

I am not sure what this means.

I tried restarting ambari-agent, rebooting the machine itself, and restarting ambari-server on the namenode. Can someone suggest where else I should look?

EDIT: I also tried connecting to the namenode from this particular datanode on the port it listens on (8020, the standard namenode RPC port), and it connects. I can see the connection on both the datanode and the namenode. I don't understand why the communication is not happening.
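
For anyone who wants to repeat the port check described above, here is a minimal sketch in Python. The helper name `can_connect` and the hostname in the usage comment are placeholders, not anything from the cluster in question. Note that a successful TCP connection only shows the port is reachable; it does not by itself show that the datanode has registered with the namenode.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and DNS resolution failures
        return False


# Example usage (hypothetical hostname, run from the affected datanode):
#   can_connect("namenode.example.com", 8020)
```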

Re: Datanode Starts Running but Does Not Go Live.

Contributor

@Sree Kupp The above just means the client disconnected after it finished writing. This issue has been fixed in the latest HDP 2.5. Can you check the live nodes in the Namenode UI? Are there 4?
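
If the UI is not handy, the same live/dead counts can usually be read from the namenode's JMX endpoint. The following is only a sketch: it assumes the namenode web port is the Hadoop 2.x default (50070), that the `FSNamesystemState` bean exposes `NumLiveDataNodes` and `NumDeadDataNodes` on your version, and the function names and the hostname in the usage comment are made up for illustration.

```python
import json
from urllib.request import urlopen

# JMX bean that is assumed to carry the datanode counters
BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def fetch_jmx(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the bean as parsed JSON, e.g. base_url='http://namenode.example.com:50070'."""
    with urlopen(f"{base_url}/jmx?qry={BEAN}", timeout=timeout) as resp:
        return json.load(resp)


def datanode_counts(payload: dict) -> tuple:
    """Return (live, dead) datanode counts from a parsed /jmx response."""
    for bean in payload.get("beans", []):
        if bean.get("name") == BEAN:
            return bean["NumLiveDataNodes"], bean["NumDeadDataNodes"]
    raise KeyError(f"bean {BEAN!r} not found in JMX response")


# Example usage (hypothetical hostname):
#   live, dead = datanode_counts(fetch_jmx("http://namenode.example.com:50070"))
```

Keeping the parsing separate from the HTTP call means it can be checked against a saved response without touching the cluster.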

Re: Datanode Starts Running but Does Not Go Live.

Explorer

@Vinod Thanks for your reply. Yes, it shows 4 live datanodes. I am surprised at how this happened. I worked on it the whole day yesterday and nothing changed. This morning I just restarted the datanode process on the failed datanode, and now I can see that all datanodes are live. Can you explain this, please?

Re: Datanode Starts Running but Does Not Go Live.

Contributor

@Sree Kupp It is hard to explain without the datanode and namenode logs. Check the datanode logs.
