hdfs-datanode can't connect to hdfs-namenode

Contributor

Hi, after restarting the hdfs-datanode and hdfs-namenode pods, hdfs-datanode 0, 1, and 2 are not connecting to the namenode.

 

 

2023-03-30 10:45:18,285 DEBUG ipc.Client: Connecting to hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820
2023-03-30 10:45:19,287 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:20,289 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:21,291 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:22,293 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:23,295 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:24,299 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:25,301 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:26,305 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:27,307 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:27,546 DEBUG ipc.Server: IPC Server idle connection scanner for port 9867: task running
2023-03-30 10:45:28,318 INFO ipc.Client: Retrying connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2023-03-30 10:45:28,320 DEBUG ipc.Client: Failed to connect to server: hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820: retries get failed due to exceeded maximum allowed retries number: 10
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:687)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:790)
    at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:410)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1389)
    at org.apache.hadoop.ipc.Client.call(Client.java:1353)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy21.versionRequest(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.versionRequest(DatanodeProtocolClientSideTranslatorPB.java:287)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo(BPServiceActor.java:229)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:275)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
    at java.lang.Thread.run(Thread.java:748)
2023-03-30 10:45:28,321 DEBUG ipc.Client: closing ipc connection to hdfs-namenode-0.hdfs-namenode/10.128.66.125:9820: Connection refused
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

These are the logs from hdfs-datanode.

Thank you

2 REPLIES

Community Manager

@Noel_0317 Welcome to our community! To help you get the best possible answer, I have tagged our HDFS experts @rki_ @mszurap @willx @Asok, who may be able to assist you further.

Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager



Rising Star

Can you isolate any connection issues between your NN and DN pods? Maybe you can try doing an nc or telnet to the NN port from the DN pod?
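
If nc or telnet is not installed in the DN image, the same check can be sketched in Python, assuming an interpreter is available in the pod. The helper name `can_connect` and the 3-second timeout are illustrative choices, not part of any Hadoop tooling. A "Connection refused" result usually means the host was reached but nothing is listening on that port, which would point at the NameNode process rather than the network.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only tests TCP reachability (like `nc -z`); it does not speak
    the Hadoop RPC protocol.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        # Covers DNS failures, timeouts, and "Connection refused".
        print(f"connect to {host}:{port} failed: {exc}")
        return False
```

Run from inside a DN pod, the call would be `can_connect("hdfs-namenode-0.hdfs-namenode", 9820)`, using the host and port shown in the log above.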