NameNode High Availability Health: node in an Unknown state

Contributor
Hi,
We are getting the following alert from Ambari notifications:
NameNode High Availability Health
Active['xx-xxx-x1-xx02.xxxxx.xx:50470'], Standby[], Unknown['xx-xxx-x1-xx01.xxxxx.xx:50470']
 
Could you please help us solve this issue?
 
Thank you
1 ACCEPTED SOLUTION

Master Mentor

@Koffi 
This is typical of a rogue process that hasn't released the port:

Caused by: java.net.BindException: Address already in use

You will need to run

# kill -9 5356

Then restart the NN; that should resolve the issue.
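
The steps in this answer could be sketched as a small shell helper. This is only a sketch under assumptions: the function names pids_on_port and stop_pid are made up for illustration (they are not part of Hadoop or Ambari), lsof is assumed to be installed, and trying a plain kill before kill -9 is an extra precaution, not something the answer requires.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Hadoop or Ambari.

# Print the PIDs of every process that has a TCP socket on the given port
# (listeners and established connections alike).
pids_on_port() {
    lsof -t -i TCP:"$1"
}

# Send SIGTERM, wait up to $2 seconds (default 10) for the process to exit,
# then fall back to SIGKILL, which is what "kill -9" sends.
stop_pid() {
    local pid=$1 timeout=${2:-10} waited=0
    # Could not signal: the process is already gone or we lack permission.
    kill "$pid" 2>/dev/null || return 1
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    # Still alive after the grace period: force it.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null || true
    fi
}
```

For the case in this thread, that would be pids_on_port 50470 to confirm the PID, then stop_pid 5356 as root or as the hdfs user, and then the NameNode restart.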

2 REPLIES

Contributor

We tried to restart the NameNode that is in the Unknown state, without success. We are getting the following error in the logs:

2022-01-16 06:46:13,099 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
java.net.BindException: Port in use: xx-xxx-x2-xx01.xxxxx.xx:50470
        at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1197)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1219)
        at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1278)
        at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1133)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:177)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:869)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:317)
        at org.apache.hadoop.http.HttpServer2.bindListener(HttpServer2.java:1184)
        at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1215)
        ... 9 more
2022-01-16 06:46:13,101 INFO  util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: java.net.BindException: Port in use: xx-xxx-x2-xx01.xxxxx.xx:50470
2022-01-16 06:46:13,102 INFO  namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at xx-xxx-x2-xx01.xxxxx.xx/xx.x.xx.xx

 

We ran the command lsof -i:50470 on the host to see which process is using this port and got the following result:

java 5356 hdfs 354u IPv4 4207102955 0t0 TCP xx-xxx-x2-xx01.xxxxx.xx:56788->xx-xxx-x2-xx01.xxxxx.xx:50470 (ESTABLISHED)

 

Do we need to kill that process by running kill 5356 and then try to start the NameNode again?
