Created 06-03-2016 12:03 PM
I'm using Ambari 2.2.1.0 and HDP 2.3.4.0
Using the Ambari UI, I tried to start my NN, but it shows the error below.
safemode: Call From server1.ddns.net/141.178.0.16 to server1.ddns.net:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2016-06-03 12:40:06,640 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://server1.ddns.net:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
safemode: Call From server1.ddns.net/141.178.0.16 to server1.ddns.net:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
After that, I tried the commands below; the outputs are:
[root@server1 ~]# sudo -u hdfs hdfs dfsadmin -report
report: Call From server1.ddns.net/141.178.0.16 to server1.ddns.net:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[root@server1 ~]# sudo -u hdfs hdfs dfsadmin -safemode enter
safemode: Call From server1.ddns.net/141.178.0.16 to server1.ddns.net:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[root@server1 ~]# sudo -u hdfs hdfs dfsadmin -safemode leave
safemode: Call From server1.ddns.net/141.178.0.16 to server1.ddns.net:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[root@server1 ~]#
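Before fighting safemode, it is worth confirming whether anything is listening on the RPC port at all: "Connection refused" on every dfsadmin call usually means the NameNode process itself is down, not that it is stuck in safemode. A minimal probe, assuming bash (the `/dev/tcp` path is a bash feature, not plain sh), with the host and port taken from the error message above:

```shell
# Probe a NameNode RPC endpoint without any Hadoop tooling.
# Needs bash: /dev/tcp/<host>/<port> is a bash-only virtual path.
nn_reachable() {
    # Succeeds only if something accepts a TCP connection on $1:$2.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Usage against the host/port from the error message above:
#   nn_reachable server1.ddns.net 8020 && echo "port open" || echo "refused -- NN is down"
```

If the probe is refused, no amount of `-safemode leave` will help; the process has to be started (or its startup failure diagnosed) first.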
1) Why is it not leaving safemode?
2) What could be the solution to resolve this and bring the NN up?
Please help me in this regard.
Created 06-05-2016 10:24 AM
Ambari 2.2.1.0 and HDP 2.3.4.0
and this issue got resolved with the steps below:
1) Stopped the Ambari agent and the Ambari server.
2) Formatted the NN using: hdfs namenode -format
3) Gave 777 permissions and changed the owner to the hadoop user on the NameNode directory at the path below:
/bigdata/hadoop/hdfs/namenode
4) Restarted the agent and the server.
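For later readers, the accepted steps roughly script out as below. This is deliberately a dry-run sketch (it only prints the commands), because `hdfs namenode -format` wipes the existing filesystem metadata and should not be run casually; the path and the hadoop user are as given in the answer.

```shell
# Dry-run: each command is printed, not executed.
# Swap 'echo "+ $*"' for '"$@"' to actually run the steps.
run() { echo "+ $*"; }

NN_DIR=/bigdata/hadoop/hdfs/namenode

run ambari-agent stop                      # 1) stop agent and server
run ambari-server stop
run sudo -u hdfs hdfs namenode -format     # 2) format the NN (DESTROYS metadata!)
run chmod 777 "$NN_DIR"                    # 3) open permissions on the NN dir
run chown -R hadoop:hadoop "$NN_DIR"       #    and hand it to the hadoop user
run ambari-agent start                     # 4) restart agent and server
run ambari-server start
```

Formatting is a last resort for a fresh or disposable cluster; on a cluster with data, the metadata directory is gone afterwards.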
Created 06-03-2016 01:54 PM
You need to first find out why the connection to the NameNode is being refused. Is the NameNode process up? You can do a quick check with: ps aux | grep -i namenode
If the NameNode process is up, then look at the logs in /var/log/hadoop/hdfs. You will want to look at the file that looks like hadoop-hdfs-namenode-*.log. This should help you narrow down the cause a bit.
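To make the log check concrete: a small sketch that grabs the newest NameNode log and prints the first ERROR with a little context. The directory and filename pattern are the ones mentioned above; adjust both if your install differs.

```shell
# Print the first ERROR (plus two lines of context) from the newest NN log.
latest_nn_error() {
    local log_dir="${1:-/var/log/hadoop/hdfs}"
    local latest
    latest=$(ls -t "$log_dir"/hadoop-hdfs-namenode-*.log 2>/dev/null | head -1)
    if [ -z "$latest" ]; then
        echo "no NameNode log found in $log_dir"
        return 1
    fi
    echo "== $latest =="
    grep -m1 -A2 'ERROR' "$latest"
}

latest_nn_error || true   # default dir; '|| true' so a missing log isn't fatal
```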
Created 06-03-2016 06:58 PM
Thanks for the reply.
Below are the results of each command you asked me to execute.
1)
[root@server1 hdfs]# ps aux | grep -i namenode
root     27093  0.0  0.0 103308   900 pts/0    S+   19:40   0:00 grep -i namenode
2)
[root@server1 hdfs]# nano hadoop-hdfs-namenode-server1.ddns.net.log
2016-06-03 14:03:11,218 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2016-06-03 14:03:11,219 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thre$
2016-06-03 14:03:11,219 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2016-06-03 14:03:11,219 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdo$
2016-06-03 14:03:11,220 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
java.net.BindException: Port in use: server1.ddns.net:50070
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:716)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:444)
	at sun.nio.ch.Net.bind(Net.java:436)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:914)
	... 8 more
2016-06-03 14:03:11,223 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-06-03 14:03:11,260 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at server1.ddns.net/141.178.0.16
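The line that matters in a trace like this is the "Caused by". As an aside, here is a tiny triage sketch; the two patterns below are common NameNode startup failures (BindException for a port clash, InconsistentFSStateException for a damaged or mis-owned metadata directory), and anything else needs the full log:

```shell
# Name the likely startup problem from a saved NameNode log file ($1).
classify_nn_failure() {
    if grep -q 'java.net.BindException' "$1"; then
        echo "port conflict: another process holds an NN port (HTTP 50070 or RPC 8020)"
    elif grep -q 'InconsistentFSStateException' "$1"; then
        echo "metadata problem: check the dfs.namenode.name.dir contents and ownership"
    else
        echo "no familiar pattern: read the first ERROR stanza in full"
    fi
}
```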
In my core-site.xml, below is the entry
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://server1.ddns.net:8020</value>
</property>
In my hdfs-site.xml, below are the entries
<property>
  <name>dfs.https.port</name>
  <value>50470</value>
</property>
<property>
  <name>dfs.namenode.http-address</name>
  <value>server1.ddns.net:50070</value>
</property>
<property>
  <name>dfs.namenode.https-address</name>
  <value>server1.ddns.net:50470</value>
</property>
<property>
  <name>dfs.namenode.rpc-address</name>
  <value>server1.ddns.net:8020</value>
</property>
<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>1</value>
</property>
Created 06-03-2016 07:08 PM
Looks like your NameNode process is down due to "Port in use: server1.ddns.net:50070".
Check whether any process is occupying port 50070:
lsof -i:50070
If the output shows nothing, can you please start the NN again and rerun the command above to see whether an NN process is running on that port?
Created 06-03-2016 07:15 PM
Even after rerunning the NN, nothing is displayed:
[root@server1 conf]# lsof -i:50070
[root@server1 conf]#
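An empty lsof listing is not always conclusive: run without root it can miss other users' sockets, and some minimal installs lack lsof entirely. A sketch that cross-checks port 50070 with whatever standard tool is present (ss from iproute2, netstat from net-tools; availability varies by distro):

```shell
# Report whether any TCP listener is bound to port 50070,
# trying ss first and falling back to netstat.
check_port_50070() {
    ss -tln 2>/dev/null      | grep ':50070 ' && return 0
    netstat -tln 2>/dev/null | grep ':50070 ' && return 0
    echo "no listener found on 50070 by ss/netstat either"
    return 1
}

check_port_50070 || true
```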
Created 06-03-2016 07:20 PM
Can you check the latest NN logs?
Created 06-05-2016 12:04 AM
Latest NN logs:
2016-06-05 01:03:14,805 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2016-06-05 01:03:14,806 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdo$
2016-06-05 01:03:14,806 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
java.net.BindException: Port in use: server1.ddns.net:50070
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:892)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:716)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:444)
	at sun.nio.ch.Net.bind(Net.java:436)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:914)
	... 8 more
2016-06-05 01:03:14,809 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-06-05 01:03:14,812 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
Created 06-05-2016 10:10 AM
This seems to be an issue with Ambari. Kindly restart the Ambari processes and the NN host to clean up the cache. Alternatively, you can change port 50070 to some other available port.
Curious to know: which Ambari and HDP versions are you on?
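If you do move the web UI off 50070, a quick way to find a nearby unoccupied port before editing dfs.namenode.http-address: a bash-only sketch using the `/dev/tcp` connect probe (the 20-port scan range is arbitrary).

```shell
# Print the first port in [start, start+20] that nothing is listening on.
# Needs bash for the /dev/tcp virtual path.
find_free_port() {
    local p
    for p in $(seq "$1" $(( $1 + 20 ))); do
        if ! (exec 3<>"/dev/tcp/127.0.0.1/$p") 2>/dev/null; then
            echo "$p"
            return 0
        fi
    done
    return 1
}

find_free_port 50070
```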
Created 06-03-2016 11:23 PM
Any help would be appreciated...
Thanks in advance.