Created 07-09-2020 11:07 PM
Hi Team,
We have a 2 node cluster setup of a Namenode and a Datanode.
The Namenode refuses to start and fails with the following error.
Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://inqchdpmn1.XXX.com:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From inqchdpmn1.XXX.com/10.10.31.71 to inqchdpmn1.XXX.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
safemode: Call From inqchdpmn1.XXX.com/10.10.31.71 to inqchdpmn1.XXX.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2020-07-10 11:08:55,721 -
bash-4.2$ sudo netstat -anp | grep 8020
This doesn't give any output.
We have been stuck on this for a while now, please help.
TIA.
Created 07-10-2020 02:56 AM
The "safemode get | grep 'Safe mode is OFF'" returned 1 means the Namenode is either not started at all or is still in safe mode.
Could you run the following while logged in as the hdfs user:
$ hdfs dfsadmin -safemode get
If you see something like the below:
safemode: Call From inqchdpmn1.XXX.com/10.10.31.71 to inqchdpmn1.XXX.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
then the Namenode is not running. Check and upload the hadoop-hdfs-secondarynamenode-xx and hadoop-hdfs-datanode-xxx logs in /var/log/hadoop/hdfs/
Solution 2
Try starting the HDP namenode service manually
Share your findings
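The check in the first step can be sketched as a small shell helper that tells the three cases apart: Namenode up and out of safe mode, Namenode up but in safe mode, and Namenode not reachable at all. This is only a sketch; classify_safemode is a hypothetical function name, not part of Hadoop, and it assumes the command prints "Safe mode is OFF" or "Safe mode is ON" when the Namenode answers.

```shell
# classify_safemode is a hypothetical helper (not a Hadoop command).
# It takes the text printed by `hdfs dfsadmin -safemode get` as its first
# argument and prints one word describing the Namenode state.
classify_safemode() {
  case "$1" in
    *"Safe mode is OFF"*)   echo "running" ;;   # Namenode up, safe mode off
    *"Safe mode is ON"*)    echo "safemode" ;;  # Namenode up, still in safe mode
    *"Connection refused"*) echo "down" ;;      # nothing answering on the RPC port
    *)                      echo "unknown" ;;   # anything else: read the logs
  esac
}

# Usage, as the hdfs user. 2>&1 is needed because the error text goes to stderr:
# classify_safemode "$(hdfs dfsadmin -safemode get 2>&1)"
```

If it prints "down", the Namenode process is not running and the logs are the next place to look.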
Created 07-10-2020 04:20 AM
Created 07-15-2020 03:39 AM
Created 07-15-2020 03:47 PM
Unfortunately, your screenshot is not visible.
Created 07-15-2020 10:02 PM
The netstat output was empty because no process is listening on your port.
The above screenshot says it's a warning. Did that application start?
Which version are you using, so that we can provide you the exact documentation link?
Let me know what you find.
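To make the netstat check explicit rather than reading an empty grep result, something like the following could be used. It is a rough sketch: port_is_listening is a hypothetical helper name, and it assumes the usual Linux `netstat -an` / `netstat -anp` column layout (protocol in column 1, local address in column 4, state in column 6).

```shell
# port_is_listening is a hypothetical helper (not a system command).
# It reads netstat-style output on stdin and exits 0 only if a TCP socket
# is in the LISTEN state on the port given as the first argument.
port_is_listening() {
  awk -v port="$1" '
    # Match the local address column against ":<port>" at the end of the field.
    $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Usage on the Namenode host:
# if sudo netstat -anp | port_is_listening 8020; then
#   echo "a process is listening on 8020"
# else
#   echo "nothing is listening on 8020"
# fi
```

Unlike a plain `grep 8020`, this ignores established client connections and other ports that merely contain the same digits.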
Created 07-16-2020 12:17 AM
Created 07-16-2020 01:23 AM
@ARVINDR :
Do you still face the issue? Could you send the following:
1. The console logs from when you start the NN (screen logs, or the Ambari logs shown during the operation).
2. The NN logs.
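If the full NN log is too large to attach, a quick way to pull out the relevant lines might look like this. It is a sketch under assumptions: nn_log_errors is a hypothetical helper name, and it assumes the log uses the usual log4j layout where the level (ERROR or FATAL) appears as a separate word after the timestamp. Stack trace lines that follow an error are not captured by this filter.

```shell
# nn_log_errors is a hypothetical helper (not a Hadoop command).
# It reads a log on stdin and prints the last N lines (default 20)
# whose log level is ERROR or FATAL.
nn_log_errors() {
  grep -E ' (ERROR|FATAL) ' | tail -n "${1:-20}"
}

# Usage; the file name pattern is an assumption, adjust it to whatever
# Namenode log file is actually present in /var/log/hadoop/hdfs/:
# nn_log_errors 50 < /var/log/hadoop/hdfs/hadoop-hdfs-namenode-$(hostname).log
```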
Created 07-16-2020 04:25 AM
PFA NN logs.
The Ambari error is as follows:
Connection failed to http://xxxxx.com:50070(<urlopen error [Errno 111] Connection refused>)
Created 07-16-2020 06:19 AM
@ARVINDR :
This seems to be an ACL issue:
1. Please do a directory listing from hdfs of the directory that the lock file is referring to.
2. Is this a running cluster, or are you creating the cluster from scratch?