Created 10-15-2017 07:12 PM
I have set up a single-node HDP 2.6 cluster deployed using Ambari.
My single node's hostname is nhknox-Virtual-Machine.mad.lab
My KDC is Active Directory Domain Services running on the domain controller.
I am now enabling Kerberos, and that part goes through fine. The Ambari wizard then stops all the services, which is also fine. After that, it tries to start all the services on the kerberized cluster.
At that point my NameNode does not start; the logs are below. Please suggest a solution.
stderr:
stdout:
2017-10-15 13:58:14,561 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/current/hadoop-client/libexec'}, 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2017-10-15 13:58:18,931 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-testknox@MAD.LAB'] {'user': 'hdfs'}
2017-10-15 13:58:19,157 - Waiting for this NameNode to leave Safemode due to the following conditions: HA: False, isActive: True, upgradeType: None
2017-10-15 13:58:19,157 - Waiting up to 19 minutes for the NameNode to leave Safemode...
2017-10-15 13:58:19,158 - Execute['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://nhknox-virtual-machine.mad.lab:8020 -safemode get | grep 'Safe mode is OFF''] {'logoutput': True, 'tries': 115, 'user': 'hdfs', 'try_sleep': 10}
safemode: Call From nhknox-virtual-machine.mad.lab/127.0.1.1 to nhknox-virtual-machine.mad.lab:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2017-10-15 13:58:44,683 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://nhknox-virtual-machine.mad.lab:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. safemode: Call From nhknox-virtual-machine.mad.lab/127.0.1.1 to nhknox-virtual-machine.mad.lab:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2017-10-15 13:59:16,062 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://nhknox-virtual-machine.mad.lab:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2017-10-15 13:59:45,987 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://nhknox-virtual-machine.mad.lab:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2017-10-15 14:00:15,016 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://nhknox-virtual-machine.mad.lab:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
Created 10-16-2017 02:16 AM
Is your hostname exactly "nhknox-Virtual-Machine.mad.lab" or "nhknox-virtual-machine.mad.lab"? We have seen problems caused by mixed-case hostnames. If that is the case for you, change the hostname to all lowercase, restart the Ambari server, and then try restarting the NameNode.
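A quick way to check this on the node is sketched below. It is only an illustration: `has_uppercase` is a hypothetical helper name, and the name reported by `socket.getfqdn()` depends on how the host resolves its own name, so compare it with what Ambari shows for the host.

```python
import socket


def has_uppercase(hostname: str) -> bool:
    """Return True if the hostname contains any uppercase letter."""
    return hostname != hostname.lower()


if __name__ == "__main__":
    # socket.getfqdn() returns the fully qualified name as this host resolves it.
    fqdn = socket.getfqdn()
    if has_uppercase(fqdn):
        print(f"{fqdn} contains uppercase letters; consider {fqdn.lower()}")
    else:
        print(f"{fqdn} is already all lowercase")
```

For the hostname in the question, `has_uppercase("nhknox-Virtual-Machine.mad.lab")` is true, while the lowercase form that appears in the log lines is not flagged.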
If your hostname is already all lowercase, can you please paste your NameNode logs (under /var/log/hadoop-hdfs/hdfs)?
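Before collecting the logs, it may also help to confirm whether anything is listening on the NameNode RPC port at all, since the output above shows "Connection refused" on port 8020. The following is a minimal sketch; `port_is_open` is a hypothetical helper, and the host and port in the usage note are simply taken from the log lines in the question.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `port_is_open("nhknox-virtual-machine.mad.lab", 8020)` on the node returns False if the NameNode process never bound the port, which would suggest the failure happens at startup and the NameNode log should show why.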
Thanks,
Aditya
Created 10-16-2017 04:13 AM
Glad that it worked for you. Can you please accept the answer and start a new thread for this, so that the main thread doesn't get sidetracked? Please share more logs related to the alerts in the new thread.
Thanks,
Aditya