Created on 08-07-2017 10:14 AM
Symptoms:
NameNode HA states: active_namenodes =[], standby_namenodes =[], unknown_namenodes =[(u'nn1',
Solution:
Possible causes, in order:
1) Ambari hits its timeout (the default is 5 seconds) and kills the process when the NameNode takes too long to start. You can change the timeout value in:
/var/lib/ambari-server/resources/common-services/HDFS/vXXXX/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If that is still not enough, to this:
@retry(times=50, sleep_time=25, backoff_factor=2, err_class=Fail)
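To get a rough feel for how much extra waiting each of these settings allows, here is a minimal sketch of a generic retry-with-backoff schedule. It is not Ambari's actual decorator code: the function name backoff_schedule and the max_sleep_time cap are hypothetical, and it assumes the sleep is multiplied by backoff_factor after every failed attempt, with no sleep after the last attempt. The real decorator may cap or compute the sleeps differently, so treat the numbers as an upper-bound illustration only.

```python
def backoff_schedule(times, sleep_time, backoff_factor, max_sleep_time=None):
    """Return the sleeps (in seconds) between attempts of a generic retry loop.

    Hypothetical helper for illustration; not taken from Ambari.
    """
    sleeps = []
    current = sleep_time
    for _ in range(times - 1):  # assumption: no sleep after the final attempt
        # Apply the optional cap (an assumed parameter) to each individual sleep.
        wait = current if max_sleep_time is None else min(current, max_sleep_time)
        sleeps.append(wait)
        current *= backoff_factor
    return sleeps


if __name__ == "__main__":
    # The three settings discussed above, with a hypothetical 60-second cap.
    for times, sleep_time in [(5, 5), (25, 25), (50, 25)]:
        schedule = backoff_schedule(times, sleep_time, backoff_factor=2, max_sleep_time=60)
        print("times=%d sleep_time=%d -> total wait %d s" % (times, sleep_time, sum(schedule)))
```

Without a cap the waits grow very quickly with backoff_factor=2, so in practice the number of attempts (times) is what mostly extends the window the NameNode has to come up.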
2) ZooKeeper may not be getting the status of the NameNode.
For this you can try