How to force a NameNode to become active

In our Ambari cluster both NameNodes are in standby state:

42878-capture.png

To force one of them to become active we ran:

 hdfs haadmin -transitionToActive --forceactive master01

but we get "Unable to determine service address":

Illegal argument: Unable to determine service address for namenode 'master01'

What does this indicate, and how do we fix it?

Michael-Bronson
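
A note on the error above: hdfs haadmin expects a NameNode ID as defined under dfs.ha.namenodes.<nameservice> in hdfs-site.xml, not a hostname. Passing a hostname such as master01 is a common cause of "Unable to determine service address". Below is a minimal sketch for listing the configured IDs and their current state. It assumes the hdfs CLI is on the PATH and that a single nameservice is configured. The function name show_nn_states is made up for illustration.

```shell
# Print "<namenode-id> <state>" for every NameNode ID of the nameservice.
# Assumption: exactly one nameservice is configured in dfs.nameservices.
show_nn_states() {
    ns=$(hdfs getconf -confKey dfs.nameservices)
    ids=$(hdfs getconf -confKey "dfs.ha.namenodes.$ns" | tr ',' ' ')
    for nn in $ids; do
        printf '%s %s\n' "$nn" "$(hdfs haadmin -getServiceState "$nn")"
    done
}

# Usage on a cluster node:
#   show_nn_states
```

The IDs printed are the values to pass to -transitionToActive. With automatic failover enabled, haadmin normally refuses a manual transition unless --forcemanual is given, so that flag may be needed instead of --forceactive.
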
38 REPLIES

When running namenode -format we get:

17/12/04 14:23:34 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Timed out waiting for response from loggers
Michael-Bronson
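
The "loggers" in this message are most likely the JournalNodes used by the Quorum Journal Manager, so the timeout suggests the NameNode could not reach a quorum of them. Below is a small sketch for extracting the JournalNode addresses from the shared edits URI so that each one can be checked. The function name journal_nodes_from_uri and the hostnames jn1 to jn3 are placeholders, not values from this cluster.

```shell
# Split a qjournal:// URI into one host:port per line.
# Example input (hypothetical): qjournal://jn1:8485;jn2:8485;jn3:8485/hdfsha
journal_nodes_from_uri() {
    uri=${1#qjournal://}   # drop the scheme
    uri=${uri%%/*}         # drop the trailing /<journalId>
    printf '%s\n' "$uri" | tr ';' '\n'
}

# Usage on a cluster node (assumes the hdfs CLI is on the PATH):
#   journal_nodes_from_uri "$(hdfs getconf -confKey dfs.namenode.shared.edits.dir)"
```

Each address printed should belong to a running JournalNode that is reachable from the NameNode host.
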

Master Mentor

@Michael Bronson

Can you attach your NameNode log? What do your /etc/hosts entries look like?

103.114.28.13  master01.sys4.com
103.114.28.12  master03.sys4.com

or in IP / hostname / alias form:

103.114.28.13  master01.sys4.com  master01
103.114.28.12  master03.sys4.com  master03

What is the output of

$ zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha

If this cluster is not critical then you might have to go through these steps

42902-capture.png

I ran it on the first machine, master01 (this is a standby machine):

[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]

Michael-Bronson

My feeling is that no matter which NameNode we start, every NameNode ends up in standby, and this is the big problem.

Michael-Bronson

Regarding the hosts file: we do not use it. We have a DNS server and all hosts are resolved. We already checked that, and all IPs point to the right hostnames.

Michael-Bronson

Master Mentor

@Michael Bronson

[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]

Next:

[zk: localhost:2181(CONNECTED) 2] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock

What output do you get?

[zk: localhost:2181(CONNECTED) 6] ls /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 7] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 8] ls /hadoop-ha/hdfsha
[]
[zk: localhost:2181(CONNECTED) 9]
Michael-Bronson

Master Mentor

@Michael Bronson

Can you delete the entry in ZooKeeper and restart?

[zk: localhost:2181(CONNECTED) 1] rmr /hadoop-ha

Validate that there is no hadoop-ha entry:

[zk: localhost:2181(CONNECTED) 2] ls /

Then restart all components of the HDFS service. This should create a new znode with the correct lock (held by the failover controller).

Also see https://community.hortonworks.com/questions/12942/how-to-clean-up-files-in-zookeeper-directory.html#
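
If the failover controllers (ZKFC) do not recreate the znode on restart, for example because they refuse to start while the parent znode is missing, it can usually be re-initialized by hand from one of the NameNode hosts. Below is a minimal sketch. It assumes the hdfs CLI is on the PATH, that it is run as the HDFS service user, and that both failover controllers are stopped. The function name reinit_ha_znode is hypothetical.

```shell
# Re-create the HA parent znode (/hadoop-ha/<nameservice>) in ZooKeeper.
# -nonInteractive aborts instead of prompting if the znode already exists.
reinit_ha_znode() {
    hdfs zkfc -formatZK -nonInteractive
}

# Usage on one NameNode host:
#   reinit_ha_znode
```

After this, starting the ZKFC on both NameNode hosts should let one of them acquire ActiveStandbyElectorLock and promote its NameNode to active.
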

We removed /hadoop-ha from the master01 machine and restarted HDFS. I will update soon; from my experience it takes around 20 minutes.

Michael-Bronson

This is the status after a full HDFS restart. As you can see, both are still standby :-(

42904-capture.png

Michael-Bronson