How to force a NameNode to become active

In our Ambari cluster, both NameNodes are in standby state.

42878-capture.png

To force one of them to become active, we ran:

 hdfs haadmin -transitionToActive --forceactive master01 

Illegal argument: Unable to determine service address for namenode 'master01'

But, as shown above, the command fails with "Unable to determine service address".

What does this indicate, and how can we fix it?

Michael-Bronson
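
The error above is consistent with hdfs haadmin expecting a NameNode service ID (one of the values listed in dfs.ha.namenodes.<nameservice>) rather than a hostname such as master01. A minimal sketch of how the IDs and their states could be listed, assuming the nameservice is hdfsha (the name that appears under /hadoop-ha later in this thread); the function names are illustrative, not part of Hadoop:

```shell
# Sketch: list the configured NameNode service IDs and print each one's HA state.
# Assumption: the nameservice is passed in by the caller (e.g. "hdfsha").

# Print one NameNode service ID per line for the given nameservice.
list_namenode_ids() {
  local nameservice="$1"
  hdfs getconf -confKey "dfs.ha.namenodes.${nameservice}" | tr ',' '\n' | tr -d ' '
}

# Print "<serviceId> <state>" for every configured NameNode.
show_ha_states() {
  local nameservice="$1" id
  for id in $(list_namenode_ids "$nameservice"); do
    echo "$id $(hdfs haadmin -getServiceState "$id" 2>&1)"
  done
}

# Example: show_ha_states hdfsha
```

Whatever IDs this prints (they depend on the cluster's hdfs-site.xml) are what -transitionToActive takes as its argument. When automatic failover is enabled, haadmin normally refuses manual transitions unless --forcemanual is given, so this is a diagnostic step rather than a guaranteed fix.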
38 REPLIES

When we run namenode -format, we get:

17/12/04 14:23:34 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Timed out waiting for response from loggers
Michael-Bronson

Master Mentor

@Michael Bronson

Can you attach your NameNode log? What do your /etc/hosts entries look like?

103.114.28.13  master01.sys4.com 
103.114.28.12 master03.sys4.com 

or, in IP / hostname / alias form:

103.114.28.13  master01.sys4.com   master01
103.114.28.12 master03.sys4.com    master03 

What is the output of

$ zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha

If this cluster is not critical, then you might have to go through these steps

avatar

42902-capture.png

I ran it on the first machine, master01 (this is a standby machine):

[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]

Michael-Bronson

My feeling is that no matter which NameNode we start, every NameNode ends up in standby, and this is the big problem.

Michael-Bronson

Regarding the hosts file: we do not use it. We have a DNS server, and all hosts resolve. We already checked that, and all IPs point to the right hostnames.

Michael-Bronson
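
A quick way to re-check resolution from each NameNode host is sketched below. It assumes getent is available (it consults both DNS and /etc/hosts, following the system resolver configuration); the function name is illustrative, and the hostnames in the example are the ones quoted earlier in this thread:

```shell
# Sketch: confirm that each NameNode hostname resolves on this machine.

# Print "<host> -> <ip>" or "<host> -> UNRESOLVED" for every argument.
check_resolution() {
  local host ip
  for host in "$@"; do
    # Take the first address returned; an empty result means no resolution.
    ip=$(getent hosts "$host" | awk 'NR==1 {print $1}') || true
    echo "$host -> ${ip:-UNRESOLVED}"
  done
}

# Example: check_resolution master01.sys4.com master03.sys4.com
```

Running the same check on every master makes it easier to spot a host that resolves differently from the others.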

Master Mentor

@Michael Bronson

[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]

Next

[zk: localhost:2181(CONNECTED) 2] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock

What output do you get?

avatar
[zk: localhost:2181(CONNECTED) 6] ls /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 7] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 8] ls /hadoop-ha/hdfsha
[]
[zk: localhost:2181(CONNECTED) 9]
Michael-Bronson
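
An empty /hadoop-ha/hdfsha with no ActiveStandbyElectorLock typically means that neither failover controller (ZKFC) currently holds the election lock, which would match both NameNodes staying in standby. The same checks can be scripted without an interactive session; this is only a sketch, and ZKCLI, ZK_SERVER and zk_cmd are assumed names to adapt to the local installation:

```shell
# Sketch: query the HA election znodes non-interactively.
# Assumptions: zkCli.sh is on PATH and ZooKeeper listens on localhost:2181.
ZKCLI="${ZKCLI:-zkCli.sh}"
ZK_SERVER="${ZK_SERVER:-localhost:2181}"

# Run a single zkCli command against the configured server.
zk_cmd() {
  "$ZKCLI" -server "$ZK_SERVER" "$@"
}

# Examples:
#   zk_cmd ls /hadoop-ha/hdfsha
#   zk_cmd get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
```

Repeating the ls after the ZKFC processes have been started shows whether an election lock ever gets created.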

Master Mentor

@Michael Bronson

Can you delete the entry in ZooKeeper and restart?

[zk: localhost:2181(CONNECTED) 1] rmr /hadoop-ha

Validate that there is no hadoop-ha entry:

[zk: localhost:2181(CONNECTED) 2] ls /

Then restart all components of the HDFS service. This will create a new znode with the correct lock (held by the failover controller).

Also see https://community.hortonworks.com/questions/12942/how-to-clean-up-files-in-zookeeper-directory.html#
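
After removing /hadoop-ha it may be worth confirming that the parent znode actually comes back. In stock Hadoop the ZKFC can refuse to start when the parent znode is missing, and hdfs zkfc -formatZK is the command that initializes it; whether an Ambari-managed restart does this automatically is an assumption to verify. A hedged sketch that only prints a suggestion and changes nothing (ZKCLI, ZK_SERVER and the helper name are illustrative):

```shell
# Sketch: after `rmr /hadoop-ha`, check whether the HA parent znode exists again.
# Assumptions: zkCli.sh is on PATH and ZooKeeper listens on localhost:2181.
ZKCLI="${ZKCLI:-zkCli.sh}"
ZK_SERVER="${ZK_SERVER:-localhost:2181}"

# Print a hint depending on whether hadoop-ha appears among the root znodes.
suggest_next_step() {
  local listing
  listing=$("$ZKCLI" -server "$ZK_SERVER" ls / 2>&1) || true
  case "$listing" in
    *hadoop-ha*)
      echo "parent znode present: restart the ZKFCs and NameNodes" ;;
    *)
      echo "parent znode missing: consider 'hdfs zkfc -formatZK' on one NameNode host" ;;
  esac
}

# Example: suggest_next_step
```

The helper deliberately stops at a printed hint, since -formatZK rewrites HA state in ZooKeeper and should be run by hand.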

avatar

We removed /hadoop-ha on the master01 machine and restarted HDFS. I will update soon; in my experience the restart takes around 20 minutes.

Michael-Bronson

This is the status after a full HDFS restart. As you can see, both NameNodes are still standby :(

42904-capture.png

Michael-Bronson