Created on 12-04-2017 11:10 AM - edited 08-17-2019 08:09 PM
In our Ambari cluster, both NameNodes come up as standby.
To force one of them to become active we run:
hdfs haadmin -transitionToActive --forceactive master01
Illegal argument: Unable to determine service address for namenode 'master01'
So instead of a transition we get "Unable to determine service address".
What does this indicate, and how can we fix this issue?
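For reference, the argument to haadmin has to be the NameNode ID defined in dfs.ha.namenodes.<nameservice>, not the hostname, which is usually what this error means; the configured IDs can be looked up like this (the hdfsha nameservice name is taken from the zkCli output later in this thread, adjust to your setup):
# show the configured nameservice and the NameNode IDs that haadmin accepts
hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.hdfsha
# then retry with one of the returned IDs (for example nn1) instead of the hostname
hdfs haadmin -transitionToActive --forceactive nn1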
Created 12-04-2017 02:25 PM
When running the NameNode format we get:
17/12/04 14:23:34 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Timed out waiting for response from loggers
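The "loggers" in this message are the JournalNodes of the shared edits directory; a quick reachability check (8485 is the default JournalNode RPC port, adjust if yours differs):
# show which JournalNodes the NameNode is trying to reach
hdfs getconf -confKey dfs.namenode.shared.edits.dir
# verify each JournalNode host listed there answers on its RPC port
nc -zv <journalnode-host> 8485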
Created 12-04-2017 10:00 PM
Can you attach your NameNode log? How does your /etc/hosts entry look?
103.114.28.13 master01.sys4.com
103.114.28.12 master03.sys4.com
or IP / hostname / alias:
103.114.28.13 master01.sys4.com master01
103.114.28.12 master03.sys4.com master03
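Either form should work as long as forward lookup is consistent on both NameNode hosts; a quick way to see what the OS resolver actually returns (getent follows /etc/nsswitch.conf, so it covers both DNS and /etc/hosts):
# check what the resolver returns on each NameNode host
getent hosts master01.sys4.com
getent hosts master03.sys4.com
# and confirm the local fully-qualified hostname
hostname -f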
What is the output of
$ zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha
If this cluster is not critical, then you might have to go through these steps.
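One way to reinitialize the HA state in ZooKeeper (an alternative to the rmr /hadoop-ha approach described further down in this thread) is the zkfc format command; a rough sketch, assuming you run it as the hdfs user while a restart is acceptable:
# stop both ZKFailoverControllers, then recreate the /hadoop-ha znode
hdfs zkfc -formatZK
# restart the ZKFailoverControllers and the NameNodes afterwards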
Created on 12-04-2017 10:07 PM - edited 08-17-2019 08:08 PM
I ran it on the first machine, master01 (this is a standby machine):
[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]
Created 12-04-2017 10:14 PM
My feeling is that no matter which NameNode we start, every NameNode ends up in standby, and that is the big problem.
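For what it is worth, the state each NameNode reports can be confirmed with haadmin (nn1/nn2 are the usual Ambari NameNode IDs and are an assumption here):
# query the HA state that each NameNode reports
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2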
Created 12-04-2017 10:23 PM
Regarding the hosts file: we do not use it, we have a DNS server and all hosts are resolved. We already checked that, and all IPs point to the right hostnames.
Created 12-04-2017 10:22 PM
[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]
Next
[zk: localhost:2181(CONNECTED) 2] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
What output do you get?
Created 12-04-2017 10:25 PM
[zk: localhost:2181(CONNECTED) 6] ls /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 7] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
Node does not exist: /hadoop-ha/hdfsha/ActiveStandbyElectorLock
[zk: localhost:2181(CONNECTED) 8] ls /hadoop-ha/hdfsha
[]
[zk: localhost:2181(CONNECTED) 9]
Created 12-04-2017 10:34 PM
Can you delete the entry in ZooKeeper and restart?
[zk: localhost:2181(CONNECTED) 1] rmr /hadoop-ha
Validate that there is no hadoop-ha entry:
[zk: localhost:2181(CONNECTED) 2] ls /
Then restart all components of the HDFS service. This will create a new znode with the correct lock (from the Failover Controller).
Also see https://community.hortonworks.com/questions/12942/how-to-clean-up-files-in-zookeeper-directory.html#
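If both NameNodes still come back as standby after the restart, it is worth confirming that a ZKFailoverController is actually running on each NameNode host and looking at its log for election errors; the log path below is the typical Ambari default and may differ on your cluster:
# confirm the ZKFC process is up on each NameNode host
ps -ef | grep -i DFSZKFailoverController
# check the ZKFC log for ActiveStandbyElector errors (usual Ambari default path)
tail -n 100 /var/log/hadoop/hdfs/hadoop-hdfs-zkfc-*.log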
Created 12-04-2017 10:38 PM
We removed /hadoop-ha from the master01 machine and restarted HDFS. I will update soon; from my experience it takes around 20 minutes.
Created on 12-04-2017 10:57 PM - edited 08-17-2019 08:08 PM
This is the status after a full HDFS restart. As you can see, we still get both in standby :-(