I was trying to configure an HDFS HA federated cluster, and I did so successfully. But the issue I am now facing is that whenever I stop the active namenode to test HA failover, the standby namenode doesn't become active until I start the stopped namenode again. Once I start the previously active namenode, the passive one becomes active.
It's weird. I previously configured this HA cluster using Ambari and tested the same scenario, and it worked fine: the active/passive namenode was elected automatically and immediately on failure.
Can anyone please guide me here?
Can you please share your hdfs-site.xml settings? HA in a federated cluster means you have four namenodes, right (assuming two namespaces)? Are you using ViewFS? Can you also please share your core-site.xml? How much data do you have? Why is HDFS federation required?
Check the following link.
@mqureshi: Yes, your assumption is completely right.
Here is my core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://uatCluster</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.default.link/hadoop4ind.india</name>
    <value>hdfs://hadoop4ind.india</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.default.link/hadoop5ind.india</name>
    <value>hdfs://hadoop5ind.india</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hdfsdata/journalnode</value>
  </property>
</configuration>
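As a side note, the documented ViewFS mount-table key format differs slightly from the one above: it is `fs.viewfs.mounttable.<mounttable-name>.link.<mount-point>`, where the mount-table name matches the authority in `fs.defaultFS` and the mount point starts with a slash. A hedged sketch, assuming the `uatCluster` name from `fs.defaultFS` above:

```xml
<!-- Sketch of the documented ViewFS mount-table key format; the mount-table
     name (uatCluster) is taken from fs.defaultFS, and the mount point is a
     dot-separated suffix beginning with "/". -->
<property>
  <name>fs.viewfs.mounttable.uatCluster.link./hadoop4ind.india</name>
  <value>hdfs://hadoop4ind.india</value>
</property>
```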
@Kuldeep Kulkarni: The fencing method used is sshfence.
I couldn't find the ZKFC log. I set this up too quickly, so I don't remember providing such a log path for ZooKeeper. I'll check and let you know if it's helpful.
@Viraj Vekaria - If you are using SSH fencing, then your namenode host needs to be up and running for fencing to work successfully. You can configure another fencing method to avoid this issue.
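A minimal sketch of what that could look like in hdfs-site.xml: `dfs.ha.fencing.methods` accepts a newline-separated list, so a no-op `shell(/bin/true)` fence after `sshfence` lets failover proceed even when the old active host is completely down (the private-key path below is an assumption; adjust it to your environment):

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- Try sshfence first; if the dead host is unreachable, fall back to a
       no-op shell fence so the standby can still be promoted. -->
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- Assumed path: the key the ZKFC uses to SSH into the other namenode host. -->
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
```

Note that `shell(/bin/true)` effectively skips fencing, so it is only safe when you are confident the old active cannot still be writing (e.g. its host is powered off).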
Please refer to the link below.
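Since the original symptom was that the standby never took over on its own, it is also worth double-checking that automatic failover (ZKFC) is enabled at all; without it, a standby only becomes active via manual `hdfs haadmin` commands. A sketch of the standard settings, assuming a three-node ZooKeeper ensemble (the hostnames are placeholders):

```xml
<!-- hdfs-site.xml: enable automatic failover for the nameservice(s) -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: ZooKeeper quorum used by the ZKFC daemons
     (zk1/zk2/zk3 hostnames are assumptions) -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.india:2181,zk2.india:2181,zk3.india:2181</value>
</property>
```

After adding these, the znode has to be initialized once with `hdfs zkfc -formatZK`, and a ZKFC daemon must be running alongside each namenode.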