I have set up HDFS in an HA environment. My question: does ZooKeeper need to be installed on all nodes (NameNodes and DataNodes)?
@Edgar Daeds, I am using Ambari and it shows me one ZooKeeper client installed. I think the client needs to be installed, because otherwise how would ZooKeeper know which NameNode is active and which is standby?
Something strange happened with my cluster. When connectivity is lost between the two NameNode machines and the current standby NameNode is on the same machine as Ambari, that NameNode becomes active as soon as connectivity is lost. What is strange is what happens when the standby NameNode is on the other machine, where Ambari isn't installed: it always remains standby after connectivity is lost. Am I missing something in my HA cluster setup?
It's tough to explain the scenario, but I am trying my best.
I have set up NameNode HA using Ambari. My active NameNode is on the machine where Ambari is installed, alongside a ZooKeeper client and a ZooKeeper server. The standby NameNode is on another machine, where the ZooKeeper client is not installed but a ZooKeeper server is. Now, when network connectivity between these two machines goes down, the standby NameNode does not become active. Or what should the behaviour be in this case, given that there is already one active NameNode in the cluster?
So my question is: is the ZooKeeper client the one that tells the NameNode to be active or passive?
So, it is the ZooKeeper Failover Controller (ZKFC) that tells the NameNode to become active or passive. You should have three JournalNodes and two ZooKeeper Failover Controllers, one per NameNode.
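If you want to see what state each NameNode is currently in, `hdfs haadmin -getServiceState <id>` prints `active` or `standby`. Below is a minimal Python sketch that wraps that command. The helper names are mine, and the NameNode IDs you pass in (for example `nn1` and `nn2`) are only an assumption; use whatever IDs your cluster defines in `dfs.ha.namenodes.<nameservice>`.

```python
import subprocess


def service_state_command(namenode_id):
    """Build the haadmin command for one NameNode ID (e.g. "nn1", an assumed ID)."""
    return ["hdfs", "haadmin", "-getServiceState", namenode_id]


def get_service_state(namenode_id, run=subprocess.run):
    """Return the state printed by haadmin, normally "active" or "standby".

    `run` is injectable so the function can be exercised without a cluster.
    """
    result = run(
        service_state_command(namenode_id),
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()


def describe_cluster(states):
    """Summarise a {namenode_id: state} mapping as a short sentence."""
    active = sorted(nn for nn, state in states.items() if state == "active")
    if len(active) == 1:
        return "exactly one active NameNode: " + active[0]
    if not active:
        return "no active NameNode"
    return "more than one active NameNode: " + ", ".join(active)
```

On a cluster host with the HDFS client configured, something like `describe_cluster({nn: get_service_state(nn) for nn in ("nn1", "nn2")})` run before and after the network drop would show whether a failover actually happened.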
Do they need the ZooKeeper client installed? I'm not sure. They will not use the client command-line utilities (zkCli.sh etc.), but they will need the ZooKeeper JARs. They might have those in their own lib folder, or they might depend on the ZooKeeper client to provide them. I have seen both approaches.
Normally Ambari installs all needed clients during an install, but it has been known to miss one. So if you want to make sure, install the client from the host page (the +Add button on the host's page in the Ambari web UI).
But I think it's unlikely that this is the problem. If it were, you should see some ClassNotFoundExceptions somewhere (in the ZooKeeper Failover Controller logs).
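To check the logs for that quickly, here is a small Python sketch that scans a log file for class-loading errors. The function names are mine, and the log path is something you have to supply yourself, since the ZKFC log location depends on how your cluster is configured. I also match `NoClassDefFoundError`, which is my own addition because missing JARs often show up that way too.

```python
# Strings that usually indicate a missing JAR on the classpath.
MARKERS = ("ClassNotFoundException", "NoClassDefFoundError")


def find_class_loading_errors(lines):
    """Return (line_number, text) for every line mentioning a class-loading error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan one log file; `path` is whatever your ZKFC log file is called."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_class_loading_errors(handle)
```

If `scan_log_file(...)` comes back empty for the failover controller logs on both NameNode hosts, missing ZooKeeper JARs are probably not the cause.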