Member since: 01-18-2014
Posts: 12
Kudos Received: 2
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1828 | 11-02-2014 11:47 AM
 | 21730 | 07-29-2014 04:50 AM
11-02-2014
11:47 AM
1 Kudo
If you start the cluster manually (that is, you start every service by hand), you don't need the slaves file or SSH at all; they are only used by the start scripts. On startup the DataNode contacts its configured NameNode and "offers" its service. So if you started the DataNode manually, check its log file to see why it cannot reach the NameNode. br
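As an illustration of "its configured NameNode": the DataNode takes the NameNode address from fs.defaultFS in core-site.xml. This is a minimal sketch; the hostname and port below are placeholders, not values from the thread.

```
<!-- core-site.xml: the DataNode reads the NameNode address from this property -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:8020</value>
</property>
```

If the DataNode log shows repeated connection retries to this address, the address itself (or DNS/firewalling in between) is usually the culprit.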
07-29-2014
04:35 PM
Thank you so much, your answer is absolutely correct. I went to each server and ran:

nn1: service zookeeper-server init --myid=1 --force
nn2: service zookeeper-server init --myid=2 --force
jt1: service zookeeper-server init --myid=3 --force

Earlier I had chosen an ID of 1 on every machine. I also corrected my zoo.cfg to ensure the right entries. Now it works and I am able to run sudo -u hdfs hdfs zkfc -formatZK. Thank you so much!
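For context, a minimal sketch of the zoo.cfg entries this refers to: every node in the ensemble needs the same set of server.N lines, and each node's myid must match its own N (which is what the --myid flag sets). The hostnames nn1/nn2/jt1 come from the post; the ports are ZooKeeper defaults.

```
# zoo.cfg (identical on all three nodes)
# server.N=host:peerPort:leaderElectionPort
server.1=nn1:2888:3888
server.2=nn2:2888:3888
server.3=jt1:2888:3888
```

The original failure mode here, every machine initialized with myid 1, makes the nodes unable to agree on who is who, so quorum never forms.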
01-22-2014
12:36 PM
Thanks for your reply. The Facebook link was interesting to read. Unfortunately our situation is a bit more complicated, since we are developing a product that gets installed in customers' datacenters and has to work with minimal manual interaction without losing any data. (You want your mobile phone bills to be correct 😉 ) If we go down that road, I would indeed follow your advice: shut down replication on the surviving cluster and use snapshots to restore the failed cluster when it comes back online. Regards Marc
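The snapshot-based restore mentioned above could be sketched roughly as follows. This assumes HDFS snapshots are used; the path /data, the snapshot name, and the hostname recovered-nn are illustrative placeholders, not details from the thread, and the commands require a running HDFS cluster.

```
# Enable snapshots on the replicated directory (once, on the surviving cluster)
hdfs dfsadmin -allowSnapshot /data

# Take a point-in-time snapshot before restoring the failed cluster
hdfs dfs -createSnapshot /data pre-restore

# Once the failed cluster is back, copy the snapshot contents over to it
hadoop distcp /data/.snapshot/pre-restore hdfs://recovered-nn:8020/data
```

Copying from the read-only .snapshot path rather than the live directory avoids racing against writes that arrive while the restore is in flight.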