Problems with ZooKeeper

Explorer

Is it possible to install ZooKeeper via CM after the initial installation?  I am trying to start the "failoverController" to enable auto-failover, but CM keeps squawking about ZooKeeper not being available even though I manually installed ZK via the yum repository on the 3 nodes.

 

It is running on the 3 nodes as leader, follower, follower, but for some reason CM doesn't recognize that it's installed.  I thought maybe it has to be "registered" with CM in some way.  (Just FYI, I tried 3 times to install ZK during the initial CM install, but it would never start correctly even though there were zero errors during the inspection.)

 

# /usr/lib/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /usr/lib/zookeeper/bin/../conf/zoo.cfg
Mode: leader
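
To see all three roles at once rather than running zkServer.sh on each box, the quorum can also be polled remotely with ZooKeeper's four-letter "stat" admin command. A rough sketch — node1..node3 and port 2181 are placeholders for your actual quorum:

```shell
# Query each ZK node's current mode (leader/follower) using the
# "stat" four-letter command over the client port.
# Hostnames below are placeholders; adjust to your cluster.
for host in node1 node2 node3; do
  mode=$(echo stat | nc -w 2 "$host" 2181 | grep '^Mode:')
  echo "$host -> ${mode:-unreachable}"
done
```

A healthy 3-node ensemble should report exactly one leader and two followers, matching the zkServer.sh output above.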


Re: Problems with ZooKeeper

Hi Mike,

 

Did you try adding a ZooKeeper service to your cluster? From the home page, click the arrow on the same line as your cluster name, then click on Add Service.

 

If you want other services to use that ZooKeeper, don't forget to update their configuration as well.

 

Thanks,

Darren

Re: Problems with ZooKeeper

Explorer

Thanks, I had already started over by the time I got your response, but I did not see that option earlier, so it's good to know it's there.

 

By the way, toward the end of a fresh install before CM tries to start all of the services, I have to go chown all of the directories on all nodes to the correct users:groups because they are all owned by root.  This causes the services to fail when starting.

 

For example, the namenode will fail to format because the dfs directory is not owned by the hdfs user, and ZK will fail to start because its directory is not owned by the zookeeper user.  Perhaps a bug?

Re: Problems with ZooKeeper

You shouldn't need to chown any dirs. Which dirs did you need to chown? How did you install CDH? Do you have an abnormal default umask?

Re: Problems with ZooKeeper

Explorer

I'm doing a fresh install of CDH via CM on 5 fresh minimal CentOS 6.4 VMs.  My dirs are set up like:

 

/data/01/dfs

/data/01/zookeeper

/data/01/mapred

/data/01/local

/data/01/sqoop

 

At the end of the CM install after setting all of the directory paths for each role/service, the directories are still owned by root.  If I click "Continue", CM will attempt to start the services and the NN will fail to format and ZooKeeper will fail to start.  So before I click Continue, I have to go to the CLI and:

 

chown -R hdfs:hdfs /data/01/dfs

chown -R hdfs:hdfs /data/01/local

chown -R mapred:mapred /data/01/mapred

chown -R zookeeper:zookeeper /data/01/zookeeper
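
A quick way to confirm the ownership took before clicking Continue is a stat loop over the same paths. This is only a sketch — the user:dir pairs mirror the chown list above; adjust them to your layout:

```shell
# Verify each data dir is owned by the expected service user
# before letting CM attempt to start the services.
for spec in hdfs:/data/01/dfs hdfs:/data/01/local \
            mapred:/data/01/mapred zookeeper:/data/01/zookeeper; do
  user=${spec%%:*}      # expected owner (before the first colon)
  dir=${spec#*:}        # directory path (after the first colon)
  owner=$(stat -c '%U' "$dir" 2>/dev/null)
  if [ "$owner" = "$user" ]; then
    echo "$dir OK ($owner)"
  else
    echo "$dir: owned by ${owner:-<missing>}, expected $user"
  fi
done
```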

Re: Problems with ZooKeeper

Hi Mike,

 

I see what's happening. It sounds like the dirs were created before the configuration was passed to CM. If you are going to be manually creating the dirs, then CM assumes you'll set them up with the right perms and won't chown them. This is to help make sure we don't accidentally break something if you made a mistake.

 

If you want to have CM set this all up for you, then you could create /data/01/, but let the leaf directories be created by CM.
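
In other words, something like this before running the install wizard (a sketch, using the /data/01 mount point from the earlier post):

```shell
# Pre-create only the parent directory; leave the leaf dirs
# (dfs, mapred, zookeeper, local, sqoop) for CM to create so
# they end up owned by the right service users.
ROOT=/data/01    # adjust to your mount point
mkdir -p "$ROOT"
echo "created $ROOT; leaf dirs left for CM"
```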

 

Thanks,

Darren

Re: Problems with ZooKeeper

Explorer

Okay, I see. That makes sense, but it would be nice to add that to the documentation :-)

 

I think I have it all working nicely now and in good health, except the failover controllers won't start for some reason. I have HA and auto-failover enabled, but the 3 failover controllers are stopped.  Any idea how to address these errors when trying to start them?

 

FATAL org.apache.hadoop.ha.ZKFailoverController
Unable to start failover controller. Parent znode does not exist.
Run with -formatZK flag to initialize ZooKeeper

 

And this is from the stderr log:

Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: Configuration has no addresses that match local node's address. Please configure the system with mapred.ha.jobtracker.id
	at org.apache.hadoop.mapred.HAUtil.getSuffixIDs(HAUtil.java:241)
	at org.apache.hadoop.mapred.HAUtil.getJobTrackerId(HAUtil.java:158)
	at org.apache.hadoop.mapred.tools.MRZKFailoverController.create(MRZKFailoverController.java:123)
	at org.apache.hadoop.mapred.tools.MRZKFailoverController.main(MRZKFailoverController.java:172)
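
The second error suggests the node can't map its own address to one of the configured JobTracker IDs, and the message itself names the property to set. On a manually configured MR1 HA setup that would live in mapred-site.xml, along the lines of the sketch below — the value "jt1" is purely a placeholder for this node's logical JobTracker ID, and under CM this should normally be handled by the JobTracker HA wizard rather than edited by hand:

```xml
<!-- Placeholder sketch only: "jt1" must match one of the JobTracker
     IDs declared for the logical JobTracker in the HA configuration. -->
<property>
  <name>mapred.ha.jobtracker.id</name>
  <value>jt1</value>
</property>
```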

Re: Problems with ZooKeeper

Hi Mike,

 

Did you manually add the failover controllers, or did you use the wizard to enable automatic failover?

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Managi...

 

The wizard should have done the zookeeper initialization for you.

 

Thanks,

Darren

Re: Problems with ZooKeeper

Explorer

I installed ZK via CM during the initial install and enabled HA via the wizard.  It's odd because the FailoverControllers for the HDFS service show as started, but the FailoverControllers assigned to the MapReduce service show as stopped/down (hence the error).

Re: Problems with ZooKeeper

Sorry, I missed that you were talking about the MR failover controllers. Did you go through the Job Tracker High Availability Wizard? This should have initialized the mapreduce zk failover controllers (which have separate ZooKeeper data from the HDFS ones).

 

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Managi...
