Yarn HA does not work (both Resource manager stays in standby state)

Explorer

Hello

We have a pretty old CDH 5.7 cluster that works fine. But when we try to add a second ResourceManager and enable high availability, both RMs remain in standby state and neither becomes active.

This seems to be a known issue, and the suggested fix is to run "yarn resourcemanager -format-state-store". Cloudera itself recommends it here (search for "standby"), and so do other articles on the web. However, running this and restarting the RMs did not solve our problem.

I also couldn't find anything unusual in the logs, and to make things even stranger, we have another CDH 5.7 cluster where we successfully enabled YARN high availability without issues.
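
Since one cluster works and the other does not, one way to narrow this down is to diff the HA-related settings of the two clusters. Below is a minimal sketch in Python, assuming you can export the effective yarn-site.xml from an RM host on each cluster. The function names are made up for illustration, and the list of property prefixes is only a starting point, not an exhaustive list of what can break the election.

```python
import xml.etree.ElementTree as ET

# Property names (or prefixes) that commonly matter for RM HA.
# This selection is an assumption; extend it for your setup.
HA_PREFIXES = (
    "yarn.resourcemanager.ha.",
    "yarn.resourcemanager.zk-address",
    "yarn.resourcemanager.cluster-id",
    "yarn.resourcemanager.recovery.enabled",
    "yarn.resourcemanager.store.class",
)


def ha_properties(xml_text):
    """Return the HA-related properties of a yarn-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith(HA_PREFIXES):
            props[name] = value
    return props


def diff_ha_properties(working_xml, broken_xml):
    """Map each differing property to a (working, broken) value pair."""
    working = ha_properties(working_xml)
    broken = ha_properties(broken_xml)
    return {
        name: (working.get(name), broken.get(name))
        for name in sorted(set(working) | set(broken))
        if working.get(name) != broken.get(name)
    }
```

Values that name hosts (for example the ZooKeeper quorum) will naturally differ between clusters. A property that is missing on one side, or a differing store class, is the more interesting kind of difference.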

Does anyone have an idea what's wrong? Has anyone run into this issue?

Thanks

Guy

3 REPLIES
Re: Yarn HA does not work (both Resource manager stays in standby state)

Master Guru
The quoted documentation also indicates that the specific issue that required that format was resolved in CDH 5.2.1 onwards, so you shouldn't necessarily be running that as a fix for your problem.

The RMs in HA mode run an election after they are both up. The log messages from the classes org.apache.hadoop.ha.ActiveStandbyElector, org.apache.hadoop.yarn.server.resourcemanager.ResourceManager and org.apache.zookeeper.ZooKeeper describe how that election proceeds.

I'd advise checking the logs for these classes and trying to spot what the failure is. It may be related to ZooKeeper or to some other configuration. Alternatively, share the RM logs via pastebin or similar.
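
To pull just those lines out of a large RM log, something like the following may help. It is a rough sketch, assuming each log line carries the fully qualified logger class name (which depends on your log4j pattern). The function name is made up for illustration.

```python
import re

# The three classes named above.
ELECTION_CLASSES = (
    "org.apache.hadoop.ha.ActiveStandbyElector",
    "org.apache.hadoop.yarn.server.resourcemanager.ResourceManager",
    "org.apache.zookeeper.ZooKeeper",
)

# Match a class name only when it is not the prefix of a longer
# class or package name (e.g. skip org.apache.zookeeper.ZooKeeperMain).
_PATTERN = re.compile(
    "(?:" + "|".join(re.escape(c) for c in ELECTION_CLASSES) + r")(?![\w.])"
)


def election_lines(lines):
    """Return the lines that were logged by one of the election classes."""
    return [line for line in lines if _PATTERN.search(line)]


if __name__ == "__main__":
    import sys

    # Usage: python election_lines.py /path/to/resourcemanager.log
    with open(sys.argv[1], errors="replace") as log_file:
        for line in election_lines(log_file):
            print(line, end="")
```

Run it against the log of each RM and compare the output around the time both RMs finished starting.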

Re: Yarn HA does not work (both Resource manager stays in standby state)

Expert Contributor

Hi @ni4ni

I think the solution is to format the RMStateStore:

yarn resourcemanager -format-state-store

source: https://stackoverflow.com/questions/39369149/resource-manager-does-not-transit-to-active-state-from-...

Re: Yarn HA does not work (both Resource manager stays in standby state)

During a manual ResourceManager failover, will my scheduled and running application IDs move to the standby ResourceManager, and if so, how? In HDFS NameNode HA, the JournalNodes keep track of the edit logs. Which daemon plays that role for the ResourceManager?