Explorer
Posts: 21
Registered: ‎03-15-2016

YARN HA does not work (both Resource Managers stay in standby state)

Hello,

We have a fairly old CDH 5.7 cluster that works fine. But when we add a second Resource Manager and enable high availability, both RMs remain in standby state and neither becomes active.

This seems to be a known issue, and the suggested fix is to run "yarn resourcemanager -format-state-store". Cloudera itself recommends it here (search for "standby"), and so do other articles on the web. However, running this command and restarting the RMs did not solve our problem.

I also couldn't find anything unusual in the logs, and to make things even stranger, we have another 5.7 cluster where we successfully enabled YARN high availability without issues.

Does anyone have an idea what's wrong? Has anyone run into this issue before?

Thanks,

Guy

Posts: 1,673
Kudos: 330
Solutions: 263
Registered: ‎07-31-2013

Re: YARN HA does not work (both Resource Managers stay in standby state)

The quoted documentation also indicates that the specific issue that required that format was resolved from CDH 5.2.1 onwards, so on CDH 5.7 it should not be needed as a fix for your problem.

The RMs in HA mode run an election after they are both up. Log lines from the classes org.apache.hadoop.ha.ActiveStandbyElector, org.apache.hadoop.yarn.server.resourcemanager.ResourceManager and org.apache.zookeeper.ZooKeeper detail this process.

I'd advise checking the logs for these classes and trying to spot what the failure is. It may be ZK related or some other configuration problem. Alternatively, share the RM logs via pastebin or similar.
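
To narrow an RM log down to the three classes named above, something like the sketch below may help. It assumes the log layout prints the fully qualified class name on each line, and the function name filter_election_logs is just an illustrative helper, not part of any Hadoop tooling.

```shell
#!/bin/sh
# Print only the lines of an RM log that mention the classes involved
# in the active/standby election.
# Usage: filter_election_logs <path-to-resourcemanager-log>
# Assumption: each log line carries the fully qualified class name.
filter_election_logs() {
    # -F treats each pattern as a fixed string, so the dots are literal.
    grep -F \
        -e 'org.apache.hadoop.ha.ActiveStandbyElector' \
        -e 'org.apache.hadoop.yarn.server.resourcemanager.ResourceManager' \
        -e 'org.apache.zookeeper.ZooKeeper' \
        -- "$1"
}
```

Run it against the log of each RM in turn (the log location depends on your installation) and compare what the two RMs report during the election right after startup.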
Expert Contributor
Posts: 115
Registered: ‎07-17-2017

Re: YARN HA does not work (both Resource Managers stay in standby state)

Hi @ni4ni

I think the solution is to format the RMStateStore:

yarn resourcemanager -format-state-store

source: https://stackoverflow.com/questions/39369149/resource-manager-does-not-transit-to-active-state-from-...
