Resource Manager HA and Yarn service-check is failing

Expert Contributor

We enabled YARN ResourceManager HA on our cluster (HDP 2.3.2 and Ambari 2.1.2.1), and it was working fine until we re-installed the Ranger KMS server on the cluster. While ResourceManager HA was working, one ResourceManager was shown as active and the other as standby. Now both are shown simply as ResourceManager, and the YARN service check fails with the following error message:

[Attached screenshot: 2998-rm.png]

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 142, in <module>
    ServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 138, in service_check
    raise Exception("Could not get json response from YARN API")
Exception: Could not get json response from YARN API
1 ACCEPTED SOLUTION

Master Guru

@rbalam - It is possible that both ResourceManagers are in standby state, or both in active state. Can you please run the commands below and check?

sudo -u yarn yarn rmadmin -getServiceState rm1

sudo -u yarn yarn rmadmin -getServiceState rm2

If you find that both RMs are in standby state, you can initiate a manual failover using the command below:

sudo -u yarn yarn rmadmin -transitionToActive --forcemanual rm1
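
For anyone who wants to script the check above, here is a minimal sketch. It assumes the RM IDs are rm1 and rm2 as in this thread (yours are listed under yarn.resourcemanager.ha.rm-ids in yarn-site.xml); the function names get_rm_state and classify_rm_states are made up for illustration and are not part of any Hadoop tool.

```shell
#!/bin/sh
# Sketch only. Assumes the RM IDs rm1 and rm2 used in this thread.

# get_rm_state ID: print the HA state that rmadmin reports for one RM.
get_rm_state() {
  sudo -u yarn yarn rmadmin -getServiceState "$1" 2>/dev/null
}

# classify_rm_states STATE1 STATE2: print a one-word diagnosis.
# Hypothetical helper; the labels are invented for this example.
classify_rm_states() {
  case "$1/$2" in
    active/standby|standby/active) echo "healthy" ;;
    standby/standby)               echo "both-standby" ;;
    active/active)                 echo "both-active" ;;
    *)                             echo "unknown" ;;
  esac
}

# Example usage on a cluster node (commented out so the file only defines functions):
# classify_rm_states "$(get_rm_state rm1)" "$(get_rm_state rm2)"
```

Only the "both-standby" result corresponds to the case where the manual failover command above applies. An "unknown" result usually means rmadmin could not reach one of the RMs, so check that first.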

3 REPLIES

Expert Contributor

@Kuldeep Kulkarni

Thanks for the update. This issue was resolved by Hortonworks Support Engineering. I am trying to catch up on what they did to fix it.

Super Collaborator

@Kuldeep Kulkarni I got the same error message. The only difference is that my environment is Kerberized and my RMs are not both in standby mode.

[yarn@m1 root]$ yarn rmadmin -getServiceState rm1

standby

[yarn@m1 root]$ yarn rmadmin -getServiceState rm2

active

Ambari doesn't show the state of either RM, and I get the same exception as above. I tried switching the roles, but that did not help. Any help is appreciated.
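
Since the exception says no JSON came back from the YARN API, one thing that may be worth trying is querying the ResourceManager REST API by hand to see what it actually returns. The sketch below is an assumption about how to do that, not what the Ambari service check itself runs. The cluster info endpoint (/ws/v1/cluster/info) and its haState field are part of the RM REST API, and 8088 is the default RM web port; the function names and the host m1 in the usage line are placeholders. On a Kerberized cluster you need a valid ticket (kinit) and a curl build with SPNEGO support for --negotiate to work.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# fetch_cluster_info HOST:PORT: fetch the RM cluster info document.
# --negotiate -u : makes curl authenticate with the current Kerberos ticket.
fetch_cluster_info() {
  curl --negotiate -u : -s "http://$1/ws/v1/cluster/info"
}

# extract_ha_state: read a response on stdin and print the haState value.
# Prints nothing if the response is not the expected JSON
# (for example an HTML error or login page).
extract_ha_state() {
  sed -n 's/.*"haState"[[:space:]]*:[[:space:]]*"\([A-Za-z_]*\)".*/\1/p'
}

# Example usage (commented out; replace m1:8088 with your RM web address):
# fetch_cluster_info m1:8088 | extract_ha_state
```

If this prints nothing for an RM, look at the raw curl output for that host. An authentication error or an HTML page instead of JSON would be consistent with the "Could not get json response" failure.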