Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Resource Manager HA and Yarn service-check is failing

Expert Contributor

We enabled YARN ResourceManager HA on our cluster (HDP 2.3.2 and Ambari 2.1.2.1), and it worked fine until we re-installed the Ranger KMS server on the cluster. While ResourceManager HA was working, I saw one of them as the active ResourceManager and the other as standby. Now both are shown simply as ResourceManager, and when I run the service check on YARN it fails with the following error message:

2998-rm.png

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 142, in <module>
    ServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 138, in service_check
    raise Exception("Could not get json response from YARN API")
Exception: Could not get json response from YARN API
1 ACCEPTED SOLUTION

Master Guru

@rbalam - It is possible that both ResourceManagers are in the standby state, or both in the active state. Can you please run the commands below and check?

sudo -u yarn yarn rmadmin -getServiceState rm1

sudo -u yarn yarn rmadmin -getServiceState rm2

If you find that both RMs are in the standby state, you can initiate a manual failover with the command below:

sudo -u yarn yarn rmadmin -transitionToActive --forcemanual rm1
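The check described above can be sketched as a small shell helper. This is only an illustration: `classify_rm_states` is a hypothetical function name, not part of YARN or Ambari, and it assumes the two arguments are exactly the strings printed by `yarn rmadmin -getServiceState` (`active` or `standby`).

```shell
#!/bin/sh
# Hypothetical helper (not part of YARN or Ambari): interpret the two states
# printed by `yarn rmadmin -getServiceState rm1` and `... rm2`.
classify_rm_states() {
  case "$1/$2" in
    active/standby|standby/active) echo "ok" ;;            # exactly one active RM
    standby/standby)               echo "both-standby" ;;  # case addressed by -transitionToActive
    active/active)                 echo "both-active" ;;   # two active RMs
    *)                             echo "unknown" ;;       # empty or unexpected output
  esac
}

# Example wiring on a cluster node (commented out because it needs a live cluster):
# rm1_state=$(sudo -u yarn yarn rmadmin -getServiceState rm1)
# rm2_state=$(sudo -u yarn yarn rmadmin -getServiceState rm2)
# classify_rm_states "$rm1_state" "$rm2_state"
```

Only the `both-standby` result corresponds to the manual failover command above. Any other result probably needs a different investigation.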


3 REPLIES

Expert Contributor

@Kuldeep Kulkarni

Thanks for the update. This issue was resolved by Hortonworks Support Engineering. I am trying to catch up on what they did to fix it.

Super Collaborator

@Kuldeep Kulkarni I got the same error message. The only difference is that my environment is Kerberized and my RMs are not both in standby mode.

[yarn@m1 root]$ yarn rmadmin -getServiceState rm1

standby

[yarn@m1 root]$ yarn rmadmin -getServiceState rm2

active

Ambari doesn't show the state of the RMs, but I am getting the same exception as above. I tried switching the roles and that did not help. Any help is appreciated.
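Since the exception text says no JSON response came back from the YARN API, one way to narrow this down might be to query the ResourceManager REST endpoint by hand and see whether it returns JSON at all. The sketch below is an assumption-laden illustration, not what the Ambari service check itself runs. `extract_ha_state` is a hypothetical helper, `RM_HOST` is a placeholder, and port 8088 is the default ResourceManager web port, which may differ on your cluster. On a Kerberized cluster, `curl --negotiate -u :` needs a valid ticket from `kinit` first.

```shell
#!/bin/sh
# Hypothetical helper: read a ResourceManager REST reply on stdin and print the
# value of "haState" if present. Prints nothing when the reply is not the
# expected JSON (for example an HTML error or login page).
extract_ha_state() {
  tr -d '\n' | sed -n 's/.*"haState"[[:space:]]*:[[:space:]]*"\([A-Za-z_]*\)".*/\1/p'
}

# Example wiring (commented out because it needs a live cluster):
# RM_HOST and port 8088 are assumptions; use your RM web address.
# curl -s --negotiate -u : "http://RM_HOST:8088/ws/v1/cluster/info" | extract_ha_state
```

An empty result for both RM hosts would suggest the REST call itself is failing (authentication, address, or port) rather than the HA state being wrong.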