PROBLEM: The ResourceManager fails to start with the following errors:
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r 9e75108092247d96ce7d70839b6945e9eba2a0b7; compiled by 'jenkins' on 2014-11-04T04:31Z
STARTUP_MSG: java = 1.7.0_67
************************************************************/
2014-11-04 08:41:08,705 INFO resourcemanager.ResourceManager (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
2014-11-04 08:41:10,636 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:211)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1229)
Caused by: java.io.IOException: Login failure for rm/ip-172-31-32-22.ec2.internal@EXAMPLE.COM from keytab /etc/security/keytabs/rm.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:935)
    at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:243)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.doSecureLogin(ResourceManager.java:1109)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:209)
    ... 2 more
Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user..
2014-11-04 08:41:10,641 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1077)) - Transitioning to standby state
2014-11-04 08:41:10,642 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1087)) - Transitioned to standby state
2014-11-04 08:41:10,643 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1233)) - Error starting ResourceManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:211)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1229)
Caused by: java.io.IOException: Login failure for rm/ip-172-31-32-22.ec2.internal@EXAMPLE.COM from keytab /etc/security/keytabs/rm.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:935)
    at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:243)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.doSecureLogin(ResourceManager.java:1109)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:209)
    ... 2 more
Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user
ROOT CAUSE: The active ResourceManager uses the service principal of the standby ResourceManager, and vice versa. This is tracked as YARN-2805 (HDP bug BUG-26831). Both bugs have since been resolved.
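To confirm that a given ResourceManager is picking up the other node's principal, the "Login failure" line in its log can be compared against the host it runs on. The following is a minimal Python sketch of that check, not an official tool: the function names are hypothetical, and the regular expression only assumes the message format seen in the log above.

```python
import re

# Matches the "Login failure for <principal> from keytab <path>:" message
# as it appears in the ResourceManager log above.
LOGIN_FAILURE = re.compile(
    r"Login failure for (?P<principal>\S+) from keytab (?P<keytab>[^\s:]+)"
)


def principal_host(principal):
    """Return the host part of a service/host@REALM principal, or None."""
    match = re.fullmatch(r"[^/@]+/(?P<host>[^/@]+)@[^/@]+", principal)
    return match.group("host") if match else None


def check_login_failure(log_line, local_host):
    """Extract principal and keytab from a log line and compare hosts.

    Returns None if the line is not a login failure message.
    """
    match = LOGIN_FAILURE.search(log_line)
    if match is None:
        return None
    host = principal_host(match.group("principal"))
    return {
        "principal": match.group("principal"),
        "keytab": match.group("keytab"),
        # False means this node tried to log in as a principal that
        # belongs to a different host (or has no host part at all).
        "host_matches": host is not None and host.lower() == local_host.lower(),
    }
```

Run it on each ResourceManager host with that host's fully qualified name as `local_host`. A `host_matches` value of `False` is consistent with the swapped-principal behaviour described here, although a failed login can have other causes, such as a keytab that does not contain the principal.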
SOLUTION: If you are on HDP 2.2.0, raise a support case with Hortonworks (HWX) to get a hotfix.
WORKAROUND: In Ambari, hardcode the principal entry "rm/_HOST@EXAMPLE.COM" in the YARN configuration, replacing the "_HOST" part with the actual hostname of the active and standby ResourceManager respectively.
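The substitution in the workaround can be illustrated with a short Python sketch. It only builds the strings to enter in Ambari and does not touch the cluster. The helper name is hypothetical, and the second hostname is a made-up placeholder for the other ResourceManager; only the first hostname comes from the log above. The affected setting is assumed to be the `yarn.resourcemanager.principal` property.

```python
def hardcode_principal(principal_template, hostname):
    """Replace the _HOST placeholder in a principal with a fixed hostname."""
    if "_HOST" not in principal_template:
        raise ValueError("principal has no _HOST placeholder: " + principal_template)
    return principal_template.replace("_HOST", hostname)


# Principal entry as configured before the workaround.
TEMPLATE = "rm/_HOST@EXAMPLE.COM"

# First hostname is taken from the log above; the second is a hypothetical
# placeholder for the other ResourceManager node.
rm_hosts = ["ip-172-31-32-22.ec2.internal", "rm2.example.internal"]

for host in rm_hosts:
    # Each ResourceManager gets the principal that carries its own hostname.
    print(host, "->", hardcode_principal(TEMPLATE, host))
```

For the first host this yields `rm/ip-172-31-32-22.ec2.internal@EXAMPLE.COM`, which matches the principal in the log. Each value applies only to the ResourceManager running on that host.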