Created on 08-01-2017 03:30 PM - edited 09-16-2022 05:02 AM
The NodeManager started successfully but died after 8-10 minutes with the following error:
2017-08-01 08:34:55,349 INFO client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm1
2017-08-01 08:34:55,373 WARN retry.RetryInvocationHandler (RetryInvocationHandler.java:handleException(217)) - Exception while invoking ResourceTrackerPBClientImpl.registerNodeManager over rm1. Not retrying because failovers (30) exceeded maximum allowed (30)
2017-08-01 08:34:55,373 ERROR nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:serviceStart(229)) - Unexpected error starting NodeStatusUpdater
2017-08-01 08:34:55,373 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.authorize.AuthorizationException: User nm/zaldn8.r1-core.r1.zal.net@DEV.HADOOP.R1-CORE.R1.ZAL.NET (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB: this service is only accessible by nm/172.20.176.119@DEV.HADOOP.R1-CORE.R1.ZAL.NET
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.authorize.AuthorizationException: User nm/zaldn8.r1-core.r1.zal.net@DEV.HADOOP.R1-CORE.R1.ZAL.NET (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB: this service is only accessible by nm/172.20.176.119@DEV.HADOOP.R1-CORE.R1.ZAL.NET
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:230)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:302)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:547)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:594)
Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User nm/zaldn8.r1-core.r1.zal.net@DEV.HADOOP.R1-CORE.R1.ZAL.NET (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB: this service is only accessible by nm/172.20.176.119@DEV.HADOOP.R1-CORE.R1.ZAL.NET
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
    at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:70)
    at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy84.registerNodeManager(Unknown Source)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:305)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:224)
    ... 6 more
This looks like a DNS issue, but the hostname -f command returns the correct hostname. Do you have any suggestions on how to resolve it?
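The useful detail in the error is that the NodeManager presents a principal with its hostname, while the ResourceManager says it expects one with an IP address. A small Python sketch can pull both principals out of the message and flag that mismatch. The function names (`diagnose`, `host_part`, `is_ip`) are made up for this illustration, and reading an IP in the expected principal as a name-resolution problem on the ResourceManager side is an assumption, not something the log states.

```python
import ipaddress
import re

# Matches the AuthorizationException text shown in the log above.
PATTERN = re.compile(
    r"User (?P<actual>\S+) \(auth:KERBEROS\) is not authorized for protocol "
    r"interface (?P<protocol>\S+): this service is only accessible by "
    r"(?P<expected>\S+)"
)


def host_part(principal):
    """Return the host component of a principal like nm/host@REALM."""
    if "/" not in principal:
        return ""
    return principal.split("/", 1)[1].split("@", 1)[0]


def is_ip(text):
    """True if text is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def diagnose(log_line):
    """Extract the presented and expected hosts, or None if no match."""
    match = PATTERN.search(log_line)
    if match is None:
        return None
    expected_host = host_part(match.group("expected"))
    return {
        "actual_host": host_part(match.group("actual")),
        "expected_host": expected_host,
        # Assumption: an IP here hints that the server side could not
        # map the client's address back to a hostname.
        "expected_is_ip": is_ip(expected_host),
    }
```

Run against the message above, this reports `zaldn8.r1-core.r1.zal.net` as the presented host and `172.20.176.119` as the expected one, which points the investigation at name resolution on the ResourceManager rather than on the NodeManager.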
Created 08-02-2017 05:22 AM
I found the solution.
There was an incorrect hostname entry in the /etc/hosts file on the ResourceManager node. As a result, NodeManager registration failed, because the ResourceManager does not accept requests from an unauthorized host.
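For anyone hitting the same error, here is a rough Python sketch of how the hosts file on the ResourceManager node could be checked for a given NodeManager. It only covers two common mistakes (the FQDN not being the first name on the IP's line, and the FQDN being mapped to a different IP), which may or may not match what was wrong in this case. The function names `parse_hosts` and `check_host_entry` are hypothetical helpers written for this example.

```python
def parse_hosts(text):
    """Return a list of (ip, [names]) pairs from hosts-file text."""
    entries = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


def check_host_entry(text, ip, fqdn):
    """Return a list of problems found for this ip/fqdn pair."""
    problems = []
    entries = parse_hosts(text)
    wanted = fqdn.lower()

    # The first name on the first line for an IP is normally treated
    # as the canonical name for that address.
    for addr, names in entries:
        if addr == ip:
            if names[0].lower() != wanted:
                problems.append(
                    "%s maps to %s first, expected %s" % (ip, names[0], fqdn)
                )
            break

    # The same FQDN listed under a different address.
    for addr, names in entries:
        if addr != ip and wanted in (n.lower() for n in names):
            problems.append("%s is also listed under %s" % (fqdn, addr))

    return problems


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        hosts_text = handle.read()
    # IP and hostname taken from the error message in the question.
    for problem in check_host_entry(
        hosts_text, "172.20.176.119", "zaldn8.r1-core.r1.zal.net"
    ):
        print(problem)
```

An empty result only means these two checks passed. If DNS is consulted before or instead of the hosts file on that node, the lookup order in the resolver configuration matters as well.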
Thanks
Khireswar