Created 08-03-2018 12:42 AM
I have installed these components using Ambari install:
HDFS, YARN, MapReduce2, Tez, Hive, ZooKeeper, Spark2
When I let Ambari start all services, HDFS, MapReduce2, ZooKeeper, and YARN started successfully, but the procedure is stuck at "Start Hive Metastore". Task log attached at the end, but I think the critical lines are:
2018-08-02 16:36:00,100 - Execute['yarn rmadmin -refreshSuperUserGroupsConfiguration'] {'user': 'yarn'}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 200, in <module>
    HiveMetastore().execute()
(omitted many lines)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn rmadmin -refreshSuperUserGroupsConfiguration' returned 255.
18/08/02 16:36:01 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
In my setup, `vm-097` runs the ResourceManager, and the Hive server is on another machine, `vm-100`. Both are virtual machines running Ubuntu 16.04 on Windows hosts. I went to `vm-100` and ran `yarn rmadmin -refreshSuperUserGroupsConfiguration` myself; it shows similar errors:
yarn@vm-100:~$ yarn rmadmin -refreshSuperUserGroupsConfiguration
18/08/02 17:04:56 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 17:04:56 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
(many lines omitted)
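Since the same StandbyException repeats on every retry, it may help to tally how many failover attempts the client made and how long it slept in total. Below is a minimal Python sketch for that, assuming the log text has been saved or pasted into a string. The names `RETRY_RE` and `summarize_retries` are hypothetical helpers, not part of Ambari or Hadoop, and the pattern only matches the retry wording seen in this log.

```python
import re

# Matches the retry summary wording seen in this log, e.g.
# "after 2 failover attempts. Trying to failover after sleeping for 25094ms."
RETRY_RE = re.compile(
    r"after (\d+) failover attempts\. "
    r"Trying to failover after sleeping for (\d+)ms"
)


def summarize_retries(log_text):
    """Return (highest_attempt_number, total_sleep_ms) found in log_text."""
    matches = RETRY_RE.findall(log_text)
    attempts = [int(attempt) for attempt, _ in matches]
    total_sleep_ms = sum(int(ms) for _, ms in matches)
    return (max(attempts) if attempts else 0, total_sleep_ms)


# Two retry messages copied from the task log in this post.
sample = (
    "over null after 1 failover attempts. "
    "Trying to failover after sleeping for 20085ms. "
    "over null after 2 failover attempts. "
    "Trying to failover after sleeping for 25094ms."
)

print(summarize_retries(sample))
```

This only condenses the log. It does not explain why the ResourceManager reports that it is not Active.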
I googled for `yarn rmadmin` and did not find much helpful info. Hope someone here could help. Thanks!
Attachment 1: Ambari "Hive Metastore Start" task log
stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 200, in <module>
    HiveMetastore().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 55, in start
    refresh_yarn()
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive.py", line 401, in refresh_yarn
    Execute("yarn rmadmin -refreshSuperUserGroupsConfiguration", user = params.yarn_user)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn rmadmin -refreshSuperUserGroupsConfiguration' returned 255.
18/08/02 16:36:01 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 16:36:02 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
    at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
    at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 1 failover attempts. Trying to failover after sleeping for 20085ms.
18/08/02 16:36:22 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 2 failover attempts. Trying to failover after sleeping for 25094ms.
18/08/02 16:36:47 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 3 failover attempts. Trying to failover after sleeping for 16001ms.
18/08/02 16:37:03 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 4 failover attempts. Trying to failover after sleeping for 20361ms.
18/08/02 16:37:23 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 5 failover attempts. Trying to failover after sleeping for 31694ms.
18/08/02 16:37:55 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 6 failover attempts. Trying to failover after sleeping for 32062ms.
18/08/02 16:38:27 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 7 failover attempts. Trying to failover after sleeping for 15377ms.
18/08/02 16:38:42 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 8 failover attempts. Trying to failover after sleeping for 26500ms.
18/08/02 16:39:09 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 9 failover attempts. Trying to failover after sleeping for 26405ms.
18/08/02 16:39:35 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 10 failover attempts. Trying to failover after sleeping for 15172ms.
18/08/02 16:39:51 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 11 failover attempts. Trying to failover after sleeping for 27700ms.
18/08/02 16:40:18 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 12 failover attempts. Trying to failover after sleeping for 39587ms.
18/08/02 16:40:58 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 13 failover attempts. Trying to failover after sleeping for 19571ms.
18/08/02 16:41:18 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 14 failover attempts. Trying to failover after sleeping for 17980ms.
18/08/02 16:41:35 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 15 failover attempts. Trying to failover after sleeping for 25732ms.
18/08/02 16:42:01 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 16 failover attempts. Trying to failover after sleeping for 28892ms.
18/08/02 16:42:30 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 17 failover attempts. Trying to failover after sleeping for 32208ms.
18/08/02 16:43:02 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 18 failover attempts. Trying to failover after sleeping for 31339ms.
18/08/02 16:43:34 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 19 failover attempts. Trying to failover after sleeping for 17716ms.
18/08/02 16:43:51 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! (identical stack trace omitted), while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 20 failover attempts. Trying to failover after sleeping for 39465ms.
18/08/02 16:44:31 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 21 failover attempts. Trying to failover after sleeping for 27786ms. 18/08/02 16:44:59 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 22 failover attempts. Trying to failover after sleeping for 39978ms. 18/08/02 16:45:39 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 23 failover attempts. Trying to failover after sleeping for 17613ms. 18/08/02 16:45:56 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 24 failover attempts. Trying to failover after sleeping for 23792ms. 18/08/02 16:46:20 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 25 failover attempts. Trying to failover after sleeping for 16330ms. 18/08/02 16:46:36 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 26 failover attempts. Trying to failover after sleeping for 44810ms. 18/08/02 16:47:21 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 27 failover attempts. Trying to failover after sleeping for 33074ms. 18/08/02 16:47:54 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 28 failover attempts. Trying to failover after sleeping for 21553ms. 18/08/02 16:48:16 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) , while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 29 failover attempts. Trying to failover after sleeping for 41820ms. refreshSuperUserGroupsConfiguration: ResourceManager null is not Active! 
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485) at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163) at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) stdout: 2018-08-02 16:35:57,445 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634 2018-08-02 16:35:57,472 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf 2018-08-02 16:35:57,817 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634 2018-08-02 16:35:57,826 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf 2018-08-02 16:35:57,828 - Group['livy'] {} 2018-08-02 16:35:57,829 - Group['spark'] {} 2018-08-02 16:35:57,830 - Group['hdfs'] {} 2018-08-02 16:35:57,830 - Group['hadoop'] {} 2018-08-02 16:35:57,830 - Group['users'] {} 2018-08-02 16:35:57,831 - User['hive'] {'gid': 'hadoop', 
'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None} 2018-08-02 16:35:57,832 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None} 2018-08-02 16:35:57,834 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['livy', 'hadoop'], 'uid': None} 2018-08-02 16:35:57,835 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None} 2018-08-02 16:35:57,836 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['spark', 'hadoop'], 'uid': None} 2018-08-02 16:35:57,837 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None} 2018-08-02 16:35:57,838 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None} 2018-08-02 16:35:57,839 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop'], 'uid': None} 2018-08-02 16:35:57,841 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None} 2018-08-02 16:35:57,842 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None} 2018-08-02 16:35:57,843 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555} 2018-08-02 16:35:57,845 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'} 2018-08-02 16:35:57,854 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if 2018-08-02 16:35:57,855 - Group['hdfs'] {} 2018-08-02 16:35:57,855 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop', u'hdfs']} 2018-08-02 
16:35:57,856 - FS Type: HDFS 2018-08-02 16:35:57,856 - Directory['/etc/hadoop'] {'mode': 0755} 2018-08-02 16:35:57,883 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'} 2018-08-02 16:35:57,884 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777} 2018-08-02 16:35:57,913 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'} 2018-08-02 16:35:57,927 - Skipping Execute[('setenforce', '0')] due to not_if 2018-08-02 16:35:57,928 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'} 2018-08-02 16:35:57,931 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'} 2018-08-02 16:35:57,932 - Changing owner for /var/run/hadoop from 1017 to root 2018-08-02 16:35:57,932 - Changing group for /var/run/hadoop from 1007 to root 2018-08-02 16:35:57,933 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'cd_access': 'a'} 2018-08-02 16:35:57,934 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'} 2018-08-02 16:35:57,941 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'} 2018-08-02 16:35:57,945 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'} 2018-08-02 16:35:57,957 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644} 2018-08-02 16:35:57,977 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'} 2018-08-02 16:35:57,978 - 
File['/usr/hdp/3.0.0.0-1634/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755} 2018-08-02 16:35:57,979 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'} 2018-08-02 16:35:57,987 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644} 2018-08-02 16:35:57,993 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755} 2018-08-02 16:35:57,999 - Skipping unlimited key JCE policy check and setup since it is not required 2018-08-02 16:35:58,507 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf 2018-08-02 16:35:58,526 - call['ambari-python-wrap /usr/bin/hdp-select status hive-server2'] {'timeout': 20} 2018-08-02 16:35:58,567 - call returned (0, 'hive-server2 - 3.0.0.0-1634') 2018-08-02 16:35:58,569 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634 2018-08-02 16:35:58,609 - File['/var/lib/ambari-agent/cred/lib/CredentialUtil.jar'] {'content': DownloadSource('http://vm-097:8080/resources/CredentialUtil.jar'), 'mode': 0755} 2018-08-02 16:35:58,611 - Not downloading the file from http://vm-097:8080/resources/CredentialUtil.jar, because /var/lib/ambari-agent/tmp/CredentialUtil.jar already exists 2018-08-02 16:36:00,100 - Execute['yarn rmadmin -refreshSuperUserGroupsConfiguration'] {'user': 'yarn'} Command failed after 1 tries
Attachment 2: `yarn rmadmin` output on `vm-100`
yarn@vm-100:~$ yarn rmadmin -refreshSuperUserGroupsConfiguration
18/08/02 17:38:04 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 17:38:05 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
    at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
    at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 1 failover attempts. Trying to failover after sleeping for 19136ms.
^C
Attachment 3: `/etc/hosts` content on `vm-100`
yarn@vm-100:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 vm-100
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.100.1.161 vm-097
10.100.1.162 vm-100
10.100.1.163 vm-136
10.100.1.164 vm-137
10.100.1.165 vm-138
Created 08-03-2018 02:51 AM
The issue appears to be that YARN (on node 10.100.1.161) is not reachable on port 8141.
There are a few possibilities here:
1. The YARN ResourceManager process is not running (this seems unlikely from the details you provided).
2. Port 8141 is blocked, or nothing is listening on it.
3. Node 10.100.1.161 is not reachable from the other nodes in the cluster.
Can you verify the following on node 10.100.1.161:
# netstat -plan | grep 8141
# ps -ef | grep resourcemanager
# telnet 10.100.1.161 8141
Check the firewall as well.
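The telnet check above can also be scripted, which makes it easier to repeat from every node in the cluster. The following is only a minimal sketch: the function name `port_is_reachable` is made up for illustration, and it tests nothing more than whether a TCP connection can be opened. It says nothing about whether the ResourceManager behind the port is Active.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it mirrors `telnet host port`
    and only proves that something accepts connections on that port.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, running `port_is_reachable("10.100.1.161", 8141)` on `vm-100` would show whether the Hive host can reach the admin port that appears in the log.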
Created 08-03-2018 03:48 AM
Thanks @Ravi for your replies.
Here are the outputs from machine 10.100.1.161:
yarn@vm-097:~$ netstat -plan | grep 8141
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:8141 0.0.0.0:* LISTEN 28581/java
yarn@vm-097:~$ ps -ef | grep resourcemanager yarn 24639 24510 0 20:37 pts/0 00:00:00 grep --color=auto resourcemanager yarn 28581 1 1 17:19 ? 00:02:07 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_resourcemanager -Dhdp.version=3.0.0.0-1634 -Djava.net.preferIPv4Stack=true -Dhdp.version=3.0.0.0-1634 -Dyarn.id.str= -Dyarn.policy.file=hadoop-policy.xml -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir -Dservice.libdir=/usr/hdp/3.0.0.0-1634/hadoop-yarn/./,/usr/hdp/3.0.0.0-1634/hadoop-yarn/lib,/usr/hdp/3.0.0.0-1634/hadoop-hdfs/./,/usr/hdp/3.0.0.0-1634/hadoop-hdfs/lib,/usr/hdp/3.0.0.0-1634/hadoop/./,/usr/hdp/3.0.0.0-1634/hadoop/lib -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY -Drm.audit.logger=INFO,RMAUDIT -Dyarn.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.file=hadoop-yarn-resourcemanager-vm-097.log -Dyarn.home.dir=/usr/hdp/3.0.0.0-1634/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=:/usr/hdp/3.0.0.0-1634/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.0.0.0-1634/hadoop/lib/native/Linux-amd64-64:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/3.0.0.0-1634/hadoop/lib/native -Xmx1024m -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dhadoop.log.file=hadoop-yarn-resourcemanager-vm-097.log -Dhadoop.home.dir=/usr/hdp/3.0.0.0-1634/hadoop -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
yarn@vm-097:~$ telnet 10.100.1.161 8141
Trying 10.100.1.161...
Connected to 10.100.1.161.
Escape character is '^]'.
I did not specify anything about HA during installation. According to Ambari's YARN configs web UI, `yarn.resourcemanager.ha.enabled` has value `false`. Could this be the reason?
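Since the port is open but the RM answers with a StandbyException, it may help to look at the state the ResourceManager itself reports. The ResourceManager REST API exposes a cluster information resource at `/ws/v1/cluster/info` whose response contains an `haState` field. Below is a rough sketch that only parses such a response. The function name `rm_ha_state` is hypothetical, and the web port (commonly 8088) is an assumption that was not confirmed anywhere in this thread.

```python
import json


def rm_ha_state(cluster_info_json: str) -> str:
    """Extract haState from the body returned by GET /ws/v1/cluster/info.

    Hypothetical helper for illustration. Returns "UNKNOWN" when the
    field is missing instead of guessing a value.
    """
    info = json.loads(cluster_info_json)
    return info.get("clusterInfo", {}).get("haState", "UNKNOWN")
```

The body could be fetched with something like `curl http://vm-097:8088/ws/v1/cluster/info` (host from this thread, port assumed) and passed to the function as a string. Any value other than `ACTIVE` would be consistent with the "is not Active" message in the logs.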
Created 08-03-2018 02:59 AM
Also, let me know the following:
1. Have you enabled ResourceManager HA on this cluster?
2. If not, please check whether the property yarn.resourcemanager.ha.enabled is set to true.
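Besides the Ambari web UI, the property can be read from the yarn-site.xml that is actually deployed on the host. The following is a small sketch under assumptions: the helper name `get_hadoop_property` is invented for illustration, and the file location is a guess based on the conf dir shown in the task log (`/usr/hdp/3.0.0.0-1634/hadoop/conf`).

```python
import xml.etree.ElementTree as ET
from typing import Optional, Union


def get_hadoop_property(xml_content: Union[str, bytes], name: str) -> Optional[str]:
    """Return the value of a property from Hadoop-style XML configuration.

    Hypothetical helper for illustration. Expects the usual
    <configuration><property><name/><value/></property></configuration>
    layout and returns None when the property is not present.
    """
    root = ET.fromstring(xml_content)
    for prop in root.iter("property"):
        prop_name = prop.findtext("name")
        if prop_name is not None and prop_name.strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None
```

Reading the file as bytes (for example with `pathlib.Path(...).read_bytes()`) and calling `get_hadoop_property(content, "yarn.resourcemanager.ha.enabled")` would show whether the deployed file agrees with what Ambari displays.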
Created 08-03-2018 06:01 AM
What are the HDP and Ambari versions?
Kindly provide the rpm -qa output for Ambari, HDFS, YARN, HIVE.
Would it be possible for you to upload the blueprint configuration?
Created 08-03-2018 07:03 AM
My HDP version is HDP-3.0.0.0 (3.0.0.0-1634)
Ambari Version is 2.7.0.0
I don't have rpm on Ubuntu, but the `dpkg` outputs are:
yarn@vm-097:~$ dpkg -l | grep -i ambari
ii ambari-agent 2.7.0.0-897 amd64 Ambari Agent
ii ambari-infra-solr 2.7.0.0-897 amd64 [[description]]
ii ambari-infra-solr-client 2.7.0.0-897 amd64 [[description]]
ii ambari-metrics-assembly 2.7.0.0-897 amd64 Ambari Metrics Assembly
ii ambari-server 2.7.0.0-897 amd64 Ambari Server
yarn@vm-097:~$ dpkg -l | grep -i hdfs
ii hadoop-3-0-0-0-1634-hdfs 3.1.0.3.0.0.0-1634 all The Hadoop Distributed File System
ii hadoop-3-0-0-0-1634-hdfs-datanode 3.1.0.3.0.0.0-1634 all Hadoop Data Node
ii hadoop-3-0-0-0-1634-hdfs-journalnode 3.1.0.3.0.0.0-1634 all Hadoop HDFS JournalNode
ii hadoop-3-0-0-0-1634-hdfs-namenode 3.1.0.3.0.0.0-1634 all The Hadoop namenode manages the block locations of HDFS files
ii hadoop-3-0-0-0-1634-hdfs-secondarynamenode 3.1.0.3.0.0.0-1634 all Hadoop Secondary namenode
ii hadoop-3-0-0-0-1634-hdfs-zkfc 3.1.0.3.0.0.0-1634 all Hadoop HDFS failover controller
ii libhdfs0-3-0-0-0-1634 3.1.0.3.0.0.0-1634 amd64 Hadoop Filesystem Library
ii ranger-3-0-0-0-1634-hdfs-plugin 1.1.0.3.0.0.0-1634 all Ranger HDFS plugin component runs within namenode to provoide enterprise security using ranger framework
ii sqoop-3-0-0-0-1634 1.4.7.3.0.0.0-1634 all Sqoop allows easy imports and exports of data sets between databases and the Hadoop Distributed File System (HDFS).
yarn@vm-097:~$ dpkg -l | grep -i yarn
ii atlas-metadata-3-0-0-0-1634 1.0.0.3.0.0.0-1634 all Atlas is an application framework which allows for a complex directed-acyclic-graph of tasks for processing data and is built atop Apache Hadoop YARN.
ii hadoop-3-0-0-0-1634-yarn 3.1.0.3.0.0.0-1634 all The Hadoop NextGen MapReduce (YARN)
ii livy2-3-0-0-0-1634 0.5.0.3.0.0.0-1634 all Livy is an open source REST interface for interacting with Spark2 from anywhere. It supports executing snippets of code or programs in a Spark2 context that runs locally or in YARN.
ii ranger-3-0-0-0-1634-yarn-plugin 1.1.0.3.0.0.0-1634 all Ranger yarn plugin component runs within namenode to provide enterprise security using ranger framework
ii spark2-3-0-0-0-1634-yarn-shuffle 2.3.1.3.0.0.0-1634 all Spark Yarn Shuffle jar
yarn@vm-097:~$ dpkg -l | grep -i hive ii atlas-metadata-3-0-0-0-1634-hive-plugin 1.0.0.3.0.0.0-1634 all Atlas Hive plugin component runs with hive using HIVE_AUX_JARS_PATH=/hook/hive ii cpio 2.11+dfsg-5ubuntu1 amd64 GNU cpio -- a program to manage archives of files ii hive-3-0-0-0-1634 3.1.0.3.0.0.0-1634 all Hive is a data warehouse infrastructure built on top of Hadoop ii hive-3-0-0-0-1634-hcatalog 3.1.0.3.0.0.0-1634 all Apache Hcatalog is a data warehouse infrastructure built on top of Hadoop ii hive-3-0-0-0-1634-jdbc 3.1.0.3.0.0.0-1634 all Provides libraries necessary to connect to Apache Hive via JDBC ii hive-warehouse-connector-3-0-0-0-1634 1.0.0.3.0.0.0-1634 all A library to load data into Apache Spark™ SQL DataFrames from ii oozie-3-0-0-0-1634-sharelib-hive 4.3.1.3.0.0.0-1634 all hive shared libraries for oozie workflow engine ii oozie-3-0-0-0-1634-sharelib-hive2 4.3.1.3.0.0.0-1634 all hive2 shared libraries for oozie workflow engine ii ranger-3-0-0-0-1634-hive-plugin 1.1.0.3.0.0.0-1634 all Ranger Hive plugin component runs within hiveserver2 to provoide enterprise security using ranger framework ii ubuntu-keyring 2012.05.19 all GnuPG keys of the Ubuntu archive ii unzip 6.0-20ubuntu1 amd64 De-archiver for .zip files ii zip 3.0-11 amd64 Archiver for .zip files
Created 08-03-2018 07:07 AM
And here is a blueprint.json
Created 08-03-2018 07:48 AM
In HDP 3.0 the Hive database MUST NOT be on the same host as the Ambari database. Can you do the following on a host that does not house the Ambari database?
# yum install -y mysql-server
# chkconfig mysqld --level 345 on
# yum install -y mysql-connector-java
# Harden mysql server
mysql_secure_installation
Then, as the root user (assuming the root password is "welcome1" and that the hive user, database, and password are all "hive"):
mysql -u root -pwelcome1
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost';
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
quit;
Then log in as hive:
mysql -u hive -phive
create database hive;
show databases;
quit;
See the attached Guozhen Li.jpg. Test the connection; it MUST succeed before the metastore can start. That should work.
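Before running the connection test in Ambari, it can help to confirm that the database port is reachable from the metastore host at all. The following is a minimal Python sketch of such a check, not something Ambari itself provides; the function name `can_connect` is made up for illustration, and the check covers network reachability only, not credentials or the JDBC driver.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("db-host.example", 3306)` would probe MySQL's default port on a hypothetical host name; substitute the host and port of your own metastore database.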
Created 08-03-2018 04:49 PM
I did put the Hive database on a different host. My ambari-server is running on `vm-097`, and the Hive database is on `vm-100`. I am using Postgres (version 10) for the Hive database, though. I also created the database and granted all privileges in Postgres. The "test connection" in Ambari says this connection is OK. The Hive database looks like this:
hive@vm-100:~$ psql
psql (10.4 (Ubuntu 10.4-2.pgdg16.04+1))
Type "help" for help.

hive=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 hive      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | hive=CTc/postgres
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | postgres=CTc/postgres+
           |          |          |             |             | =c/postgres
(4 rows)

hive=> \c hive
You are now connected to database "hive" as user "hive".

hive=> \dt *
                    List of relations
   Schema   |          Name           | Type  |  Owner
------------+-------------------------+-------+----------
 pg_catalog | pg_aggregate            | table | postgres
 pg_catalog | pg_am                   | table | postgres
 pg_catalog | pg_amop                 | table | postgres
 pg_catalog | pg_amproc               | table | postgres
 pg_catalog | pg_attrdef              | table | postgres
 pg_catalog | pg_attribute            | table | postgres
 pg_catalog | pg_auth_members         | table | postgres
 pg_catalog | pg_authid               | table | postgres
 pg_catalog | pg_cast                 | table | postgres
 pg_catalog | pg_class                | table | postgres
 pg_catalog | pg_collation            | table | postgres
 pg_catalog | pg_constraint           | table | postgres
 pg_catalog | pg_conversion           | table | postgres
 pg_catalog | pg_database             | table | postgres
 pg_catalog | pg_db_role_setting      | table | postgres
 pg_catalog | pg_default_acl          | table | postgres
 pg_catalog | pg_depend               | table | postgres
 pg_catalog | pg_description          | table | postgres
 pg_catalog | pg_enum                 | table | postgres
 pg_catalog | pg_event_trigger        | table | postgres
 pg_catalog | pg_extension            | table | postgres
 pg_catalog | pg_foreign_data_wrapper | table | postgres
 pg_catalog | pg_foreign_server       | table | postgres
 pg_catalog | pg_foreign_table        | table | postgres
 pg_catalog | pg_index                | table | postgres
 pg_catalog | pg_inherits             | table | postgres
 pg_catalog | pg_init_privs           | table | postgres
 pg_catalog | pg_language             | table | postgres
 pg_catalog | pg_largeobject          | table | postgres
 pg_catalog | pg_largeobject_metadata | table | postgres
 pg_catalog | pg_namespace            | table | postgres
 pg_catalog | pg_opclass              | table | postgres
 pg_catalog | pg_operator             | table | postgres
 pg_catalog | pg_opfamily             | table | postgres
 pg_catalog | pg_partitioned_table    | table | postgres
 pg_catalog | pg_pltemplate           | table | postgres
 pg_catalog | pg_policy               | table | postgres
 pg_catalog | pg_proc                 | table | postgres
 pg_catalog | pg_publication          | table | postgres
 pg_catalog | pg_publication_rel      | table | postgres
 pg_catalog | pg_range                | table | postgres
 pg_catalog | pg_replication_origin   | table | postgres
 pg_catalog | pg_rewrite              | table | postgres
 pg_catalog | pg_seclabel             | table | postgres
 pg_catalog | pg_sequence             | table | postgres
 pg_catalog | pg_shdepend             | table | postgres
 pg_catalog | pg_shdescription        | table | postgres
 pg_catalog | pg_shseclabel           | table | postgres
 pg_catalog | pg_statistic            | table | postgres
 pg_catalog | pg_statistic_ext        | table | postgres
 pg_catalog | pg_subscription         | table | postgres
 pg_catalog | pg_subscription_rel     | table | postgres
 pg_catalog | pg_tablespace           | table | postgres
 pg_catalog | pg_transform            | table | postgres
 pg_catalog | pg_trigger              | table | postgres
 pg_catalog | pg_ts_config            | table | postgres
 pg_catalog | pg_ts_config_map        | table | postgres
 pg_catalog | pg_ts_dict              | table | postgres
 pg_catalog | pg_ts_parser            | table | postgres
 pg_catalog | pg_ts_template          | table | postgres
 pg_catalog | pg_type                 | table | postgres
 pg_catalog | pg_user_mapping         | table | postgres
(62 rows)
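Every row in the `\dt *` listing belongs to the `pg_catalog` schema. As a rough aid for scanning output like this, here is a minimal Python sketch that keeps only tables outside the system schemas. The function name `non_system_tables` is made up for illustration, and the sketch assumes psql's default aligned layout with four `|`-separated columns (Schema, Name, Type, Owner).

```python
from typing import List, Tuple


def non_system_tables(dt_output: str) -> List[Tuple[str, str]]:
    """Return (schema, name) pairs for tables outside the system schemas."""
    system_schemas = {"pg_catalog", "information_schema"}
    tables = []
    for line in dt_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Data rows have four columns and "table" in the Type column;
        # the title, header, separator, and row-count lines do not.
        if len(fields) != 4 or fields[2] != "table":
            continue
        schema, name = fields[0], fields[1]
        if schema not in system_schemas:
            tables.append((schema, name))
    return tables
```

An empty result means the pasted listing contains only system tables; what that implies for the metastore depends on the setup, so treat it as a pointer for further checks rather than a diagnosis.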
Created 08-04-2018 05:02 AM
To answer my own question:
I resolved this issue by enabling ResourceManager High Availability (HA).
The steps are described here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.0.0/managing-high-availability/content/amb_enab...
I do not understand why it works, though. Hope someone can explain more.
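For anyone who wants to verify the fix on their own cluster, the failing command can be re-run outside Ambari to check whether it still reports the StandbyException quoted in the question. Below is a minimal Python sketch, assuming the `yarn` CLI is on the PATH and the script is run as the `yarn` user; the helper names are made up for illustration. It only detects the symptom and does not explain why enabling HA resolves it.

```python
import subprocess
from typing import Tuple

# Exception class name as it appears in the log output quoted in the question.
STANDBY_MARKER = "org.apache.hadoop.ipc.StandbyException"


def is_standby_failure(output: str) -> bool:
    """Return True if the command output reports a StandbyException."""
    return STANDBY_MARKER in output


def run_refresh() -> Tuple[int, str]:
    """Run the refresh command from this thread; return (exit code, output)."""
    proc = subprocess.run(
        ["yarn", "rmadmin", "-refreshSuperUserGroupsConfiguration"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the Hadoop CLI logs to stderr, so merge it
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout
```

Calling `run_refresh()` and passing its output to `is_standby_failure` gives a quick before-and-after comparison when changing the ResourceManager configuration.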