Created on 10-13-2017 12:00 PM - edited 09-16-2022 05:24 AM
Hello,
Today the ResourceManager (RM) crashed, and I am trying to find out what caused it.
This is what I found in the cloudera-scm-agent log:
[13/Oct/2017 17:07:47 +0000] 2807 MainThread agent INFO PID '8311' associated with process '1209-yarn-RESOURCEMANAGER' with payload 'processname:1209-yarn-RESOURCEMANAGER groupname:1209-yarn-RESOURCEMANAGER from_state:RUNNING expected:0 pid:8311' exited unexpectedly
In the hadoop-yarn-resourcemanager log, these are the last entries before the crash:
2017-10-13 17:06:33,271 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 5188 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2017-10-13 17:06:33,292 WARN org.apache.hadoop.security.token.Token: No TokenRenewer defined for token kind HIVE_DELEGATION_TOKEN
2017-10-13 17:06:41,978 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 5189 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2017-10-13 17:06:41,992 WARN org.apache.hadoop.security.token.Token: No TokenRenewer defined for token kind HIVE_DELEGATION_TOKEN
2017-10-13 17:06:47,080 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 5190 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2017-10-13 17:06:47,094 WARN org.apache.hadoop.security.token.Token: No TokenRenewer defined for token kind HIVE_DELEGATION_TOKEN
2017-10-13 17:06:50,455 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 5191 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2017-10-13 17:06:50,456 WARN org.apache.hadoop.security.token.Token: No TokenRenewer defined for token kind HIVE_DELEGATION_TOKEN
Does anybody have an idea of what else I can check to find the root cause?
Also, the RM restarted without issues.
Thanks.
Created 10-13-2017 12:24 PM
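I just found the cause in syslog: the kernel OOM killer terminated the RM process (PID 8311, the same PID the agent log reported as exiting unexpectedly):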
kernel: [31684060.806133] Out of memory: Kill process 8311 (java) score 154 or sacrifice child
kernel: [31684060.806282] Killed process 8311 (java) total-vm:2108548kB, anon-rss:1259808kB, file-rss:0kB
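In case it helps anyone doing the same check, here is a minimal sketch of how the two logs could be correlated with a script. It assumes the line formats shown in this thread; the function names (`find_oom_kills`, `find_unexpected_exits`) and the regular expressions are my own illustration, not part of any Cloudera or Hadoop tool, and other kernel versions may word the OOM message differently.

```python
import re

# Matches the kernel "Killed process" line as it appears in this thread.
# Assumption: other kernels may format this message differently.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)

# Matches the cloudera-scm-agent "exited unexpectedly" line shown above.
AGENT_EXIT_RE = re.compile(r"pid:(?P<pid>\d+)' exited unexpectedly")


def find_oom_kills(lines):
    """Return one dict per process the OOM killer reports as killed."""
    kills = []
    for line in lines:
        m = KILLED_RE.search(line)
        if m:
            kills.append({
                "pid": int(m.group("pid")),
                "name": m.group("name"),
                "total_vm_kb": int(m.group("total_vm")),
                "anon_rss_kb": int(m.group("anon_rss")),
            })
    return kills


def find_unexpected_exits(lines):
    """Return the PIDs the agent log reports as exiting unexpectedly."""
    pids = set()
    for line in lines:
        m = AGENT_EXIT_RE.search(line)
        if m:
            pids.add(int(m.group("pid")))
    return pids


# Sample lines copied from this thread.
syslog_lines = [
    "kernel: [31684060.806133] Out of memory: Kill process 8311 (java) score 154 or sacrifice child",
    "kernel: [31684060.806282] Killed process 8311 (java) total-vm:2108548kB, anon-rss:1259808kB, file-rss:0kB",
]
agent_lines = [
    "[13/Oct/2017 17:07:47 +0000] 2807 MainThread agent INFO PID '8311' associated with process "
    "'1209-yarn-RESOURCEMANAGER' with payload 'processname:1209-yarn-RESOURCEMANAGER "
    "groupname:1209-yarn-RESOURCEMANAGER from_state:RUNNING expected:0 pid:8311' exited unexpectedly",
]

kills = find_oom_kills(syslog_lines)
exited = find_unexpected_exits(agent_lines)

# Processes that the agent saw die and that the kernel reports killing.
oom_killed_roles = [k for k in kills if k["pid"] in exited]
```

In practice you would pass in the lines of the real syslog and agent log files instead of the sample lists. A PID that shows up in both results points to the OOM killer rather than a fault inside the RM itself.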