
Timeline Server and Resource Manager keep crashing




I don't know if these two issues are related, but I can't keep the Timeline Server (TL server) and the Resource Manager (RM) running at the same time.

If I start the TL server and the RM together, both crash. If I start only the RM, it runs for about 30 minutes and then crashes. If I start only the TL server, it keeps running until I start the RM, and then both eventually crash.

Here's the error log from when the TL server doesn't start:

2018-05-30 16:17:43,163 INFO  rmapp.RMAppImpl ( - application_1526956266937_2575 State change from NEW to KILLED

2018-05-30 16:17:43,163 WARN  rmapp.RMAppImpl (<init>(423)) - The specific max attempts: 0 for application: 2576 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.

2018-05-30 16:17:43,163 INFO  rmapp.RMAppImpl ( - Recovering app: application_1526956266937_2576 with 1 attempts and final state = KILLED  

File "/usr/lib/python2.6/site-packages/resource_management/core/", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '  -H -E test -f /var/run/hadoop-yarn/yarn/ &&  -H -E pgrep -F /var/run/hadoop-yarn/yarn/' returned 1.

And here's a small snippet of what I think is the problem, from the yarn-yarn-resourcemanager<server>.log file:

2018-05-30 16:17:48,159 INFO  rmapp.RMAppImpl ( - application_1527213246626_10169 State change from NEW to FAILED
2018-05-30 16:17:48,159 INFO  rmapp.RMAppImpl ( - Recovering app: application_1527213246626_10170 with 0 attempts and final state = FAILED
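
To get a rough idea of how many applications the RM is replaying at startup, the "Recovering app" lines can be tallied by final state. This is only a minimal sketch, assuming the log lines keep the format shown in the snippets above; the function name `tally_recovered_apps` and the `sample` list are made up for illustration and are not part of any Hadoop or Ambari tooling.

```python
import re
from collections import Counter

# Matches RM log lines of the form seen above, e.g.
# "... Recovering app: application_1526956266937_2576 with 1 attempts and final state = KILLED"
RECOVERY_RE = re.compile(
    r"Recovering app: (application_\d+_\d+) with (\d+) attempts and final state = (\w+)"
)


def tally_recovered_apps(lines):
    """Count recovered applications by their final state (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = RECOVERY_RE.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


# Sample lines copied from the log snippets in this post.
sample = [
    "2018-05-30 16:17:43,163 INFO  rmapp.RMAppImpl ( - Recovering app: "
    "application_1526956266937_2576 with 1 attempts and final state = KILLED",
    "2018-05-30 16:17:48,159 INFO  rmapp.RMAppImpl ( - application_1527213246626_10169 "
    "State change from NEW to FAILED",
    "2018-05-30 16:17:48,159 INFO  rmapp.RMAppImpl ( - Recovering app: "
    "application_1527213246626_10170 with 0 attempts and final state = FAILED",
]

counts = tally_recovered_apps(sample)
```

To run it against the real log, pass an open file object instead of `sample`, since the function only iterates over lines. A very large total would suggest the RM is recovering a long backlog of applications on every restart, though that alone does not confirm it as the cause of the crashes.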