How can I get to know that the automatic component restart happened?

Contributor

I'm using Ambari 2.6.2.2, which has a "Service Auto Start Configuration" feature that restarts a component when it goes down unexpectedly.

However, when the automatic restart kicked in, I could not find an automatic "Restart" operation in the operations history.

How can I tell that an automatic component restart happened?

1 ACCEPTED SOLUTION

Re: How can I get to know that the automatic component restart happened?

Super Mentor

@Takefumi Oide

The Ambari UI operation log does not show operations that Ambari performs internally via the agents. Only explicit, user-initiated operations appear there.

The script "/usr/lib/ambari-agent/lib/ambari_agent/RecoveryManager.py" is responsible for recovery of service components.

For example, if we kill the AMS collector while Auto Restart is enabled for that component, messages like the following appear in the agent log and show that an "AUTO_EXECUTION_COMMAND" was performed:

# grep 'Adding recovery command START for component' /var/log/ambari-agent/ambari-agent.log
INFO 2018-10-09 06:33:52,324 Controller.py:410 - Adding recovery command START for component METRICS_COLLECTOR
.
.
INFO 2018-10-09 06:33:52,325 ActionQueue.py:113 - Adding AUTO_EXECUTION_COMMAND for role METRICS_COLLECTOR for service AMBARI_METRICS of cluster NewCluster to the queue.
.
INFO 2018-10-09 06:36:25,643 RecoveryManager.py:185 - current status is set to STARTED for METRICS_COLLECTOR
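
If you want to collect these events with a script instead of grep, something like the following could work. It is only a sketch: it assumes every agent log line has the same layout as the sample above (level, timestamp, source file and line, message), and the function name find_recovery_commands is made up for this example, not part of Ambari.

```python
import re
from datetime import datetime

# Assumed layout, taken from the sample output above:
# LEVEL YYYY-MM-DD HH:MM:SS,mmm File.py:LINE - message
LINE_RE = re.compile(
    r"^(?P<level>\w+) (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<source>\S+):(?P<lineno>\d+) - (?P<msg>.*)$"
)
RECOVERY_RE = re.compile(
    r"Adding recovery command (?P<command>\w+) for component (?P<component>\w+)"
)


def find_recovery_commands(lines):
    """Return (timestamp, command, component) for each recovery command line."""
    events = []
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # skip lines that do not follow the assumed layout
        hit = RECOVERY_RE.search(parsed.group("msg"))
        if hit:
            ts = datetime.strptime(parsed.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, hit.group("command"), hit.group("component")))
    return events


# Demo on two lines copied from the log excerpt above.
sample = [
    "INFO 2018-10-09 06:33:52,324 Controller.py:410 - "
    "Adding recovery command START for component METRICS_COLLECTOR",
    "INFO 2018-10-09 06:36:25,643 RecoveryManager.py:185 - "
    "current status is set to STARTED for METRICS_COLLECTOR",
]
for ts, command, component in find_recovery_commands(sample):
    print(ts, command, component)
```

To run it against a real agent, pass an open file handle for /var/log/ambari-agent/ambari-agent.log instead of the sample list.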

Or simply grep for that script's name:

# grep 'RecoveryManager.py' /var/log/ambari-agent/ambari-agent.log
INFO 2018-10-09 06:33:52,310 RecoveryManager.py:255 - METRICS_COLLECTOR needs recovery, desired = STARTED, and current = INSTALLED.
INFO 2018-10-09 06:36:25,643 RecoveryManager.py:185 - current status is set to STARTED for METRICS_COLLECTOR
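
The two RecoveryManager.py messages can also be paired up to estimate how long a recovery took. The sketch below assumes the message wording shown above ("needs recovery" and "current status is set to STARTED for") and that the first "needs recovery" line for a component marks the start of the recovery; recovery_durations is a hypothetical helper name, not an Ambari function.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
# Both patterns assume the message wording seen in the sample log lines.
NEEDS_RE = re.compile(
    r"^\w+ (?P<ts>\S+ \S+) RecoveryManager\.py:\d+ - "
    r"(?P<component>\w+) needs recovery"
)
STARTED_RE = re.compile(
    r"^\w+ (?P<ts>\S+ \S+) RecoveryManager\.py:\d+ - "
    r"current status is set to STARTED for (?P<component>\w+)"
)


def recovery_durations(lines):
    """Return (component, begin, end, seconds) for each completed recovery."""
    pending = {}  # component -> time recovery was first requested
    results = []
    for line in lines:
        line = line.strip()
        needs = NEEDS_RE.match(line)
        if needs:
            begin = datetime.strptime(needs.group("ts"), TS_FORMAT)
            # Keep the earliest request if the message repeats.
            pending.setdefault(needs.group("component"), begin)
            continue
        started = STARTED_RE.match(line)
        if started and started.group("component") in pending:
            component = started.group("component")
            begin = pending.pop(component)
            end = datetime.strptime(started.group("ts"), TS_FORMAT)
            results.append((component, begin, end, (end - begin).total_seconds()))
    return results


# Demo on the two lines from the grep output above.
sample = [
    "INFO 2018-10-09 06:33:52,310 RecoveryManager.py:255 - "
    "METRICS_COLLECTOR needs recovery, desired = STARTED, and current = INSTALLED.",
    "INFO 2018-10-09 06:36:25,643 RecoveryManager.py:185 - "
    "current status is set to STARTED for METRICS_COLLECTOR",
]
for component, begin, end, seconds in recovery_durations(sample):
    print(component, begin, end, seconds)
```

A STARTED line with no earlier "needs recovery" line for the same component is ignored, so ordinary user-initiated starts should not be counted, though that depends on the assumption above holding for your logs.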
