Does Ambari auto start handle host shutdowns involving Ambari server itself?

I am running Ambari 2.7.3, HDP 3.0.1, and HDF 3.3.0 on RHEL 7.x. I've configured Auto Start globally ("recovery_enabled": "true", "recovery_type": "AUTO_START"), for all services, and for all components in Ambari. I'm testing under which circumstances services and components are started automatically.

It appears that:

  • Action: Kill the processes of a service, on a remote host (e.g. `killall -u nifi`)
    Result: the service is automatically restarted
  • Action: Shut down/restart a remote host
    Result: services on the host are restarted shortly after boot
  • Action: Shut down/restart the ambari-server host
    Result: ambari-server and ambari-agent start on boot. No services or components are restarted.

Is the third result expected? Is Auto Start meant to work when the Ambari server itself is restarted?

Re: Does Ambari auto start handle host shutdowns involving Ambari server itself?

Hi @Alex Willmer,


No, the third result is not expected. Ideally, when the ambari-agent host is shut down and restarted and ambari-agent comes back up, ambari-agent should start all the services automatically.


Can you please confirm the state of the services on that host before the shutdown? If a service was already stopped, ambari-agent will retain that state; otherwise, ambari-agent will try to start it.


You can check this in the ambari-server logs after ambari-agent starts; you should find log entries like this:


ambari-server/ambari-server.log:70978:14 Jan 2019 15:25:41,690 INFO [qtp-ambari-agent-1082725] HeartBeatHandler:464 - Recovery configuration set to RecoveryConfig{, type=AUTO_START, maxCount=6, windowInMinutes=60, retryGap=5, maxLifetimeCount=1024, components=METRICS_MONITOR,HBASE_REGIONSERVER,DATANODE,KAFKA_BROKER,LOGSEARCH_LOGFEEDER,HST_AGENT,SUPERVISOR,NODEMANAGER, recoveryTimestamp=1547450741689} 
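If it helps to see at a glance which components the server has actually enabled for recovery on a host, a line like the one above can be picked apart with a few lines of Python. This is only a rough sketch: the function name `parse_recovery_config` is made up for illustration, and the field layout is assumed from this single sample line, so other Ambari versions may log it differently.

```python
import re

# Scalar fields seen in the sample RecoveryConfig line (assumed layout).
SCALAR_KEYS = ("type", "maxCount", "windowInMinutes", "retryGap",
               "maxLifetimeCount", "recoveryTimestamp")


def parse_recovery_config(line):
    """Return the RecoveryConfig fields from one log line, or None if absent.

    Hypothetical helper: the format is inferred from a single sample line.
    """
    match = re.search(r"RecoveryConfig\{(.*?)\}", line)
    if match is None:
        return None
    body = match.group(1)

    config = {}
    for key in SCALAR_KEYS:
        field = re.search(r"\b" + key + r"=([^,\s]+)", body)
        if field:
            config[key] = field.group(1)

    # The component list contains commas itself, so read up to the next
    # ", name=" pair (or the end of the body) instead of the next comma.
    comps = re.search(r"\bcomponents=(.*?)(?:,\s*\w+=|$)", body)
    config["components"] = (
        [c for c in comps.group(1).split(",") if c] if comps else []
    )
    return config
```

Feeding it the sample line gives a dict with `type` set to `AUTO_START` and a list of eight components. If a component you expect to be restarted is missing from that list, or the type is not `AUTO_START`, the problem is probably in the recovery settings rather than in the agent.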

and something like this in the ambari-agent log:


HW15603:ambari-agent asnaik$ grep -nir 'RecoveryManager' * |grep -i 'current status is set to STARTED' 
ambari-agent.log:48593:INFO 2019-01-14 15:25:47,412 RecoveryManager.py:185 - current status is set to STARTED for HST_AGENT 
ambari-agent.log:48751:INFO 2019-01-14 15:27:15,042 RecoveryManager.py:185 - current status is set to STARTED for METRICS_MONITOR 
ambari-agent.log:48757:INFO 2019-01-14 15:27:34,098 RecoveryManager.py:185 - current status is set to STARTED for HBASE_REGIONSERVER 
ambari-agent.log:48851:INFO 2019-01-14 15:28:29,509 RecoveryManager.py:185 - current status is set to STARTED for LOGSEARCH_LOGFEEDER 
ambari-agent.log:48899:INFO 2019-01-14 15:29:06,880 RecoveryManager.py:185 - current status is set to STARTED for NODEMANAGER 
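To make the comparison less manual, the agent log lines can be checked against the list of components you expect to be auto-started. A minimal sketch, assuming the message format shown in the grep output above; the helper names `started_components` and `not_yet_started` are hypothetical, and the expected component list is something you supply yourself (for example from the server-side recovery configuration).

```python
import re

# Matches agent log lines such as:
#   ... RecoveryManager.py:185 - current status is set to STARTED for NODEMANAGER
# The line number after "RecoveryManager.py:" is allowed to vary.
STARTED_RE = re.compile(
    r"RecoveryManager\.py:\d+ - current status is set to STARTED for (\w+)"
)


def started_components(log_lines):
    """Return component names reported as STARTED, in first-seen order."""
    seen = []
    for line in log_lines:
        match = STARTED_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def not_yet_started(expected, log_lines):
    """Return the expected components with no STARTED entry in the log."""
    started = set(started_components(log_lines))
    return [name for name in expected if name not in started]
```

You could call `not_yet_started(expected, open(path))` with `path` pointing at your ambari-agent.log (its location depends on your installation). Any names it returns are components the agent has not reported as STARTED in that log, which narrows down where to look next. Keep in mind that a rotated or truncated log can make a component look missing when it is not.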


Hope this helps you troubleshoot this further.