Prevent automatic restart of CDH services after a system crash

New Contributor

Does anyone know how the cloudera-scm-agent service determines which services were running when it restarts after a system crash and reboot?

I would like to be able to prevent CDH services from automatically starting when the cloudera-scm-agent service starts after a system crash.

I've been looking at the log file for the cloudera-scm-agent but I didn't see any info level messages that would help me determine where the agent is getting information to determine which services were running before the crash.

I've also looked at tables in the database used by the cloudera-scm-server-db service but have not found anything obvious that appears to control which services are automatically started when the cloudera-scm-agent service starts (I thought this might be controlled by the 'configured_state' field in the services table but that does not seem to have any effect on which services are started).

Any help is greatly appreciated.

5 REPLIES

Master Collaborator

Hi @PNW

If you want to disable the automatic restart of Cloudera services, open the configuration page of the service you'd like to manage and search for "Automatically Restart Process". The option appears for each role within the service; uncheck it for the roles you don't want restarted.

Regards,

Chethan YM

New Contributor

I don’t want to disable automatic restart of processes associated with CDH services under normal conditions. I only want to disable the automatic restart when cloudera-scm-agent starts if the host OS crashes and reboots.

I’ve disabled automatic startup of the cloudera-scm-server and cloudera-scm-agent services after a system boot but cloudera-scm-agent/supervisord still automatically start processes for CDH services that were running at the time of the system crash when I start the cloudera-scm-agent service.

I’m trying to determine how the cloudera-scm-agent service is determining the state of each service/service role at the time of the crash. I would expect the information to be derived from some data in the database used by the cloudera-scm-server-db service or in a file or files in the local file system, but I've been unable to find the location where this information is stored.

New Contributor

Hi @PNW

Did you find a solution for the issue described here? I need the same option.

New Contributor

No, I was never able to determine how the cloudera-scm-agent service or supervisord determines which services were running at the time the system crashed. I did extensive testing and thought I had found some fields in the database used by Cloudera Management Services that might be used to determine which service roles were running at the time of the crash. However, when I changed these fields to indicate that the service roles were not running, those roles were still started when I started the cloudera-scm-agent service. We need someone who is familiar with the source code to explain how this mechanism works (that is, how the code determines which services were running).

Master Collaborator

@frank_albers @PNW Supervisor has a listener process that monitors the CDH processes. If a process belonging to a service with auto restart configured exits unexpectedly, supervisord restarts it.

The agent looks for unexpected exits in the notifications it receives via the supervisor listener and forwards the relevant event info to the associated role's monitor to update its unexpected exit state.
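
For anyone who wants to see what such a notification looks like, here is a minimal sketch of how an unexpected exit can be recognized in a supervisord event payload. It assumes the stock supervisord event format, in which a PROCESS_STATE_EXITED payload is a line of space-separated key:value tokens that includes an expected flag (0 meaning the exit was not expected). This is not the Cloudera agent's actual code: parse_tokens and is_unexpected_exit are hypothetical helper names, and the process name in the sample is made up.

```python
def parse_tokens(line):
    """Parse a supervisord 'key:value key:value ...' line into a dict."""
    # Split each token on the first colon only, so values may contain colons.
    return dict(token.split(":", 1) for token in line.split())


def is_unexpected_exit(eventname, payload):
    """Return True if the event reports an exit that supervisord did not expect."""
    if eventname != "PROCESS_STATE_EXITED":
        return False
    fields = parse_tokens(payload)
    # supervisord reports expected:0 when the exit code is not one of the
    # exit codes configured as acceptable for the program.
    return fields.get("expected") == "0"


# Hypothetical payload in the PROCESS_STATE_EXITED format.
sample = ("processname:example-role groupname:example-role "
          "from_state:RUNNING expected:0 pid:2766")
print(is_unexpected_exit("PROCESS_STATE_EXITED", sample))
```

A real listener would also have to follow the supervisord listener protocol on stdin/stdout (announce readiness, read the header and payload, then acknowledge the event); that part is left out here to keep the sketch short.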

A normal agent restart does not affect the supervisord process; it keeps running and managing the CDH processes. A hard stop or hard restart of the agent does affect supervisord and, in turn, kills all managed CDH processes.

 

Hope this helps,
Paras