Created 08-15-2022 03:11 PM
Does anyone know how the cloudera-scm-agent service determines which services were running when it restarts after a system crash and reboot?
I would like to be able to prevent CDH services from automatically starting when the cloudera-scm-agent service starts after a system crash.
I've been looking at the cloudera-scm-agent log file, but I didn't see any INFO-level messages that show where the agent gets its information about which services were running before the crash.
I've also looked at the tables in the database used by the cloudera-scm-server-db service, but have not found anything obvious that controls which services are automatically started when the cloudera-scm-agent service starts. I thought this might be controlled by the 'configured_state' field in the services table, but that does not seem to have any effect on which services are started.
Any help is greatly appreciated.
Created 08-16-2022 08:25 AM
Hi @PNW
If you want to disable the automatic restart of Cloudera services, visit the configuration page of the service you'd like to manage, then search for "Automatically Restart Process". You should see this option for each role within the service; uncheck it for every role you do not want restarted automatically.
Regards,
Chethan YM
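For anyone who would rather script this than click through the UI, here is a minimal sketch of how the same setting might be changed through the Cloudera Manager REST API. It rests on several assumptions that are not confirmed anywhere in this thread: that "Automatically Restart Process" is exposed under the API name `process_auto_restart`, that the role config group endpoint has the shape shown, and that the host, API version, cluster, service, and role config group names (all hypothetical here) match your deployment. The function only builds the URL and JSON body and does not contact any server.

```python
import json
from urllib.parse import quote

# Assumed API name for the "Automatically Restart Process" option; verify it
# against your own Cloudera Manager version before relying on it.
AUTO_RESTART_KEY = "process_auto_restart"


def build_auto_restart_update(base_url, api_version, cluster, service,
                              role_config_group, enabled):
    """Return (url, json_body) for a config update on one role config group.

    The endpoint layout is an assumption based on the Cloudera Manager REST
    API; all argument values are supplied by the caller.
    """
    path = "/api/{}/clusters/{}/services/{}/roleConfigGroups/{}/config".format(
        api_version,
        quote(cluster, safe=""),            # cluster names may contain spaces
        quote(service, safe=""),
        quote(role_config_group, safe=""),
    )
    body = json.dumps({
        "items": [
            {"name": AUTO_RESTART_KEY, "value": "true" if enabled else "false"}
        ]
    })
    return base_url.rstrip("/") + path, body
```

The resulting URL and body would be sent as an HTTP PUT with `Content-Type: application/json` and Cloudera Manager admin credentials, once per role config group. Note that this only covers the auto-restart option described above; whether it also affects what is started after a crash and reboot is not established in this thread.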
Created 08-16-2022 11:28 AM
Created 12-12-2022 02:47 AM
Created 12-12-2022 12:41 PM
No, I was never able to determine how the cloudera-scm-agent service or supervisord determines which services were running at the time the system crashed. I did extensive testing and thought I had found some fields in the database used by Cloudera Management Services that might record which service roles were running at the time of the crash. However, when I changed these fields to indicate that the service roles were not running, those roles were still started when I started the cloudera-scm-agent service. We need someone who is familiar with the source code to explain how this mechanism works, that is, how the code determines which services were running.
Created 12-29-2022 03:50 AM
@frank_albers @PNW supervisord has a listener process that keeps monitoring the CDH processes. If a service with auto restart configured exits unexpectedly, supervisord restarts the process.
The agent looks for unexpected exits in the notifications it receives via the supervisor listener and forwards the relevant event information to the associated role's monitor, which updates its unexpected-exit state.
A normal agent restart does not affect the supervisord process; it keeps running and managing the CDH processes. A hard stop/restart of the agent does affect supervisord and in turn kills all managed CDH processes.
Hope this helps,
Paras
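To make the listener mechanism described above more concrete, here is a minimal sketch of a generic supervisor event listener in Python. It is not the Cloudera agent's actual listener, whose implementation is not shown in this thread; it only follows the standard supervisor event listener protocol (write `READY`, read a header line and a payload of `len` bytes, reply with `RESULT`). The function names and the `on_unexpected_exit` callback are hypothetical.

```python
def parse_tokens(line):
    """Parse a 'key:value key:value' line as used in supervisor event
    headers and process-state payloads."""
    return dict(token.split(":", 1) for token in line.split())


def is_unexpected_exit(header, payload):
    """True when the event reports a process exit that supervisor did not
    expect (payload field expected:0 on a PROCESS_STATE_EXITED event)."""
    return (header.get("eventname") == "PROCESS_STATE_EXITED"
            and payload.get("expected") == "0")


def handle_events(stdin, stdout, on_unexpected_exit):
    """Run the listener loop until stdin is closed."""
    while True:
        # Tell supervisor this listener can accept the next event.
        stdout.write("READY\n")
        stdout.flush()

        header_line = stdin.readline()
        if not header_line:
            break  # supervisor closed the pipe

        header = parse_tokens(header_line)
        # The payload length is given by the 'len' token in the header.
        payload = parse_tokens(stdin.read(int(header["len"])))

        if is_unexpected_exit(header, payload):
            on_unexpected_exit(payload)

        # Acknowledge the event: "RESULT <length>\n" followed by the body.
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

Such a script would be registered in an `[eventlistener:x]` section of the supervisord configuration and started with `handle_events(sys.stdin, sys.stdout, callback)`. The callback must log to stderr rather than stdout, because stdout carries the protocol messages.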