Prevent automatic restart of CDH services after a system crash
Labels: Cloudera Essentials, Cloudera Manager
Created ‎08-15-2022 03:11 PM
Does anyone know how the cloudera-scm-agent service determines which services were running when it restarts after a system crash and reboot?
I would like to be able to prevent CDH services from automatically starting when the cloudera-scm-agent service starts after a system crash.
I've been looking at the log file for the cloudera-scm-agent, but I didn't see any INFO-level messages showing where the agent gets the information about which services were running before the crash.
I've also looked at tables in the database used by the cloudera-scm-server-db service, but have not found anything obvious that appears to control which services are automatically started when the cloudera-scm-agent service starts. I thought this might be controlled by the 'configured_state' field in the services table, but that does not seem to have any effect on which services are started.
Any help is greatly appreciated.
Created ‎08-16-2022 08:25 AM
Hi @PNW,
If you want to disable the automatic restart of Cloudera services, go to the configuration page of the service you'd like to manage and search for "Automatically Restart Process". You should see this option for each role within the service; uncheck it for each role.
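For anyone who prefers to script this instead of clicking through the UI, here is a minimal sketch of the request you would send to the Cloudera Manager REST API to change the same setting on a role config group. Several things here are assumptions to verify against your own CM version: the configuration key `process_auto_restart` (my understanding of the API name behind "Automatically Restart Process"), the placeholder API version `v19`, and the example cluster, service, and role group names. The sketch only builds the path and JSON body; it does not contact a server.

```python
import json
from urllib.parse import quote

# Assumption: API name of the "Automatically Restart Process" setting.
AUTO_RESTART_KEY = "process_auto_restart"


def build_auto_restart_request(cluster, service, role_group, enabled,
                               api_version="v19"):
    """Return (path, json_body) for a PUT that sets the auto-restart flag.

    api_version is a placeholder; use the version your CM server reports.
    """
    path = "/api/{}/clusters/{}/services/{}/roleConfigGroups/{}/config".format(
        api_version,
        quote(cluster, safe=""),     # cluster names may contain spaces
        quote(service, safe=""),
        quote(role_group, safe=""),
    )
    body = json.dumps({
        "items": [
            {"name": AUTO_RESTART_KEY, "value": "true" if enabled else "false"}
        ]
    })
    return path, body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    path, body = build_auto_restart_request(
        "Cluster 1", "hdfs", "hdfs-DATANODE-BASE", enabled=False)
    print("PUT", path)
    print(body)
```

You would send the body with `Content-Type: application/json` and your CM admin credentials. Note that this only changes the auto-restart setting described above.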
Regards,
Chethan YM
Created ‎08-16-2022 11:28 AM
I’ve disabled automatic startup of the cloudera-scm-server and cloudera-scm-agent services after a system boot. However, when I start the cloudera-scm-agent service manually, cloudera-scm-agent/supervisord still automatically starts the processes for CDH services that were running at the time of the system crash.
I’m trying to determine how the cloudera-scm-agent service is determining the state of each service/service role at the time of the crash. I would expect the information to be derived from some data in the database used by the cloudera-scm-server-db service or in a file or files in the local file system, but I've been unable to find the location where this information is stored.
Created ‎12-12-2022 02:47 AM
Created ‎12-12-2022 12:41 PM
No, I was never able to determine how the cloudera-scm-agent service or supervisord determines which services were running at the time the system crashed. I did extensive testing and thought I had found some fields in the database used by Cloudera Management Services that might record which service roles were running at the time of the crash. However, when I changed those fields to indicate that the service roles were not running, the roles were still started when I started the cloudera-scm-agent service. We need someone who is familiar with the source code to explain how this mechanism works, that is, how the code determines which services were running.
Created ‎12-29-2022 03:50 AM
@frank_albers @PNW supervisord has a listener process that continuously monitors the CDH processes. If a process of any service with auto-restart configured exits unexpectedly, supervisord restarts it.
The agent looks for unexpected exits in the notifications it receives via the supervisor listener and forwards the relevant event information to the associated role's monitor, which updates its unexpected-exit state.
A normal agent restart does not affect the supervisord process; it keeps running and managing the CDH processes. A hard stop or hard restart of the agent does affect supervisord and, in turn, kills all managed CDH processes.
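To make the listener behaviour above more concrete, here is a rough sketch of how an unexpected exit can be recognised from a supervisor event. It relies on the documented format of supervisor's `PROCESS_STATE_EXITED` event payload (space-separated `key:value` tokens, with `expected:0` meaning the exit code was not one of the expected ones). This is not the Cloudera agent's actual code: the function names, the `auto_restart_enabled` flag, and the sample process name are made up for illustration.

```python
def parse_payload(payload):
    """Split a supervisor event payload such as
    'processname:x groupname:x from_state:RUNNING expected:0 pid:1'
    into a dict of string fields."""
    return dict(token.split(":", 1) for token in payload.split())


def is_unexpected_exit(event_name, payload):
    """True when the event reports a process exit that supervisor
    flagged as unexpected (expected:0)."""
    if event_name != "PROCESS_STATE_EXITED":
        return False
    return parse_payload(payload).get("expected") == "0"


def should_restart(event_name, payload, auto_restart_enabled):
    """Hypothetical decision rule: restart only roles that have
    auto-restart enabled and that exited unexpectedly."""
    return bool(auto_restart_enabled and is_unexpected_exit(event_name, payload))


if __name__ == "__main__":
    # Hypothetical process name, for illustration only.
    sample = ("processname:123-hdfs-DATANODE groupname:123-hdfs-DATANODE "
              "from_state:RUNNING expected:0 pid:2766")
    print(should_restart("PROCESS_STATE_EXITED", sample, True))
```

The sketch only covers the "unexpected exit while supervisord is running" case from this reply; it says nothing about how roles are chosen for startup after a full system crash.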
Hope this helps,
Paras
