02-26-2018 08:01 AM
Is there any way to restart a service which is managed by Cloudera Manager, when the node hosting CM is unavailable?
If CDH is installed without CM it should be as simple as "$service <service_name> restart" but services such as Oozie do not seem to be registered when CM is used on the cluster.
The reason for wanting this is that I would like to be able to change a parameter in the Oozie settings, and deploy the new configuration, even if Cloudera Manager is unavailable.
Thanks in advance
02-26-2018 08:19 AM
The Oozie service usually runs on the same host as the CM server, and its init script is usually in /etc/ini.d/oozie.
You can use the "service oozie restart" command, as you say, to restart it.
You can also try the option to "inspect all the hosts" in case there is any problem.
02-26-2018 09:17 AM
Thanks for the reply!
On my clusters, Oozie and CM are not installed on the same host. I have multiple clusters, some running on dedicated hardware and some on AWS. All are running CDH 5.4.x or 5.5.x and have the same problem of missing service entries.
When trying "$service oozie restart" I simply get "oozie: unrecognized service". Using "$service --status-all" gives a list of services in which none of my CDH services appear.
I am not sure what you are referring to with "inspect all the hosts". There are only 6 nodes in my cluster, so manually checking per-node is not a problem.
02-28-2018 01:02 AM - edited 02-28-2018 01:04 AM
The "best" way for you might be to configure the "auto-restart" option in Cloudera Manager for all services. Then you can just kill (nicely) the service you need and wait for the Cloudera agent to start it again.
02-28-2018 01:13 AM
Okay, I found the "inspect all nodes" function. I am not sure what to use it for. All nodes are in good condition.
I am looking for terminal control of Oozie because I would like to test the system while the CM host is down.
Also, there is no /etc/ini.d directory in my cluster. There are /etc/init and /etc/init.d but none of them contain Oozie-related files or directories.
02-28-2018 01:20 AM
The correct folder is "/etc/init.d", but when the cluster is managed by Cloudera Manager there should not be any service file under it. If one exists, do not try to use it with "service ... restart", as you will mess things up. Only the Cloudera agent (supervisord) should handle the services.
Otherwise, just drop Cloudera Manager and run the cluster unmanaged.
I don't know whether the Cloudera agent provides an API for sending commands (I haven't looked for one). Normally it shouldn't, since that is handled by Cloudera Manager, which is responsible for authenticating the user and checking whether they are authorized to do a restart.
Please try the "kill" with auto restart.
02-28-2018 02:08 AM
This sounds like a possible option, if the auto-restart only relies on the agent, and not the CM server.
And also assuming I can kill oozie while CM is down. As stated, "service oozie stop" won't work, since oozie isn't registered as a service on the host. I guess it should be possible to find the process somewhere, but killing it that way doesn't seem all that "nice" to me.
The only issue is that I would like to have oozie connect to a different database after restarting.
As far as I can see the current service settings (like oozie-site.xml) are kept in subdirectories of /var/run/cloudera-scm-agent/process but manually changing those files seems like a pretty sketchy thing to start doing.
Do you have any experience on this type of thing?
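For reference, looking at those settings without touching them could be sketched roughly like this. It is only a read-only sketch: the function name `read_property` is made up for illustration, it assumes oozie-site.xml uses the usual Hadoop-style `<property><name>/<value>` layout, and `oozie.service.JPAService.jdbc.url` is used as an example of the database setting one might want to check.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_property(site_xml_path: str, name: str) -> Optional[str]:
    """Return the value of a named property from a Hadoop-style site XML file.

    Read-only: the file is parsed but never modified.
    Returns None when the property is not present.
    """
    root = ET.parse(site_xml_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None
```

Calling `read_property(path_to_oozie_site_xml, "oozie.service.JPAService.jdbc.url")` would then show which database the running role was configured with, where the path is whichever subdirectory of /var/run/cloudera-scm-agent/process the role currently uses.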
02-28-2018 02:30 AM - edited 02-28-2018 06:41 AM
You are right. It is not a good idea to start messing with these files. Plus, these files are temporary, meaning that if you reboot the server they will be lost.
These directories are automatically created by cloudera agent with the name of 'Auto-increment'-'service'-'roletype'. Eg.
The cloudera agent will retrieve this information from Cloudera Manager server. There is no permanent local storage for configuration files.
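As a rough illustration of that naming scheme, the newest directory for a given service could be picked out like this. It is a sketch under assumptions: the helper name `latest_process_dir` is made up, and directory names such as "123-oozie-OOZIE_SERVER" are invented examples of the 'Auto-increment'-'service'-'roletype' pattern, not taken from a real host.

```python
import re
from typing import List, Optional

# Assumed naming pattern: "<auto-increment>-<service>-<roletype>".
_DIR_PATTERN = re.compile(r"^(\d+)-(.+)$")


def latest_process_dir(dir_names: List[str], service_prefix: str) -> Optional[str]:
    """Return the directory name with the highest numeric prefix for a service.

    The prefix is compared as a number, so "123-..." ranks above "98-...".
    Returns None when no directory matches.
    """
    best_id = -1
    best_name = None
    for name in dir_names:
        match = _DIR_PATTERN.match(name)
        if not match:
            continue  # skip entries that do not follow the pattern
        number, rest = int(match.group(1)), match.group(2)
        if rest.startswith(service_prefix) and number > best_id:
            best_id, best_name = number, name
    return best_name
```

On a real host this would be fed with something like `os.listdir("/var/run/cloudera-scm-agent/process")` and a prefix such as `"oozie-"`.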
Any such configuration change should be done using Cloudera Manager.
But if (for whatever reason) you need to do it, then yes, you can kill it. The Cloudera agent does not rely on communication with Cloudera Manager to do the automatic restart. And since this restart is not issued by Cloudera Manager, the service will restart from the same "/var/run/cloudera-scm-agent/process/service_dir" directory. If the restart command comes from Cloudera Manager, a new folder is created.
You should be careful, because these changes should also be pushed to CM. If not, they will be lost on the next service restart issued by CM.
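The "kill it nicely and let the agent restart it" step could look roughly like the sketch below. It rests on assumptions that should be checked on the host first: that the role process runs with its working directory set to its process directory, that the script runs with enough privileges to read /proc, and that auto-restart is already enabled for the role. The function names are made up for illustration.

```python
import os
import signal
from typing import List


def pids_with_cwd(target_dir: str, proc_root: str = "/proc") -> List[int]:
    """List PIDs whose current working directory is target_dir (Linux only)."""
    target = os.path.realpath(target_dir)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        try:
            cwd = os.readlink(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue  # process exited or permission denied
        if os.path.realpath(cwd) == target:
            pids.append(int(entry))
    return sorted(pids)


def terminate_nicely(pid: int) -> None:
    """Send SIGTERM so the process can shut down cleanly (no SIGKILL)."""
    os.kill(pid, signal.SIGTERM)
```

It is worth printing the result of `pids_with_cwd(...)` and checking it against `ps` before passing anything to `terminate_nicely`.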