
Starting cloudera-scm-agent after next_start_clean deletes .pid file


While investigating cluster instability, I noticed that the cloudera-scm-agent .pid file is removed after a clean restart (on the CM5.7 / CentOS7 cluster).


(I used the commands listed in the official documentation.)


$ sudo service cloudera-scm-agent next_start_clean
$ sudo service cloudera-scm-agent start



Because the pid file is missing, the init script's stop command does not actually stop the running agent process, and the next time I start the agent, the init script allows a second agent process to run. I suspect that these duplicate agent processes led to the cluster instability.
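
To make the symptom easier to confirm on an affected host, here is a minimal sketch of a check that reports whether a pid file is missing, stale, or points at a live process. The function name `pidfile_status` is hypothetical and is not part of the Cloudera init script. The pid file path is passed as an argument because the exact file name is not shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Cloudera init script):
# print "missing", "stale", or "running" for the given pid file.
pidfile_status() {
    pidfile="$1"
    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi
    pid=$(cat "$pidfile" 2>/dev/null)
    # kill -0 sends no signal; it only tests whether the process exists.
    # Run as root: for a process owned by another user, kill -0 fails
    # for a non-root caller and this function would report "stale".
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

# Example usage (the pid file path is a placeholder for your host's path):
#   pidfile_status /path/to/agent.pid
```

If this prints "missing" while an agent process is still visible in the output of `ps`, the host is in the state described above, and a later start could launch a second agent.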


It occurs on a CM/CDH5.7 cluster, but doesn't occur on a CM/CDH5.5.2 cluster (both clusters are based on CentOS7).

I have also found that the default location of the pid file changed between these two versions (from /var/run/ to /var/run/cloudera-scm-agent/, in /etc/init.d/cloudera-scm-agent). Judging from the contents of /usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.0-py2.7.egg/cmf/, the latter path appears to be a target of rmtree().


Is it a good idea to work around this problem by changing /etc/init.d/cloudera-scm-agent as follows?


--- cloudera-scm-agent.orig 2016-05-06 17:06:11.558271813 +0900
+++ cloudera-scm-agent 2016-05-06 17:06:26.625264949 +0900
@@ -97,7 +97,7 @@
#pid file
# Marker files for working around systemd