Created 11-29-2017 05:30 AM
I am having problems forcing the Cloudera/Hadoop processes to use the new
supervisord, and assessing the impact if they do not.
I am upgrading from CDH 5.6 to CDH 5.13, and have already upgraded all the Cloudera Manager daemons and agents to 5.13.
Some nodes are not using the new supervisord. This can be seen from the text below,
taken from the Host Inspector, which shows 3 groups of hosts running 3 different supervisord versions:
group1:
Supervisord 3.0-cm5.13.0
group2:
Supervisord 3.0-cm5.6.0
group3:
Supervisord 3.0
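To see at a glance which hosts are stale, one option is to collect the Host Inspector output into "host version" pairs and compare each against the version shipped with the target release. This is only a sketch; the input format, the `EXPECTED` variable, and the function name are my own assumptions, not anything Cloudera provides:

```shell
#!/bin/sh
# Hypothetical checker: given "host version" pairs on stdin (collected by hand
# from the Host Inspector output), print the hosts whose supervisord version
# does not match the one bundled with the target CM release.
check_versions() {
  expected="${EXPECTED:-3.0-cm5.13.0}"   # version bundled with CM 5.13 (assumed)
  while read -r host ver; do
    # print only the hosts that still report a stale supervisord
    [ "$ver" = "$expected" ] || echo "$host $ver"
  done
}
```

With the three groups above, only the cm5.13.0 hosts would be silent; the cm5.6.0 and plain 3.0 hosts would be listed as needing attention.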
I can move hosts to the new supervisord by killing the old process: kill $(cat /run/cloudera-scm-agent/supervisord/supervisord.pid)
This works; something then starts a new supervisord of the correct version.
But it seems like an ugly way to do it, and it causes some warnings in Cloudera Manager.
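The kill-and-wait step above can be wrapped so it fails loudly instead of silently when the pid file is missing. A minimal sketch, assuming the pid file path from above and a 30-second timeout of my own choosing:

```shell
#!/bin/sh
# Hypothetical wrapper around the manual workaround: kill the supervisord
# recorded in the agent's pid file, then wait for it to exit so the agent can
# respawn the new version. PIDFILE override and timeout are assumptions.
stop_supervisord() {
  pidfile="${PIDFILE:-/run/cloudera-scm-agent/supervisord/supervisord.pid}"
  [ -r "$pidfile" ] || { echo "no readable pid file at $pidfile" >&2; return 1; }
  pid=$(cat "$pidfile")
  kill "$pid" 2>/dev/null || return 1
  # Give the old supervisord up to 30 s to exit before declaring failure.
  i=0
  while [ "$i" -lt 30 ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # process is gone
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```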
is there another way to do this?
Also, what are the potential risks of not running the same supervisord version everywhere?
Will the nodes be unable to communicate?
Can I still upgrade parcels?
Created on 06-19-2018 01:00 PM - edited 06-19-2018 01:01 PM
I hit this as well, and found that
service cloudera-scm-agent hard_restart
fixes it (it does the same thing: restarts supervisord). I have had mismatched supervisord versions for a period of months with no apparent effect. This page:
https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ag_agents.html
seems to address your question of how to restart/fix this, but it does not discuss the effects of running mismatched supervisord versions.
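For a cluster with many stale hosts, the hard_restart can be driven from one machine. This is only a sketch: the `HOSTS` list and the `RUNNER` indirection are my own conventions (RUNNER would be ssh in practice, and can be swapped for echo to dry-run). Note the caveat per the linked Agents page: unlike a plain restart, a hard_restart also takes down supervisord, and with it the role processes it manages, so plan for role restarts.

```shell
#!/bin/sh
# Hypothetical sketch: issue hard_restart on every host in HOSTS.
# RUNNER defaults to ssh; override it (e.g. RUNNER=echo) for a dry run.
hard_restart_all() {
  runner="${RUNNER:-ssh}"
  for h in $HOSTS; do
    # hard_restart restarts the agent and supervisord together; the managed
    # role processes on that host will be restarted as well.
    $runner "$h" service cloudera-scm-agent hard_restart
  done
}
```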
Created 06-28-2018 04:06 AM
Thanks, this is what worked in the end, combined with kill -9 for the really resilient processes.
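For those resilient processes, the usual pattern is to escalate: try SIGTERM first so the process can clean up, and only fall back to kill -9 after a grace period, since SIGKILL cannot be caught and skips all cleanup handlers. A sketch, with the helper name and GRACE period being my own assumptions:

```shell
#!/bin/sh
# Hypothetical escalation helper: polite SIGTERM first, SIGKILL only if the
# process is still alive after GRACE seconds.
kill_stubborn() {
  pid="$1"
  grace="${GRACE:-10}"
  kill "$pid" 2>/dev/null
  i=0
  while [ "$i" -lt "$grace" ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # exited cleanly on SIGTERM
    sleep 1
    i=$((i + 1))
  done
  kill -9 "$pid" 2>/dev/null   # SIGKILL cannot be caught or ignored
}
```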