11-04-2016 05:53 AM
I upgraded Cloudera Manager from 5.7.0 to 5.9. The upgrade went smoothly, without any errors, but I then couldn't upgrade CDH to 5.9: the parcels were distributed, but the services can't be started.
I realized that all cloudera-scm-agent services are stopped. I went through the logs and found that the agent can't communicate with supervisord. The error is Protocol Error: 401 Unauthorized.
I found a supervisord.conf file in /run/cloudera-scm-agent/supervisord, which seems to be generated automatically by the agent process. According to the file, the server is at http://127.0.0.1:19001 and requires a username and password (both appear to be randomly generated). The odd thing is that I can't log in to the supervisorctl interface with that username and password, either in the browser or from the command line. It seems that although there is a config file for supervisord, the process either doesn't use it or something else prevents it from accepting those credentials.
I even tried reinstalling the agent and daemon, but the problem is still there.
Does anybody have an idea?
11-21-2016 02:36 PM
This is really unexpected. It seems that your supervisord processes are in a bad state for some reason.
Have you tried restarting them, or rebooting the host/OS?
11-21-2016 03:15 PM
The agent starts the supervisor (if one is not already running) using the /var/run/cloudera-scm-agent/supervisor/supervisord.conf configuration.
The username and password from that same supervisord.conf are used to authenticate. If authentication to the supervisor is failing, that indicates a supervisor is running that was launched with a different supervisord.conf.
ps aux | grep supervisor
See if there is an existing supervisor process... you will likely need to kill it and then start the agent to make sure the agent is using the right supervisord.conf.
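As a rough sketch of that check, the filter below pulls supervisord entries out of a process listing while skipping the grep/awk line itself. The function name find_supervisord is made up for illustration, and the kill and agent start steps are left as comments because the right PID and init command depend on your host; treat them as assumptions, not a verified procedure.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cloudera Manager): reads "PID COMMAND"
# lines on stdin and prints the ones whose command mentions supervisord,
# ignoring the grep/awk processes that would otherwise match themselves.
find_supervisord() {
    awk '$0 ~ /supervisord/ && $0 !~ /awk|grep/ { print $0 }'
}

# List candidate supervisord processes on this host (PID plus full command).
ps -eo pid,args | find_supervisord

# After reviewing the output, the remaining steps would look roughly like
# this (run as root; <pid> is whatever the listing above showed):
#   kill <pid>
#   service cloudera-scm-agent start
```

If the listing shows a supervisord that predates the agent restart, that is the likely source of the 401: it is still holding credentials from an older supervisord.conf.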