Problem with cloudera-scm-agent and supervisord

New Contributor

I upgraded Cloudera Manager from 5.7.0 to 5.9. The upgrade completed smoothly, without any errors, but I then couldn't upgrade CDH to 5.9: the parcels were distributed, but the services can't be started.

I realized that all cloudera-scm-agent services are stopped. I went through the logs and found that the agent can't communicate with supervisord. The error is Protocol Error: 401 Unauthorized.

I found a supervisord.conf file in /run/cloudera-scm-agent/supervisord. It seems to be generated automatically by the agent process. The file says that the server is at http://127.0.0.1:19001 and that it requires a username and password (which appear to be randomly generated). The odd thing is that I can't log in to the supervisorctl interface with that username and password, either in the browser or from the command line. It seems that although there is a config file for supervisord, the process either doesn't use it or something else prevents supervisord from accepting those credentials.

 

I even tried reinstalling the agent and daemon, but the problem is still there.

Does anybody have an idea?

There are only 10 types of people in the world,
those who understand binary and those who don't...

Contributor

Hi WhiteWizard,

This is really unexpected. It seems that your supervisord processes are in a bad state for some reason.

Have you tried restarting them, or rebooting the host/OS?

cheers,

zegab

Master Guru

The agent starts the supervisor (if one is not already running) using the /var/run/cloudera-scm-agent/supervisor/supervisord.conf configuration.

The username and password in that same supervisord.conf are used to authenticate. If authentication to the supervisor is failing, that indicates a supervisor is running that was launched with a different supervisord.conf.

Run:

ps aux | grep supervisor

See if there is an existing supervisor process... you will likely need to kill it and then start the agent to make sure the agent is using the right supervisord.conf.
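
The check above can be sketched as a small shell helper. This is only an illustration: the function name supervisor_pids is hypothetical (not part of Cloudera Manager), and the kill and service commands in the trailing comments assume a SysV-style init script named cloudera-scm-agent, which may differ on your system.

```shell
#!/bin/sh
# Sketch only: supervisor_pids is a hypothetical helper, not a Cloudera tool.

# Read `ps aux`-style lines on stdin and print the PID (column 2) of every
# line that mentions "supervisor", skipping the grep/awk processes themselves.
supervisor_pids() {
    awk '/supervisor/ && !/grep/ && !/awk/ { print $2 }'
}

# Typical use (run as root, after reviewing the PIDs by hand):
#   ps aux | supervisor_pids
#   kill <pid>                          # stop the stale supervisor
#   service cloudera-scm-agent start    # assumption: SysV-style service name
```

Be aware that stopping supervisord will normally also stop the processes it supervises, so it is worth confirming what is running on the host before killing anything.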

 

Regards,

Ben

Explorer
The problem was resolved with your update above.