
Can't restart the ResourceManager on node





I am new to YARN and I am not able to restart the NodeManagers on some of the nodes. Could you please help me out?


Re: Can't restart the ResourceManager on node



Can you share your HDP setup: HDP version, number of nodes, HA or not, Kerberized or not? Please also share the exact error being thrown, preferably with the logs.

Typically a NodeManager should run on the same node as the DataNode.


Re: Can't restart the ResourceManager on node


The HDP Version is with 16 nodes, 4 of which are worker nodes, and the environment is Kerberized. The node wn03 had to be rebooted by the infrastructure team for maintenance. I stopped all the services on that worker node, and restarted all of them after the worker machine wn03 had been rebooted.

This morning I got the following warning in Ambari:

Connection failed to (Execution of 'curl --location-trusted -k --negotiate -u : -b /var/lib/ambari-agent/tmp/cookies/8d14b5f3-5456-4599-9510-0036effff91d -c /var/lib/ambari-agent/tmp/cookies/8d14b5f3-5456-4599-9510-0036effff91d -w '%{http_code}' --connect-timeout 5 --max-time 7 -o /dev/null 1>/tmp/tmpGaZLZ4 2>/tmp/tmpag2rxg' returned 7.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed connect to; Connection refused


Re: Can't restart the ResourceManager on node



As the error says, the connection attempt failed: curl returned exit code 7 ("Connection refused").



So please first check on host wn03 whether port 8044 is listening. Also check whether the firewall is disabled, to make sure that this port can be reached remotely.


# netstat -tnlpa | grep 8044
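The netstat check above only shows whether something is listening locally on wn03. To see whether the port is also reachable from another machine (for example the Ambari server), a rough sketch along these lines may help. The function name `port_open` is made up for this example, and it assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available:

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if a TCP connection to host:port
# succeeds within 3 seconds, non-zero otherwise (refused or timed out).
port_open() {
  local host="$1" port="$2"
  # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example, run from a host other than wn03:
#   port_open wn03 8044 && echo "8044 reachable" || echo "8044 not reachable"
#
# On wn03 itself the firewall state can be inspected with, for example:
#   systemctl status firewalld   # assumes the host uses firewalld
#   iptables -L -n               # assumes iptables rules are in use
```

If the port is listening locally but `port_open` fails from a remote host, a firewall or network rule between the two machines is the more likely cause than the NodeManager itself.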


If the port is not listening, then the connection obviously cannot be established via curl (as Ambari is attempting).
In that case, please check the NodeManager log on host "wn03" for errors, and also check and share the ResourceManager log.
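When going through the NodeManager log, it is usually quicker to pull out only the ERROR and FATAL entries first. The following is a small sketch of that; the function name `recent_errors` is invented for this example, and the log path shown in the usage comment is only an assumption about a typical HDP layout, so adjust it to wherever your NodeManager actually writes its log:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N (default 20) lines of a log file
# that contain the word ERROR or FATAL.
recent_errors() {
  local logfile="$1" count="${2:-20}"
  grep -wE 'ERROR|FATAL' "$logfile" | tail -n "$count"
}

# Example on wn03 (the path is an assumption, check your own log directory):
#   recent_errors /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-wn03.log
```

The lines it prints, together with the matching part of the ResourceManager log, are what would be most useful to share in this thread.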