Created 08-09-2018 10:07 PM
@Jay Kumar SenSharma maybe you can help me with this one instead?
I have a 4-node cluster. All four are DataNodes, and one node is also the ResourceManager. My Ambari installation only installed a NodeManager on my master ResourceManager node. Assuming this is correct (please let me know if it is not), I have been getting errors about my NodeManager. The alert says its health is bad because it cannot connect:
Connection failed to http://ncienspk01.nciwin.local:8042/ws/v1/node/info (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/alerts/alert_nodemanager_health.py", line 171, in execute
    url_response = urllib2.urlopen(query, timeout=connection_timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
)
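A quick way to tell whether this is just an alert-side connectivity problem or the NodeManager really is not listening is to probe the same endpoint the alert script hits. A minimal sketch, run on the NodeManager host, using the hostname and port from the alert above:

# Hit the same REST endpoint the Ambari alert checks (8042 is the default NodeManager web port)
curl -sf http://ncienspk01.nciwin.local:8042/ws/v1/node/info && echo "NodeManager web UI is reachable"

# If the connection is refused, see whether anything is listening on 8042 ...
ss -ltnp | grep 8042

# ... and whether a NodeManager JVM is running at all
ps -ef | grep -i [n]odemanager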
Many of my services had corrupt installs and I did a re-install. That may be the case here as well. Thoughts on how to re-install?
Also, should I have a NodeManager on every node? If so, how do I install them and connect them?
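For reference, NodeManagers normally run on every host that acts as a YARN worker, and Ambari can add the component to an existing host from that host's page in the UI or via its REST API. A rough sketch only, assuming the Ambari server listens on port 8080 with admin/admin credentials; the cluster and host names below are placeholders:

AMBARI=http://ambari-server.example.com:8080
CLUSTER=mycluster
HOST=worker2.example.com

# Register the NODEMANAGER component on the target host
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  "$AMBARI/api/v1/clusters/$CLUSTER/hosts/$HOST/host_components/NODEMANAGER"

# Then drive it to the INSTALLED (and later STARTED) state so Ambari installs and starts it
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
  -d '{"RequestInfo":{"context":"Install NodeManager"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' \
  "$AMBARI/api/v1/clusters/$CLUSTER/hosts/$HOST/host_components/NODEMANAGER"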
Thanks for your help! Dan
Created 08-10-2018 07:29 PM
@Jay Kumar SenSharma
When I try to start services now I'm getting:
For HDFS Client Install
RuntimeError: Failed to execute command '/usr/bin/yum -y install hadoop_3_0_0_0_1634', exited with code '1', message: 'Error unpacking rpm package hadoop_3_0_0_0_1634-3.1.0.3.0.0.0-1634.x86_64'
For Hive Client Install
RuntimeError: Failed to execute command '/usr/bin/yum -y install hive_3_0_0_0_1634-hcatalog', exited with code '1', message: 'Error unpacking rpm package hadoop_3_0_0_0_1634-3.1.0.3.0.0.0-1634.x86_64'
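An "Error unpacking rpm package" from yum usually points at a half-installed package, a corrupted cached download, or a full/read-only filesystem rather than a repository problem. A few checks that are generally safe to run before re-installing (package names taken from the errors above):

# Rule out a full or read-only filesystem first
df -h /usr /var

# See whether the package is stuck half-installed
rpm -qa | grep hadoop_3_0_0_0_1634

# Clear possibly corrupt cached packages, then retry the install
yum clean all
yum -y reinstall hadoop_3_0_0_0_1634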
Created 08-10-2018 09:20 PM
@Jay Kumar SenSharma I'm definitely in a jam now. Really hoping you can help me. A bit scared to touch anything at this point.
Created 08-11-2018 12:05 AM
So I resolved all this. I just followed the steps here to remove all my packages, then deleted the contents of /usr/hdp:

rm -rf /usr/hdp/

Then in Ambari I used the "Start all Services" command and it went through and installed everything again for me. I also copied /usr/hdp/3.0.0.0-.../spark2/aux/ to all the other nodes in my cluster. Now all my NodeManagers are coming up and things are looking good.
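For anyone following the same route, the copy to the other nodes can be scripted. A rough sketch, assuming passwordless SSH between the nodes and placeholder hostnames; the glob resolves whatever the actual 3.0.0.0 build directory is on the source node:

# Resolve the actual build directory (e.g. /usr/hdp/3.0.0.0-<build>/spark2/aux)
SRC_DIR=$(ls -d /usr/hdp/3.0.0.0-*/spark2/aux)

# Push the aux jars to every other node in the cluster
for host in node2.example.com node3.example.com node4.example.com; do
  ssh "$host" "mkdir -p $SRC_DIR"
  scp "$SRC_DIR"/* "$host:$SRC_DIR/"
done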
Created 08-11-2018 05:10 AM
I'm glad that's all sorted now. Another way is to delete the particular node from the cluster, re-add it, and then add the Spark client on it afterwards. I recently did that on one of my test clusters and it worked.
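If someone prefers to script that alternative, the removal side can also be driven through Ambari's REST API once everything on the host is stopped; re-adding is then done through the "Add Host" wizard. A minimal sketch with placeholder names, assuming admin/admin credentials:

AMBARI=http://ambari-server.example.com:8080
CLUSTER=mycluster
HOST=worker2.example.com

# Stop every component on the host (Ambari refuses to remove a host with running components)
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
  -d '{"RequestInfo":{"context":"Stop all components"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' \
  "$AMBARI/api/v1/clusters/$CLUSTER/hosts/$HOST/host_components"

# Delete the now-stopped components from the host, then the host itself;
# re-add the node afterwards via the Ambari "Add Host" wizard
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE \
  "$AMBARI/api/v1/clusters/$CLUSTER/hosts/$HOST/host_components"
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE \
  "$AMBARI/api/v1/clusters/$CLUSTER/hosts/$HOST"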