Created 10-25-2016 01:09 PM
I'm unable to start the NameNode; it fails with the following error.
It looks like a resource_management-related error, but I have no clue what's wrong.
CPU : i7, RAM : 16GB, HDD : 2TB
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 420, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 101, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 215, in namenode
    create_hdfs_directories()
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 282, in create_hdfs_directories
    mode=0777,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 459, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 456, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 247, in action_delayed
    self._assert_valid()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 231, in _assert_valid
    self.target_status = self._get_file_status(target)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 292, in _get_file_status
    list_status = self.util.run_command(target, 'GETFILESTATUS', method='GET', ignore_status_codes=['404'], assertable_result=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 192, in run_command
    raise Fail(err_msg)
resource_management.core.exceptions.Fail
Created 10-25-2016 03:38 PM
Can you please log in to the NameNode host and check the NameNode logs for the actual error message?
Created 10-25-2016 11:12 PM
May I ask where the log for NameNode resides?
Created 10-26-2016 08:09 AM
Hi @Yu Song,
Log in to the node running the NameNode service. By default the log files are under /var/log/hadoop/hdfs, but you can check the configured log directory in Ambari via HDFS => Configs, then search for the "Hadoop log dir prefix" property under the "Advanced hadoop-env" tab.
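If it helps, here is a minimal sketch for pulling the ERROR and FATAL lines out of the NameNode logs on that host. It assumes the default directory mentioned above and assumes the NameNode log file names contain "namenode" and end in ".log"; both are assumptions, so check them against the files actually present. The function names are only illustrative.

```python
import glob
import os
import re

# Assumed default location; change it if "Hadoop log dir prefix" differs.
LOG_DIR = "/var/log/hadoop/hdfs"
# Assumed file name pattern for NameNode logs; verify on your host.
LOG_GLOB = "*namenode*.log"

SEVERITY = re.compile(r"\b(ERROR|FATAL)\b")


def find_problem_lines(lines):
    """Return the lines that contain the word ERROR or FATAL."""
    return [line.rstrip("\n") for line in lines if SEVERITY.search(line)]


def scan_log_dir(log_dir=LOG_DIR, pattern=LOG_GLOB, limit=20):
    """Map each matching log file to its last `limit` ERROR/FATAL lines."""
    results = {}
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        try:
            with open(path, errors="replace") as handle:
                hits = find_problem_lines(handle)
        except OSError as exc:
            # Unreadable file (for example, missing permissions): record why.
            hits = ["could not read file: %s" % exc]
        results[path] = hits[-limit:]
    return results


if __name__ == "__main__":
    for path, hits in scan_log_dir().items():
        print(path)
        for hit in hits:
            print("  " + hit)
```

Run it on the NameNode host as a user that can read the log directory; the lines it prints are the ones worth posting here.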
Created 10-28-2016 08:24 AM
@Gerd Koenig, @avoma
I've checked some of the logs under /var/log/hadoop/hdfs/.
For example, ambari-agent/ambari-agent.log
contains the following error message, which I guess is one of the reasons some services don't start up.
I have no idea how to work around this issue, so any help is greatly appreciated.
ERROR 2016-10-28 17:15:58,673 script_alert.py:119 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://myhost.fqdn.local:8042/ws/v1/node/info (Traceback (most recent call last):\n File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n url_response = urllib2.urlopen(query, timeout=connection_timeout)\n File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen\n return _opener.open(url, data, timeout)\n File "/usr/lib/python2.7/urllib2.py", line 404, in open\n response = self._open(req, data)\n File "/usr/lib/python2.7/urllib2.py", line 422, in _open\n \'_open\', req)\n File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain\n result = func(*args)\n File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open\n return self.do_open(httplib.HTTPConnection, req)\n File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open\n raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n)']
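"Connection refused" in the alert above means nothing accepted the TCP connection on that host and port at the time of the check. A minimal sketch for testing that directly, using only the Python standard library; the helper name port_is_open is hypothetical, and the host and port should be taken from the alert URL:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For the alert above, the call would be port_is_open("myhost.fqdn.local", 8042). A False result only says the port is not reachable from where the check runs; it does not say why the service behind it is down.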
Created 02-16-2017 01:19 PM
I am getting the same issue.
Is there any solution for this error?
Created 03-16-2017 06:17 AM
@ramakanth ileni As mentioned in the previous replies, please check the NameNode logs, which can be found on the node where the NN service is deployed. Are you seeing any exceptions?