Created 08-05-2017 06:00 AM
Hi,
I've set up a 5-node cluster with Ambari on separate physical machines (not VirtualBox VMs). To begin with, I've installed only HDFS and YARN (and their dependencies).
The MapReduce2 service is not starting. When I try to restart the MapReduce2 History Server, I see this error:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py", line 134, in <module>
    HistoryServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/historyserver.py", line 101, in start
    skip=params.sysprep_skip_copy_tarballs_hdfs)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/copy_tarball.py", line 267, in copy_to_hdfs
    replace_existing_files=replace_existing_files,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 555, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 552, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 287, in action_delayed
    self._create_resource()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 303, in _create_resource
    self._create_file(self.main_resource.resource.target, source=self.main_resource.resource.source, mode=self.mode)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 418, in _create_file
    self.util.run_command(target, 'CREATE', method='PUT', overwrite=True, assertable_result=False, file_to_put=source, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 199, in run_command
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --data-binary @/usr/hdp/2.6.1.0-129/hadoop/mapreduce.tar.gz -H 'Content-Type: application/octet-stream' 'http://pap-hadoop-1.mynet.org:50070/webhdfs/v1/hdp/apps/2.6.1.0-129/mapreduce/mapreduce.tar.gz?op=CREATE&user.name=hdfs&overwrite=True&permission=444'' returned status_code=403.
{
  "RemoteException": {
    "exception": "IOException",
    "javaClassName": "java.io.IOException",
    "message": "Failed to find datanode, suggest to check cluster health. excludeDatanodes=null"
  }
}
I've verified that the DataNodes are running; Ambari reports all services as healthy.
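A quick way to cross-check what the NameNode itself reports (a sketch only, assuming the default HDP 2.6 ports and running as the hdfs user; pap-hadoop-1.mynet.org is my NameNode host):

# Ask the NameNode how many DataNodes it currently sees as live/dead
sudo -u hdfs hdfs dfsadmin -report | grep -E 'Live datanodes|Dead datanodes'

# Same counts via the NameNode's JMX endpoint (served on the same HTTP port as WebHDFS)
curl -sS 'http://pap-hadoop-1.mynet.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState' | grep -E 'NumLiveDataNodes|NumDeadDataNodes'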
What could be the problem? How do I remedy this situation?
Update: I ran the HDFS service check and it fails with the same error as above.
Thanks,
Radha.
Created 08-05-2017 06:08 AM
Refer to this related HCC link for the same issue.
Created 08-05-2017 06:17 AM
Thanks, Sindhu!
I checked the link; it suggests verifying that FQDNs are specified properly in the config files.
I used FQDNs while setting up the cluster with Ambari and was hoping the config file updates would be handled automatically.
Which config file(s) should I be looking at?
Created 08-10-2017 12:42 PM
Ideally these should be getting picked up from DNS or the /etc/hosts files. Since you have only 5 nodes, can you add the entries to the /etc/hosts file on each node and try again?
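For example, something along these lines on every node (the IP addresses and the extra hostnames below are placeholders, only pap-hadoop-1.mynet.org comes from your error message; substitute your real addresses and FQDNs):

# /etc/hosts - map each node's IP to its FQDN and short name
10.0.0.1   pap-hadoop-1.mynet.org   pap-hadoop-1
10.0.0.2   pap-hadoop-2.mynet.org   pap-hadoop-2
10.0.0.3   pap-hadoop-3.mynet.org   pap-hadoop-3
10.0.0.4   pap-hadoop-4.mynet.org   pap-hadoop-4
10.0.0.5   pap-hadoop-5.mynet.org   pap-hadoop-5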
Created 09-01-2017 06:33 PM
The config files are under /etc/hadoop/conf, /etc/hive/conf, and /etc/hive/conf/conf.server.
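To see which hostnames those files actually contain, something like this can help (a rough sketch; the property names are the usual HDFS/YARN ones, adjust the pattern to whatever is in your configs):

# List hostname-bearing settings in the Hadoop client configs
grep -nE 'fs\.defaultFS|dfs\.namenode|yarn\.resourcemanager' /etc/hadoop/conf/*.xml

# Confirm that a reported FQDN resolves from this node
getent hosts pap-hadoop-1.mynet.org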