Created 11-24-2017 04:17 AM
Hi,
@Jay Kumar SenSharma
I'm getting the following error when trying to access a file on HDFS. I am able to ping "node1.mydomain" from the other machine where this pyspark script is running.
File "/opt/<mysoftware>/depLibs/usr/local/lib/python2.7/site-packages/hdfs/client.py", line 44, in _on_error
    raise HdfsError(message)
HdfsError: <HTML><HEAD> <TITLE>Network Error</TITLE> </HEAD> <BODY> <FONT face="Helvetica"> <big><strong></strong></big><BR> </FONT> <blockquote> <TABLE border=0 cellPadding=1 width="80%"> <TR><TD> <FONT face="Helvetica"> <big>Network Error (dns_unresolved_hostname)</big> <BR> <BR> </FONT> </TD></TR> <TR><TD> <FONT face="Helvetica"> Your requested host "node1.mydomain" could not be resolved by DNS. </FONT> </TD></TR> <TR><TD> <FONT face="Helvetica"> </FONT> </TD></TR> <TR><TD> <FONT face="Helvetica" SIZE=2> <BR> For assistance, contact your network support team. </FONT> </TD></TR> </TABLE> </blockquote> </FONT> </BODY></HTML>
Created 11-24-2017 04:17 AM
@Jay Kumar SenSharma request your help on this issue
Created 11-24-2017 04:23 AM
When a Spark job runs, it may execute on various cluster hosts (nodes), so you need to make sure that the DNS entry (or the "/etc/hosts" file entry) is configured properly on all hosts in the cluster so that they can resolve "node1.mydomain".
Please check that every cluster node has a correct "/etc/hosts" entry for "node1.mydomain", as follows. (10.10.10.10 is only an example IP address; please replace it with the actual IP address of node1.mydomain.)
10.10.10.10 node1.mydomain
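To confirm what a given node actually resolves, a small check like the one below could be run on each host. This is only a sketch: the helper name resolve_host is made up for illustration, and it assumes a Python interpreter is available on the node. It asks the local resolver (which consults "/etc/hosts" and DNS according to the node's own configuration) for the addresses of a hostname.

```python
import socket


def resolve_host(hostname):
    """Return the sorted IP addresses the local resolver reports for hostname.

    Returns an empty list when the name cannot be resolved.
    resolve_host is a hypothetical helper name, not part of any library.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Raised when neither /etc/hosts nor DNS can resolve the name.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = "node1.mydomain"  # hostname taken from the error message
    addresses = resolve_host(host)
    if addresses:
        print("%s resolves to: %s" % (host, ", ".join(addresses)))
    else:
        print("%s could not be resolved on this node" % host)
```

If the name resolves on every node but the HDFS client still reports dns_unresolved_hostname, the lookup is probably being done somewhere else, for example by an intermediate proxy rather than by the node itself.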
Created 11-24-2017 04:28 AM
@Jay Kumar SenSharma Thanks for your reply. Yes, I have entries in /etc/hosts. All of this was working before I restarted all the Hortonworks services.
Created 11-24-2017 04:32 AM
It looks like hostname resolution in your network is happening via the DNS server instead of the "/etc/hosts" file.
If it is failing only for a particular host, another possible cause may be that this node has some environmental difference compared to the rest of the cluster that prevents DNS resolution from working properly.
The network infrastructure team might help in resolving the DNS issues, as I suspect that it is related to the DNS settings.
Created 11-24-2017 04:42 AM
@Jay Kumar SenSharma Thanks for your help. It turned out that I had to remove the proxy configurations and restart all services for the change to take effect, rather than just restarting the HDFS service. After that, the HDFS read operation started working 🙂
Created 11-24-2017 04:47 AM
Wow!! Good of you to share the findings. Yes, an explicit proxy setting forces the cluster nodes to pass their requests through the network proxy, which might not be aware of "node1.mydomain". So it is good to remove the proxy settings.
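For anyone hitting the same error, a quick way to spot leftover proxy settings in the environment of the process is sketched below. The helper names proxy_settings and is_exempt are made up for illustration, and the no_proxy matching is a simplified approximation (exact host or domain suffix), not the exact logic of any particular HTTP library. It also assumes the proxy is configured through the usual http_proxy / https_proxy / no_proxy environment variables, which may not be how it was set in this cluster.

```python
import os

# Conventional proxy variable names; both lower and upper case are checked.
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def proxy_settings(environ=None):
    """Collect the proxy-related variables that are set in an environment mapping."""
    env = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.upper()):
            value = env.get(key)
            if value:
                found[key] = value
    return found


def is_exempt(hostname, no_proxy):
    """Simplified no_proxy check: exact host match or domain-suffix match."""
    hostname = hostname.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lstrip(".").lower()
        if not entry:
            continue
        if entry == "*" or hostname == entry or hostname.endswith("." + entry):
            return True
    return False


if __name__ == "__main__":
    settings = proxy_settings()
    print("Proxy variables set:", settings or "none")
    no_proxy = settings.get("no_proxy") or settings.get("NO_PROXY") or ""
    print("node1.mydomain exempt from proxy:", is_exempt("node1.mydomain", no_proxy))
```

If a proxy variable is set and the cluster hostname is not exempt, WebHDFS requests are likely to be routed through the proxy, which matches the HTML "Network Error" page returned in the original error.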
It would also be great to close this thread by clicking the "Accept" button so that other HCC users can quickly find the solution when they encounter the same issue.