I am getting a hostname resolution issue while trying to deploy a 4-node cluster on Azure VMs. I am able to register all my hosts successfully, but I am seeing a warning. Details are below.
My ambari-agents are successfully registered with ambari-server, but I am seeing a hostname resolution warning which says that not all of the hosts could resolve the hostnames of the other hosts.
I have added the IP addresses and hostnames of all the hosts to the /etc/hosts file on all 4 nodes, but the hostnames still do not resolve. Is there anything else I need to do?
Below is my example /etc/hosts file on the edge node (where abcd, etf, sco and jfjf are the hostnames of the edge, master, slave1 and slave2 nodes respectively). The same goes for all the other nodes as well.
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
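A rough way to reproduce the resolution check by hand on each node is sketched below. This is not what Ambari itself runs; it only asks the system resolver (which consults /etc/hosts on a typical setup) for each peer name. The function name `resolve_peers` and the `PEERS` list are made up for illustration, and the four names are the placeholder hostnames from the post.

```python
import socket

# Placeholder hostnames from the post; replace with the real names of the four nodes.
PEERS = ["abcd", "etf", "sco", "jfjf"]


def resolve_peers(names, resolve=socket.gethostbyname):
    """Return {name: ip or None} for each name.

    `resolve` defaults to the system resolver; it is a parameter only so
    the function can be exercised without touching the network.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except socket.error:
            # socket.gaierror (name not found) is a subclass of socket.error.
            results[name] = None
    return results
```

Running `resolve_peers(PEERS)` on every node and comparing the results should show which node fails to resolve which peer: a `None` value means the name did not resolve on that node.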
Please note that the ambari.ini file on slave1, slave2 and the edge node has an IP address as the hostname, while the master node's file has the hostname of the master node. This is how it is set up, and it is working. I tried to register the ambari agents by giving the hostname in the ambari.ini files of the slaves and the edge node, but registration failed, whereas the master's ambari agent was able to register with the ambari server using the FQDN.
@rahul gulati I do not see any issue, as long as you see the correct hostnames listed at the following URL:
Please replace "AMBARI_HOST" and "CLUSTER_NAME" in the URL in your browser, and then check whether you are seeing the correct hosts with the correct hostnames.
To look into the issue in more detail, it would be good if you could share the exact error/warning message that you are getting while starting the agents/server.
Are you sure that you get the correct FQDN when you run the following command on each of those hosts individually?
- The Ambari agent sends a registration request to ambari-server every time the agent is restarted. In the registration request it also sends its own FQDN. The agent uses the following approach in Python to find the FQDN, so please check that all your hosts return the correct hostname:
python -c "import socket; print(socket.getfqdn())"
Also, on the local system the agent might use the following call, which does not actually return the FQDN as the one above does. So a local agent might behave a bit differently than a remote one.
python -c "import socket; print(socket.gethostname())"
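The two one-liners above can be combined into a small script that shows both values side by side on each host. This is only a sketch for comparing the two calls; the helper name `describe_local_names` is made up here, and the "has a domain part" test is a simple heuristic (a dot in the name), not something Ambari checks.

```python
import socket


def describe_local_names():
    """Collect the short hostname and the FQDN as Python reports them on this host."""
    short_name = socket.gethostname()
    fqdn = socket.getfqdn()
    return {
        "hostname": short_name,
        "fqdn": fqdn,
        # Heuristic only: a name without any dot is probably not fully qualified.
        "fqdn_has_domain": "." in fqdn,
        "names_match": short_name == fqdn,
    }


if __name__ == "__main__":
    for key, value in describe_local_names().items():
        print("%-16s %s" % (key, value))
```

If the `fqdn` value differs from node to node in an unexpected way (for example, one node reports a name with a domain suffix and another does not), that is worth comparing against the names used in /etc/hosts.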
I am getting the correct FQDN when I run hostname -f. I matched it against my Azure portal as well. The only difference is that the Azure portal shows .westus.azure.com appended to the end of the hostname, while on the machines it is just the hostname without that suffix. I tried adding the suffixed name to /etc/hosts as well, and it did not work.
If you are using Azure (from your hostname it looks like you are), then you should be using the "Private IP" address inside your /etc/hosts file (if you are using the public IP, please change it). This might not be related to your query, though.
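To check the private-versus-public point quickly, the /etc/hosts entries can be scanned for addresses outside the usual private ranges. The sketch below (Python 3) assumes the virtual network uses RFC 1918 ranges, which is common but not guaranteed. The helper names and the sample addresses are placeholders for illustration, not the real entries from the post.

```python
import ipaddress

# Placeholder sample; these addresses are invented for illustration.
SAMPLE_HOSTS = """\
127.0.0.1 localhost localhost.localdomain
::1 localhost localhost6
10.0.0.4 abcd
10.0.0.5 etf   # master
52.160.0.10 sco
"""

# RFC 1918 private IPv4 ranges.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def parse_hosts(text):
    """Return a list of (ip, [names]) from /etc/hosts-style text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def non_private_entries(entries):
    """Return entries whose address is neither loopback nor in an RFC 1918 range."""
    flagged = []
    for ip, names in entries:
        addr = ipaddress.ip_address(ip)
        if addr.is_loopback:
            continue
        if addr.version == 4 and any(addr in net for net in RFC1918):
            continue
        flagged.append((ip, names))
    return flagged
```

To run it against a real file, pass the contents of /etc/hosts to `parse_hosts` instead of `SAMPLE_HOSTS`; any entry returned by `non_private_entries` is a candidate for replacement with the VM's private IP.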
Just to reiterate
Does it have something to do with the IP in the ambari.ini file on the slaves and the edge node? As I mentioned earlier, the ambari.ini file on the slaves and the edge node has the IP of the master as the server hostname, while the master node has the FQDN as the server hostname.
Just mentioning again.