Created on 02-22-2019 08:50 AM - edited 02-22-2019 08:58 AM
Good evening. I get this message after adding two new hosts to the Hadoop cluster.
Technical info:
Nodes - Debian GNU/Linux 8.9 (jessie)
cloudera-manager-agent 5.13.0-1.cm5130.p0.55~jessie-cm5 amd64
cloudera-manager-daemons 5.13.0-1.cm5130.p0.55~jessie-cm5 all
Previously we added nodes through the web interface, but these 2 nodes were added by hand:
1. Add the Cloudera repo
2. Install the packages from the repo
3. Configure /etc/cloudera-scm-agent/config.ini:
server_host=hadoop-server.test.com
After starting the Cloudera agents, I saw two strange things in the Cloudera web interface:
1. All Hosts - my new host - CDH Version is None
2. The error from the topic title: "The hostname and canonical name for this host are not consistent when checked from a Java process."
I don't know whether these two issues are connected.
I tried to resolve the second issue:
python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
regensburg.test.com 1.1.1.1
This command comes from the /etc/cloudera-scm-agent/config.ini file. The response is valid: the A record points to this IP and the PTR record points to the domain name.
But I found another command on the internet, and it returns a different response:
java -classpath /usr/share/cmf/lib/agent-5.*.jar com.cloudera.cmon.agent.DnsTest
{"status": "0", "ip": "1.1.1.1", "hostname": "regensburg", "canonicalname": "regensburg.test.com", "localhostDuration": "4", "canonicalnameDuration": "1" }
Ok, hostname is short.
root@regensburg ~ # hostname
regensburg
root@regensburg ~ # hostname -f
regensburg.test.com
In /etc/hosts we have this record:
1.1.1.1 regensburg.test.com regensburg
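As a side note, hosts(5) treats the first name after the IP address as the canonical name and any following names as aliases. A minimal Python sketch of that reading of the record above; parse_hosts_line is a hypothetical helper for illustration only, not part of any Cloudera tool:

```python
def parse_hosts_line(line):
    """Split one /etc/hosts record into (ip, canonical_name, aliases).

    Returns None for blank lines and comment-only lines.
    """
    # Drop any trailing comment, then split on whitespace.
    fields = line.split("#", 1)[0].split()
    if len(fields) < 2:
        return None
    ip, canonical, aliases = fields[0], fields[1], fields[2:]
    return ip, canonical, aliases


# The record from this host: the FQDN is first, the short name is an alias.
print(parse_hosts_line("1.1.1.1 regensburg.test.com regensburg"))
```

Read this way, the FQDN is the canonical name in this record and the short name is only an alias.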
Why does this error occur? On the other nodes we don't get this error. Some info from a working node:
root@berlin ~ # python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"
berlin.test.com
2.2.2.2
java -classpath /usr/share/cmf/lib/agent-5.*.jar com.cloudera.cmon.agent.DnsTest
{"status": "0", "ip": "2.2.2.2", "hostname": "berlin.test.com", "canonicalname": "berlin.test.com", "localhostDuration": "4", "canonicalnameDuration": "0" }
root@berlin ~ # hostname
berlin
root@berlin ~ # hostname -f
berlin.test.com
root@berlin ~ # cat /etc/hosts | grep berli
2.2.2.2 berlin.test.com berlin
root@berlin ~ # dpkg -l | grep clouder
ii cloudera-manager-agent 5.13.0-1.cm5130.p0.55~jessie-cm5 amd64 The Cloudera Manager Agent
ii cloudera-manager-daemons 5.13.0-1.cm5130.p0.55~jessie-cm5 all Provides daemons for monitoring Hadoop and related tools.
On all the other nodes com.cloudera.cmon.agent.DnsTest returns the FQDN as the hostname, but on this new node it returns the short hostname.
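The two DnsTest outputs above differ only in the "hostname" field. A small Python sketch that makes the comparison explicit. It assumes that "consistent" means the hostname and canonicalname fields are equal strings, which is a guess based on the error text and not confirmed Cloudera behavior; names_consistent is a hypothetical helper:

```python
import json


def names_consistent(dns_test_output):
    """Return True if hostname and canonicalname match in DnsTest JSON output.

    Assumption: plain string equality is what "consistent" means here.
    """
    result = json.loads(dns_test_output)
    return result["hostname"] == result["canonicalname"]


# Outputs copied from the two nodes in this post.
regensburg = '{"status": "0", "ip": "1.1.1.1", "hostname": "regensburg", "canonicalname": "regensburg.test.com", "localhostDuration": "4", "canonicalnameDuration": "1" }'
berlin = '{"status": "0", "ip": "2.2.2.2", "hostname": "berlin.test.com", "canonicalname": "berlin.test.com", "localhostDuration": "4", "canonicalnameDuration": "0" }'

print(names_consistent(regensburg))  # False: short name vs FQDN
print(names_consistent(berlin))      # True: both are the FQDN
```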
Created 02-23-2019 07:49 AM
Very strange: today the error about the bad hostname is gone. The issue appeared on 14 February (after adding the new hosts to the cluster) and disappeared only on 23 February. I think there is some cache on the Cloudera software side.
Created 02-25-2019 06:41 AM
Just a quick clarification: the DnsTest code directly calls the Java getHostName() and getCanonicalHostName() methods to retrieve the values. The reported mismatch was caused either by the name resolution settings at that time or by nscd caching the old values on the host.
Created 02-26-2019 12:53 AM
Hello @unix196,
Good to see the issue got resolved 🙂
In addition to the above response, I faced the same issue due to DNS misconfiguration and DNS server caching.