Created 02-22-2017 06:41 AM
Node addition failed with
Registering with the server... Registration with the server failed.
Created 02-22-2017 07:08 AM
On the agent host, run the following first, then try again:
yum remove ambari-agent -y
rm -rf /etc/ambari-agent/*
rm -rf /var/lib/ambari-agent/*
rm -f /usr/sbin/ambari*
rm -f /usr/lib/python2.6/site-packages/ambari_commons
rm -rf /usr/lib/python2.6/site-packages/resource_management
rm -rf /usr/lib/python2.6/site-packages/ambari_jinja2
Then try installing the agent from the Ambari UI, or install the agent manually as follows:
wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.4.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
yum install ambari-agent -y
Created 02-22-2017 06:43 AM
Where do you see that error?
Can you please share a screenshot of the error if it is in the UI?
Also, do you see a more detailed error in "/var/log/ambari-server/ambari-server.log"?
If the ambari-agent is already running, it is also worth looking at "/var/log/ambari-agent/ambari-agent.log" on the agent host.
Created 02-22-2017 06:54 AM
I am getting this error while adding a new node using Ambari. I have uploaded ambari-server.log, ambari-agent.log and a document with the error screenshot.
Please advise.
Thank you,
Sachin A
Attachment: nodeadditionerror.zip
Created 02-22-2017 07:01 AM
Can you make sure the Ambari server version and the agent version are the same, using the commands below:
ambari-server --version (run on the Ambari server)
ambari-agent --version (run on datanode3)
The error below clearly indicates a version mismatch:
ERROR 2017-02-21 22:29:57,286 Controller.py:155 - Cannot register host with non compatible agent version, hostname=datanode3.localdomain, agentVersion=2.1.1, serverVersion=2.4.2.0
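If the agent log is long, the two versions can be pulled out of lines like the one above with a small filter. This is only a sketch: extract_versions is a hypothetical helper name, and it assumes the message keeps the "agentVersion=..., serverVersion=..." layout shown in this thread.

```shell
# Hypothetical helper: read log lines on stdin and print
# "agent=<version> server=<version>" for every line that carries
# both agentVersion= and serverVersion= fields.
extract_versions() {
    sed -n 's/.*agentVersion=\([^,]*\), serverVersion=\([^, ]*\).*/agent=\1 server=\2/p'
}

# The log line quoted in this thread, used as sample input.
line='ERROR 2017-02-21 22:29:57,286 Controller.py:155 - Cannot register host with non compatible agent version, hostname=datanode3.localdomain, agentVersion=2.1.1, serverVersion=2.4.2.0'
printf '%s\n' "$line" | extract_versions
# prints: agent=2.1.1 server=2.4.2.0
```

On a real host the same function could be fed the agent log, for example: extract_versions < /var/log/ambari-agent/ambari-agent.log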
Created 02-22-2017 07:03 AM
Please check whether you have the same Ambari repo file in the /etc/yum.repos.d/ folder on both hosts.
You can copy the Ambari repo file from the Ambari server node to datanode3 and reinstall the ambari-agent package:
yum remove ambari-agent
yum install ambari-agent
This should resolve the issue.
Created 02-22-2017 07:05 AM
[ambari@ambari ambari-server]$ ambari-server --version
2.4.2.0-136
[ambari@datanode3 ~]$ ambari-agent --version
2.1.1
Created 02-22-2017 08:57 AM
Can you uninstall the Ambari agent on datanode3 and install the same version as the ambari-server there (i.e. 2.4.2.0-136)?
The agent version output should then look something like this:
[ambari@datanode3 ~]$ ambari-agent --version
2.4.2.0-136
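The comparison can also be scripted. The sketch below uses the two values reported earlier in this thread as sample input; versions_match is a hypothetical helper, and ignoring the "-136" build suffix is an assumption made here for illustration, not documented Ambari behaviour.

```shell
# Hypothetical helper: succeed when two version strings agree after
# dropping everything from the first "-" onward (e.g. "-136").
versions_match() {
    [ "${1%%-*}" = "${2%%-*}" ]
}

server_version="2.4.2.0-136"   # output of ambari-server --version in this thread
agent_version="2.1.1"          # output of ambari-agent --version in this thread

if versions_match "$server_version" "$agent_version"; then
    echo "versions match"
else
    echo "version mismatch: server=$server_version agent=$agent_version"
fi
```

On live hosts the two variables would instead be filled from the output of the --version commands shown above.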
Created on 04-18-2018 01:29 PM - edited 08-19-2019 03:43 AM
I am facing the exact same issue. Asked a new question before I saw this one.
For me the versions are the same, still the problem persists.
Could not find anything useful in the server logs or agent logs, so decided to start from zero again.
I removed ambari-agent from the new node and all the files and tried to install from ambari UI.
It errors out now. The server log shows the following:
18 Apr 2018 13:22:52,496 INFO [pool-18-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/2
18 Apr 2018 13:22:52,496 INFO [pool-18-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:07,054 INFO [ambari-client-thread-39] BootStrapImpl:108 - BootStrapping hosts ambaridn01.informatica.com:
18 Apr 2018 13:23:07,055 INFO [Thread-32] BSRunner:189 - Kicking off the scheduler for polling on logs in /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:07,055 INFO [Thread-32] BSRunner:258 - Host= ambaridn01.informatica.com bs=/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py requestDir=/var/run/ambari-server/bootstrap/3 user=root sshPort=22 keyfile=/var/run/ambari-server/bootstrap/3/sshKey passwordFile null server=ip-172-31-10-185.ap-south-1.compute.internal version=2.6.1.5 serverPort=8080 userRunAs=root timeout=300
18 Apr 2018 13:23:07,055 INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:07,057 INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:07,058 INFO [Thread-32] BSRunner:286 - Bootstrap output, log=/var/run/ambari-server/bootstrap/3/bootstrap.err /var/run/ambari-server/bootstrap/3/bootstrap.out at ip-172-31-10-185.ap-south-1.compute.internal
18 Apr 2018 13:23:17,057 INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:17,057 INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:27,058 INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:27,058 INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:28,062 INFO [Thread-32] BSRunner:310 - Script log Mesg
INFO:root:BootStrapping hosts ['ambaridn01.informatica.com'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat7 with user 'root'with ssh Port '22' sshKey File /var/run/ambari-server/bootstrap/3/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/3 ambari: ip-172-31-10-185.ap-south-1.compute.internal; server_port: 8080; ambari version: 2.6.1.5; user_run_as: root
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host ambaridn01.informatica.com fails because previous action finished with non-zero exit code (255)
ERROR MESSAGE: Connection to ambaridn01.informatica.com closed.
STDOUT: tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Connection to ambaridn01.informatica.com closed.
INFO:root:Finished parallel bootstrap
18 Apr 2018 13:23:28,062 INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:28,062 INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:38,870 INFO [ambari-hearbeat-monitor] HeartbeatMonitor:318 - KAFKA_BROKER is at INSTALLED adding more payload per agent ask
18 Apr 2018 13:24:38,948 INFO [ambari-hearbeat-monitor] HeartbeatMonitor:318 - KAFKA_BROKER is at INSTALLED adding more payload per agent ask
18 Apr 2018 13:24:47,794 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:607 - State of service component METRICS_COLLECTOR of service AMBARI_METRICS of cluster mycluster has changed from INSTALLED to STARTED at host ambariserv.informatica.com according to STATUS_COMMAND report
18 Apr 2018 13:24:59,091 ERROR [ambari-client-thread-37] MetricsRequestHelper:115 - Error getting timeline metrics : Connection refused (Connection refused)
18 Apr 2018 13:24:59,092 ERROR [ambari-client-thread-37] MetricsRequestHelper:122 - Cannot connect to collector: SocketTimeoutException for ambariserv.informatica.com
18 Apr 2018 13:25:51,885 INFO [pool-17-thread-1] MetricSinkWriteShardHostnameHashingStrategy:42 - Calculated collector shard ip-172-31-10-185.ap-south-1.compute.internal based on hostname: ip-172-31-10-185.ap-south-1.compute.internal (
Created 04-18-2018 02:10 PM
*Update*
It turned out to be a hostname issue. The hostname -f command returned localhost instead of the FQDN.
This is odd, as both hostname and hostnamectl showed the FQDN.
Updating the /etc/hosts file did the trick.
So instead of
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 FQDN
I updated it to
127.0.0.1 FQDN localhost localhost.localdomain localhost4 localhost4.localdomain4
This resolved the odd behaviour of the hostname -f command, and the server registration completed successfully.
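A quick way to spot this ordering problem before registering a host is to look at which name comes first on the address line. With files-based lookups the first name after the IP address is normally treated as the canonical one, which would explain what hostname -f returned here. The sketch below is illustrative only; first_name_for_ip is a hypothetical helper, not part of Ambari.

```shell
# Hypothetical helper: print the first name listed for an IP address
# in a hosts file. With files-based resolution that first name is
# normally the canonical one.
# $1 = IP address, $2 = hosts file (defaults to /etc/hosts)
first_name_for_ip() {
    awk -v ip="$1" '$1 == ip { print $2; exit }' "${2:-/etc/hosts}"
}
```

Running first_name_for_ip 127.0.0.1 on the new node should print the FQDN after the fix above. If it prints localhost while the FQDN sits further along the same line, the host is likely to hit the same registration problem.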