Ambari host registration fails with "Registration with the server failed."

Contributor

Adding a node failed with:

Registering with the server...
Registration with the server failed.
1 ACCEPTED SOLUTION

Master Mentor

@Sachin Ambardekar

On the agent host, do the following first, then try again:

yum remove ambari-agent -y
rm -rf /etc/ambari-agent/*
rm -rf /var/lib/ambari-agent/*
rm -f /usr/sbin/ambari*
rm -f /usr/lib/python2.6/site-packages/ambari_commons
rm -rf /usr/lib/python2.6/site-packages/resource_management
rm -rf /usr/lib/python2.6/site-packages/ambari_jinja2

Then try installing the agent from the Ambari UI, or install the agent manually as follows:

wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.4.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
yum install ambari-agent -y


10 REPLIES

Master Mentor

@Sachin Ambardekar

Where do you see that error?

Can you please share a screenshot of the error if it is in the UI?

Also, do you see a more detailed error in "/var/log/ambari-server/ambari-server.log"?

If the ambari-agent is already running, then on the agent host it is worth looking at "/var/log/ambari-agent/ambari-agent.log" as well.
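A quick way to scan those logs is to filter for ERROR lines. The following is only a sketch: recent_errors is a hypothetical helper name, and the paths in the usage comments are the default log locations, which may differ on your hosts.

```shell
# Print the most recent ERROR lines from an Ambari log file.
# Usage: recent_errors <log-file> [max-lines]
# (recent_errors is a hypothetical helper, not part of Ambari.)
recent_errors() {
    log_file=$1
    max_lines=${2:-20}
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Server log lines carry the level after the timestamp;
    # agent log lines start with it. Match both.
    grep -E '(^| )ERROR ' "$log_file" | tail -n "$max_lines"
}

# Example usage (assumed default locations):
# recent_errors /var/log/ambari-server/ambari-server.log
# recent_errors /var/log/ambari-agent/ambari-agent.log 5
```

Run the first command on the Ambari server and the second on the agent host.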

Contributor

I am getting this error while adding a new node using Ambari. I have uploaded ambari-server.log, ambari-agent.log and a document with the error screenshot.

Please advise.

Thank you,

Sachin Anodeadditionerror.zip

Super Guru

@Sachin Ambardekar

Can you make sure the Ambari server and agent versions are the same, using the commands below?

ambari-server --version    # on the Ambari server

ambari-agent --version     # on datanode3

The error below clearly indicates a version mismatch:

ERROR 2017-02-21 22:29:57,286 Controller.py:155 - Cannot register host with non compatible agent version, hostname=datanode3.localdomain, agentVersion=2.1.1, serverVersion=2.4.2.0
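If the log is long, the two versions can be pulled out of that error line directly. This is a minimal sketch, assuming the line keeps the "agentVersion=..., serverVersion=..." layout shown above; versions_from_error is a hypothetical helper name.

```shell
# Extract agentVersion and serverVersion from a registration error line.
# Prints "<agentVersion> <serverVersion>", or nothing if the line does not match.
# (versions_from_error is a hypothetical helper, not part of Ambari.)
versions_from_error() {
    printf '%s\n' "$1" |
        sed -n 's/.*agentVersion=\([^,]*\), serverVersion=\([^, ]*\).*/\1 \2/p'
}

line='ERROR 2017-02-21 22:29:57,286 Controller.py:155 - Cannot register host with non compatible agent version, hostname=datanode3.localdomain, agentVersion=2.1.1, serverVersion=2.4.2.0'
versions_from_error "$line"
# prints: 2.1.1 2.4.2.0
```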

Super Guru

Please check whether you have the same Ambari repo file in the /etc/yum.repos.d/ folder.

You can copy the Ambari repo file from the Ambari server node to datanode3 and reinstall the ambari-agent package:

yum remove ambari-agent

yum install ambari-agent

This should resolve the issue.

Contributor

[ambari@ambari ambari-server]$ ambari-server --version

2.4.2.0-136

[ambari@datanode3 ~]$ ambari-agent --version

2.1.1

Super Guru
@Sachin Ambardekar

Can you uninstall the Ambari agent on datanode3 and install the same version as the ambari-server (i.e. 2.4.2.0-136)?

The ambari-agent version output should then look something like this:

[ambari@datanode3 ~]$ ambari-agent --version
2.4.2.0-136
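To make that check repeatable, the two version strings can be compared in a small script. This is a sketch only: strip_build and same_ambari_version are hypothetical helper names, and ignoring the trailing build number (the "-136" part) is my assumption, not a documented Ambari rule.

```shell
# Drop a trailing "-<build>" suffix, e.g. 2.4.2.0-136 -> 2.4.2.0
# (assumption: the build number can be ignored when comparing).
strip_build() {
    printf '%s\n' "${1%%-*}"
}

# Succeed only when both versions are identical after dropping the build number.
same_ambari_version() {
    [ "$(strip_build "$1")" = "$(strip_build "$2")" ]
}

# Values taken from the outputs posted in this thread.
if same_ambari_version "2.4.2.0-136" "2.1.1"; then
    echo "versions match"
else
    echo "version mismatch: reinstall the agent from the server's repo"
fi
```

On real hosts, the two arguments would be the outputs of ambari-server --version and ambari-agent --version.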

Explorer

I am facing the exact same issue. I asked a new question before I saw this one.

For me the versions are the same, but the problem persists.

I could not find anything useful in the server or agent logs, so I decided to start from scratch.

I removed ambari-agent and all of its files from the new node, then tried to install it from the Ambari UI.

It now errors out. The server log snippet is below the screenshot:

[Screenshot: 68522-screenshot-2.png]

Log shows the following:

18 Apr 2018 13:22:52,496  INFO [pool-18-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/2
18 Apr 2018 13:22:52,496  INFO [pool-18-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:07,054  INFO [ambari-client-thread-39] BootStrapImpl:108 - BootStrapping hosts ambaridn01.informatica.com:
18 Apr 2018 13:23:07,055  INFO [Thread-32] BSRunner:189 - Kicking off the scheduler for polling on logs in /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:07,055  INFO [Thread-32] BSRunner:258 - Host= ambaridn01.informatica.com bs=/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py requestDir=/var/run/ambari-server/bootstrap/3 user=root sshPort=22 keyfile=/var/run/ambari-server/bootstrap/3/sshKey passwordFile null server=ip-172-31-10-185.ap-south-1.compute.internal version=2.6.1.5 serverPort=8080 userRunAs=root timeout=300
18 Apr 2018 13:23:07,055  INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:07,057  INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:07,058  INFO [Thread-32] BSRunner:286 - Bootstrap output, log=/var/run/ambari-server/bootstrap/3/bootstrap.err /var/run/ambari-server/bootstrap/3/bootstrap.out at ip-172-31-10-185.ap-south-1.compute.internal
18 Apr 2018 13:23:17,057  INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:17,057  INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:27,058  INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:27,058  INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:28,062  INFO [Thread-32] BSRunner:310 - Script log Mesg


INFO:root:BootStrapping hosts ['ambaridn01.informatica.com'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat7 with user 'root'with ssh Port '22' sshKey File /var/run/ambari-server/bootstrap/3/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/3 ambari: ip-172-31-10-185.ap-south-1.compute.internal; server_port: 8080; ambari version: 2.6.1.5; user_run_as: root
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host ambaridn01.informatica.com fails because previous action finished with non-zero exit code (255)
ERROR MESSAGE: Connection to ambaridn01.informatica.com closed.
STDOUT: tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified


Connection to ambaridn01.informatica.com closed.


INFO:root:Finished parallel bootstrap


18 Apr 2018 13:23:28,062  INFO [pool-19-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/3
18 Apr 2018 13:23:28,062  INFO [pool-19-thread-1] BSHostStatusCollector:62 - HostList for polling on [ambaridn01.informatica.com]
18 Apr 2018 13:23:38,870  INFO [ambari-hearbeat-monitor] HeartbeatMonitor:318 - KAFKA_BROKER is at INSTALLED adding more payload per agent ask
18 Apr 2018 13:24:38,948  INFO [ambari-hearbeat-monitor] HeartbeatMonitor:318 - KAFKA_BROKER is at INSTALLED adding more payload per agent ask
18 Apr 2018 13:24:47,794  INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:607 - State of service component METRICS_COLLECTOR of service AMBARI_METRICS of cluster mycluster has changed from INSTALLED to STARTED at host ambariserv.informatica.com according to STATUS_COMMAND report
18 Apr 2018 13:24:59,091 ERROR [ambari-client-thread-37] MetricsRequestHelper:115 - Error getting timeline metrics : Connection refused (Connection refused)
18 Apr 2018 13:24:59,092 ERROR [ambari-client-thread-37] MetricsRequestHelper:122 - Cannot connect to collector: SocketTimeoutException for ambariserv.informatica.com
18 Apr 2018 13:25:51,885  INFO [pool-17-thread-1] MetricSinkWriteShardHostnameHashingStrategy:42 - Calculated collector shard ip-172-31-10-185.ap-south-1.compute.internal based on hostname: ip-172-31-10-185.ap-south-1.compute.internal

Explorer

*Update*

It seems to be a hostname issue. The hostname -f command returned localhost instead of the FQDN.

This is weird, as both hostname and hostnamectl showed the FQDN.

Updating the hosts file did the trick.

So instead of

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 FQDN

I updated it to

127.0.0.1 FQDN localhost localhost.localdomain localhost4 localhost4.localdomain4

This resolved the weird issue with the hostname command, and the server registration completed successfully.
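The reason the order matters is that the first name after the IP address on a hosts line is treated as the canonical name, and the names after it are aliases. A rough way to check this before registering a host is sketched below; loopback_canonical_name and looks_like_fqdn are hypothetical helper names, and the "contains a dot and is not a localhost alias" test is only a heuristic I am assuming here.

```shell
# Print the canonical (first) name on the 127.0.0.1 line of a hosts file.
# Usage: loopback_canonical_name [hosts-file]   (defaults to /etc/hosts)
loopback_canonical_name() {
    awk '$1 == "127.0.0.1" { print $2; exit }' "${1:-/etc/hosts}"
}

# Heuristic: succeed if the name contains a dot and is not a localhost alias.
looks_like_fqdn() {
    case $1 in
        localhost|localhost.*|localhost4|localhost4.*) return 1 ;;
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check what this host reports as its fully qualified name.
name=$(hostname -f 2>/dev/null) || name=""
if looks_like_fqdn "$name"; then
    echo "hostname -f looks fine: $name"
else
    echo "hostname -f returned '$name'; check the name order in /etc/hosts"
fi
```

With the original hosts line, loopback_canonical_name prints localhost; after the reordering it prints the FQDN.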