Cannot Register the other hosts in the cluster from Ambari.

Contributor

I have two hosts, and I was able to install ambari-server on one of them. I can log in to the Ambari console, but when I try to register the other host in the cluster, registration fails with the error pasted below. I installed ambari-agent manually on the other host. From that host I can ssh to the ambari-server host and telnet to it on port 8080, but neither helped. Can anyone please help me understand what exactly the error is asking for and how to resolve it?

==========================
Creating target directory...
==========================

Command start time 2016-08-31 16:53:50

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Copying common functions script...
==========================

Command start time 2016-08-31 16:53:50

scp /usr/lib/python2.6/site-packages/ambari_commons
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Copying OS type check script...
==========================

Command start time 2016-08-31 16:53:50

scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Running OS type check...
==========================

Command start time 2016-08-31 16:53:50
Cluster primary/cluster OS family is redhat6 and local/current OS family is redhat6

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Checking 'sudo' package on remote host...
==========================

Command start time 2016-08-31 16:53:50
sudo-1.8.6p3-24.el6.x86_64

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Copying repo file to 'tmp' folder...
==========================

Command start time 2016-08-31 16:53:50

scp /etc/yum.repos.d/ambari.repo
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Moving file to repo dir...
==========================

Command start time 2016-08-31 16:53:50

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:50

==========================
Changing permissions for ambari.repo...
==========================

Command start time 2016-08-31 16:53:50

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:51

==========================
Copying setup script file...
==========================

Command start time 2016-08-31 16:53:51

scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=hadoop-node1localdomain.training, exitcode=0
Command end time 2016-08-31 16:53:51

==========================
Running setup agent script...
==========================

Command start time 2016-08-31 16:53:51
Host registration aborted. Ambari Agent host cannot reach Ambari Server 'hadoopmaster.training:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server

Connection to hadoop-node1localdomain.training closed.
SSH command execution finished
host=hadoop-node1localdomain.training, exitcode=1
Command end time 2016-08-31 16:53:54

ERROR: Bootstrap of host hadoop-node1localdomain.training fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: tcgetattr: Invalid argument
Connection to hadoop-node1localdomain.training closed.

STDOUT: Host registration aborted. Ambari Agent host cannot reach Ambari Server 'hadoopmaster.training:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server

Connection to hadoop-node1localdomain.training closed.
1 ACCEPTED SOLUTION

Contributor

I was able to resolve the issue. I just added the primary node at the top of the /etc/hosts file, and the host was then able to register. It had been trying to connect to the first host listed in the /etc/hosts file.
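
To see which address a host name gets from a hosts file, a small lookup helper can be useful. The following is only a sketch: hosts_lookup is a hypothetical helper name, not part of Ambari, and it only reads the file; it does not reproduce the full resolver behaviour (DNS, nsswitch.conf).

```shell
# Print the first address that a hosts-format file maps NAME to.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
# hosts_lookup is a hypothetical helper, not an Ambari command.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }      # strip comments
        NF < 2 { next }         # skip blank lines and lines without a name
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "${2:-/etc/hosts}"
}
```

For example, running hosts_lookup hadoopmaster.training on the agent host prints the address that /etc/hosts maps the server name to, or nothing if there is no entry. Running getent hosts hadoopmaster.training shows what the system resolver actually returns.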

Thanks for the help

8 REPLIES

Guru

What is the output of the following when run on the agent host?

hostname -f hadoopmaster.training

Contributor

hadoopmaster.training

Super Guru

@Praneender Vuppala

It is not clear from your description if you can ssh from ambari-server to the other host. You stated "I was able to ssh and telnet on port 8080 to the ambari-server from other host", which is irrelevant. The other direction is important.

Contributor

I meant that I am able to ssh and telnet from the second host (the one failing to register) to the host where ambari-server is installed. But as you can see in the error output I pasted, it says "Host registration aborted. Ambari Agent host cannot reach Ambari Server 'hadoopmaster.training:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server".

Super Guru

@Praneender Vuppala

Thanks. Got it. Let's go step by step.

1. Check that the agent and the server are up and running. If they are not started, start them. Then attempt your scenario again so that we can capture the logged errors in step 2. Write down the approximate time of your action so that you can search the logs by time.

2. Check the ambari-server and ambari-agent logs on their respective hosts. You can find them under /var/log/ambari-server and /var/log/ambari-agent, respectively:

cat ambari-server.log | grep ERROR

cat ambari-agent.log | grep ERROR

If that is too much output, you could use something like tail:

tail -10 ambari-server.log | grep ERROR

If your error occurred on a different day than the current one, you would need to grep through that day's log. In your case, if you re-attempted the failed action, that should not be necessary: the error should be in the current day's log.
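
The time-oriented search from step 1 can be sketched as a small shell helper. errors_near is a hypothetical name, and the sketch assumes that the time string you pass appears literally in the log lines, which depends on the timestamp format of the log you are reading.

```shell
# Print the last COUNT lines of FILE that contain ERROR and the literal
# string WHEN (for example a date plus the first digits of the time).
# Usage: errors_near FILE WHEN [COUNT]   (COUNT defaults to 10)
# errors_near is a hypothetical helper, not an Ambari command.
errors_near() {
    grep -- 'ERROR' "$1" | grep -F -- "$2" | tail -n "${3:-10}"
}
```

A call such as errors_near /var/log/ambari-agent/ambari-agent.log "2016-08-31 16:5" would then keep only the ERROR lines from around the failed registration, provided the log writes its timestamps in that form. Filtering first and applying tail last returns the most recent matching errors rather than the errors found in the last few lines of the file.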

Post the findings to this question and we will take it from there.

Contributor

Attachments: master-node-status.png, node-2-services-status.png, ambari-console-1.png, ambari-console-2.png, ambari-console-3.png

Please check the sequence of images I have attached. I don't know what is going wrong. I have tried many times and I get the same errors. Both hosts are VMs.

Contributor

Attachments: node-2-ambari-agent-log.png, error-log.png

Here is also the error log you asked for. Look at the log on node 2: why is it searching for ambari-server on the same node? (The log says it is connecting to ambari-server at http://localhost:8440.) This happened at the time the host failed to register. Any thoughts or solutions?
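
A connection attempt to localhost:8440 may mean that the agent's configuration still names localhost as the server. A minimal sketch for checking this, assuming the usual config location /etc/ambari-agent/conf/ambari-agent.ini and a hostname key in its [server] section; agent_server_host is a hypothetical helper name.

```shell
# Print the value of "hostname" from the [server] section of an
# ambari-agent.ini style file.
# Usage: agent_server_host [FILE]
# FILE defaults to /etc/ambari-agent/conf/ambari-agent.ini (assumed path).
# agent_server_host is a hypothetical helper, not an Ambari command.
agent_server_host() {
    awk -F= '
        # a section header switches the "inside [server]" flag on or off
        /^\[/ { in_server = ($0 ~ /^\[server\]/); next }
        in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
            gsub(/[ \t]/, "", $2)   # trim whitespace around the value
            print $2
            exit
        }
    ' "${1:-/etc/ambari-agent/conf/ambari-agent.ini}"
}
```

If this prints localhost on the node that fails to register, the agent on that node is looking for the server on itself rather than on hadoopmaster.training.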
