Registration with the server failed on EC2

New Contributor

I am very new to all this, but as a learning exercise I am trying to set up a 4-node Ambari cluster on EC2. I have a master node with both the agent and the server installed, and three nodes with agents only, all running CentOS 6.

When I try to create a cluster using the UI with manual host registration, all nodes except the master fail with the following error:

Registering with the server...
Registration with the server failed.

Although the agents were already running and the hosts were set up, I also tried automatic registration over SSH and got a more verbose error:

==========================
Creating target directory...
==========================

Command start time 2016-04-08 06:34:14

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Copying common functions script...
==========================

Command start time 2016-04-08 06:34:14

scp /usr/lib/python2.6/site-packages/ambari_commons
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Copying OS type check script...
==========================

Command start time 2016-04-08 06:34:14

scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Running OS type check...
==========================

Command start time 2016-04-08 06:34:14
Cluster primary/cluster OS family is redhat6 and local/current OS family is redhat6

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Checking 'sudo' package on remote host...
==========================

Command start time 2016-04-08 06:34:14
sudo-1.8.6p3-20.el6_7.x86_64

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Copying repo file to 'tmp' folder...
==========================

Command start time 2016-04-08 06:34:14

scp /etc/yum.repos.d/ambari.repo
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Moving file to repo dir...
==========================

Command start time 2016-04-08 06:34:14

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:14

==========================
Changing permissions for ambari.repo...
==========================

Command start time 2016-04-08 06:34:14

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:15

==========================
Copying setup script file...
==========================

Command start time 2016-04-08 06:34:15

scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:15

==========================
Running setup agent script...
==========================

Command start time 2016-04-08 06:34:15
('WARNING 2016-04-08 06:34:21,213 AlertSchedulerHandler.py:243 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-04-08 06:34:21,213 AlertSchedulerHandler.py:139 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1230a50>; currently running: False
INFO 2016-04-08 06:34:21,217 hostname.py:86 - Read public hostname \'ec2-52-63-181-16.ap-southeast-2.compute.amazonaws.com\' from http://169.254.169.254/latest/meta-data/public-hostname
INFO 2016-04-08 06:34:21,220 logger.py:67 - call[\'test -w /\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,224 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,224 logger.py:67 - call[\'test -w /dev/shm\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,228 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,228 logger.py:67 - call[\'test -w /grid/1\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,232 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,232 logger.py:67 - call[\'test -w /grid/2\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,236 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,236 logger.py:67 - call[\'test -w /grid/3\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,239 logger.py:67 - call returned (0, \'\')
ERROR 2016-04-08 06:34:21,251 main.py:309 - Fatal exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 306, in <module>
    main(heartbeat_stop_callback)
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 297, in main
    ExitHelper.execute_cleanup()
TypeError: unbound method execute_cleanup() must be called with ExitHelper instance as first argument (got nothing instead)
', None)
('WARNING 2016-04-08 06:34:21,213 AlertSchedulerHandler.py:243 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-04-08 06:34:21,213 AlertSchedulerHandler.py:139 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1230a50>; currently running: False
INFO 2016-04-08 06:34:21,217 hostname.py:86 - Read public hostname \'ec2-52-63-181-16.ap-southeast-2.compute.amazonaws.com\' from http://169.254.169.254/latest/meta-data/public-hostname
INFO 2016-04-08 06:34:21,220 logger.py:67 - call[\'test -w /\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,224 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,224 logger.py:67 - call[\'test -w /dev/shm\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,228 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,228 logger.py:67 - call[\'test -w /grid/1\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,232 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,232 logger.py:67 - call[\'test -w /grid/2\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,236 logger.py:67 - call returned (0, \'\')
INFO 2016-04-08 06:34:21,236 logger.py:67 - call[\'test -w /grid/3\'] {\'sudo\': True, \'timeout\': 5}
INFO 2016-04-08 06:34:21,239 logger.py:67 - call returned (0, \'\')
ERROR 2016-04-08 06:34:21,251 main.py:309 - Fatal exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 306, in <module>
    main(heartbeat_stop_callback)
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 297, in main
    ExitHelper.execute_cleanup()
TypeError: unbound method execute_cleanup() must be called with ExitHelper instance as first argument (got nothing instead)
', None)

Connection to ec2node1.hdp2 closed.
SSH command execution finished
host=ec2node1.hdp2, exitcode=0
Command end time 2016-04-08 06:34:23

Registering with the server...
Registration with the server failed.

Any help or pointers would be greatly appreciated!

4 REPLIES

New Contributor

Thanks @rnettleton, I arrived at a similar solution while waiting for the question to be moderated, but your answer definitely makes it clearer why it wasn't working!

Rising Star

You are running into AMBARI-14431. This should not happen on the latest version. Can you upgrade to the latest, or update your script manually?
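
For anyone puzzled by the TypeError in the traceback above: it is the kind of error Python raises when an instance method is called on the class itself rather than on an object. Below is a minimal sketch of that mechanism. The `ExitHelper` class here is a hypothetical stand-in written for illustration, not Ambari's actual code, and the exact wording of the message differs between Python 2 and Python 3.

```python
class ExitHelper(object):
    """Hypothetical stand-in for illustration; NOT Ambari's real class."""

    def __init__(self):
        self.cleaned_up = False

    def execute_cleanup(self):
        # An ordinary instance method: it needs a 'self' to operate on.
        self.cleaned_up = True
        return "cleanup done"


def call_on_class():
    """Call the method on the class, as the traceback shows, and report the error."""
    try:
        ExitHelper.execute_cleanup()  # no instance supplied
    except TypeError as exc:
        return "TypeError: %s" % exc
    return "no error"


def call_on_instance():
    """Call the same method on an object, which works."""
    helper = ExitHelper()
    return helper.execute_cleanup()


if __name__ == "__main__":
    print(call_on_class())
    print(call_on_instance())
```

Note that this error appears while the agent is already shutting down (it is raised from a cleanup call), so it may be hiding an earlier failure rather than being the root cause by itself.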

New Contributor

Check the agent .ini file on all the slaves/agents:

/etc/ambari-agent/conf/ambari-agent.ini

Then make sure the ports below are open on your MASTER:

url_port=8440

secured_url_port=8441
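
One quick way to test this from an agent node is a small script that tries to open a TCP connection to each port. This is only a sketch: the function names are made up for illustration, and the default host `localhost` is a placeholder you would replace with your master's hostname. On EC2, a closed port may also mean the security group is blocking it, not just the host firewall.

```python
import socket
import sys


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, unresolvable, or timed out.
        return False
    conn.close()
    return True


def check_ambari_ports(master_host, ports=(8440, 8441)):
    """Check the url_port and secured_url_port values mentioned above."""
    return dict((port, port_is_open(master_host, port)) for port in ports)


if __name__ == "__main__":
    # Placeholder default; pass the master's hostname as the first argument.
    master = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    for port, is_open in sorted(check_ambari_ports(master).items()):
        print("%s:%d %s" % (master, port, "open" if is_open else "NOT reachable"))
```

Run it on each agent node, for example `python check_ports.py <master-hostname>` (the file name is arbitrary).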