Extra nodes fail to register on Ambari 2.2 without error

avatar

I have one running host, but when I try to add additional nodes, the status keeps showing "failed". To make things worse, Ambari shows no log files. This is all it says:

========================== 
Creating target directory... 
========================== 
Command start time 2016-08-11 16:42:54

Can someone tell me where I can find what I did wrong? Or, more importantly, how can I fix this?

Thanks in advance.

Specs:

Ubuntu 14.04.5 LTS

Ambari 2.2.2.0

Java 1.8.0_101

Hostnames are valid FQDNs, all servers can connect to each other over SSH without a password, the firewall is disabled (I ran: sudo ufw disable), IPv6 has been disabled (in /etc/sysctl.conf), THP has been disabled, NTP is enabled, and the Java path (asked for when setting up the Ambari server) points to a custom Java 1.8 install.
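Since the setup depends on every hostname resolving correctly, a quick resolution check on each machine can help rule that out. The following is only a minimal sketch: the function name resolve_all and the example hostnames are hypothetical and not part of Ambari.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (failed lookup) is a subclass of OSError.
            results[name] = None
    return results


# Example (hypothetical hostnames, replace with your own FQDNs):
# for name, ip in resolve_all(["node1.example.com", "node2.example.com"]).items():
#     print(name, "->", ip if ip else "DOES NOT RESOLVE")
```

Running this on the server and on each node, with the same list of names, would show whether all machines agree on the addresses. It does not check reverse lookups or the contents of /etc/hosts.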

Please note that I'm a complete Linux and Hadoop noob.

1 ACCEPTED SOLUTION

avatar

Problem solved by reinstalling using CentOS 6.


7 REPLIES

avatar
Super Collaborator

@Nicolas De Paepe Can you please post the contents of /var/log/ambari-server/ambari-server.log? It should have adequate detail.

avatar

Note that I excluded some lines beginning with "at ..." (these were just lists of services). So is this probably a problem caused by a wrong hostname?

INFO:root:BootStrapping hosts ['ubuntuvm1.com', 'ubuntuvm2.com'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: ubuntu14 with user 'hduser' sshKey File /var/run/ambari-server/bootstrap/6/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/6 ambari: ubuntuvm.com; server_port: 8080; ambari version: 2.2.2.0; user__run_as: root
 
INFO:root:Executing parallel bootstrap
 
 
Bootstrap process timed out. It was destroyed.
11 Aug 2016 18:25:16,717  INFO [pool-15-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/6
11 Aug 2016 18:25:16,717  INFO [pool-15-thread-1] BSHostStatusCollector:62 - HostList for polling on [ubuntuvm1.com, ubuntuvm2.com]
11 Aug 2016 18:25:17,509 ERROR [qtp-ambari-client-181] AbstractResourceProvider:280 - Caught AmbariException when creating a resource
org.apache.ambari.server.HostNotFoundException: Host not found, hostname=
                at ..
 
 
11 Aug 2016 18:25:17,511 ERROR [qtp-ambari-client-181] BaseManagementHandler:57 - Caught a system exception while attempting to create a resource: An internal system exception occurred: Host not found, hostname =
org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Host not found, hostname=
        at org.apache.ambari.server.controller.internal.AbstractResourceProvider.createResources(AbstractResourceProvider.java:282)
                ...
 
 
Caused by: org.apache.ambari.server.HostNotFoundException: Host not found, hostname=
                at ..
 
11 Aug 2016 18:25:22,292  INFO [qtp-ambari-agent-314] HeartBeatHandler:309 - HeartBeatHandler.sendCommands: sending ExecutionCommand for host ubuntuvm.com, role check_host, roleCommand ACTIONEXECUTE, and command ID 88-0, task ID 646

Thanks for your quick reply!
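For anyone trimming a log the same way before posting it, the "at ..." lines can be dropped with a few lines of Python instead of by hand. This is a rough sketch; drop_stack_frames is a hypothetical helper name, and it assumes stack-frame lines start with "at " after any leading whitespace.

```python
def drop_stack_frames(lines):
    """Return the log lines, skipping Java stack-frame lines that start with 'at '."""
    return [line for line in lines if not line.lstrip().startswith("at ")]


# Example usage against the server log mentioned in this thread:
# with open("/var/log/ambari-server/ambari-server.log") as log:
#     for line in drop_stack_frames(log):
#         print(line, end="")
```

The exception lines and "Caused by:" lines are kept, which is usually what a reader needs first.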

avatar
Master Guru

@Nicolas De Paepe Are you running this on an IaaS provider? Which one? I have seen this caused by network latency. I also agree with @ssharma: please post your log, or go through it yourself. It should have the information you need.

avatar

I am only testing out Hadoop, using 2 separate laptops running virtual machines.

avatar
Super Collaborator

@Nicolas De Paepe Yes, this looks like an issue with a wrong hostname.

Can you confirm these:

1) From the Ambari server, you are able to ssh into the new host.

2) From the new host, you can telnet to the Ambari server host on port 8440,

i.e. telnet <ambari_server_host> 8440
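If telnet is not installed on the new host, the same reachability test can be sketched in Python. This is only an illustration: port_open is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed hostname lookups.
        return False


# Example, run from the new host (replace the placeholder with your server's FQDN):
# print(port_open("<ambari_server_host>", 8440))
```

A True result only shows that the port is reachable from that host. It says nothing about whether agent registration will succeed afterwards.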

avatar

1) First one confirmed (works on both hosts)

hduser@ubuntuVM:~$ ssh ubuntuvm2.com
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-31-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

 System information disabled due to load higher than 1.0

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.


Last login: Fri Aug 12 08:28:46 2016 from ubuntuvm.com
hduser@ubuntuvm2:~$

2) After a while, the connection is automatically interrupted ("closed by foreign host").

hduser@ubuntuVM2:~$ telnet ubuntuvm.com 8440
Trying 192.168.10.131...
Connected to ubuntuVM.com.
Escape character is '^]'.
Connection closed by foreign host.
hduser@ubuntuVM2:~$
