Created 08-11-2016 04:02 PM
I have one running host, but when trying to add additional nodes, the status keeps saying "failed". To make things worse, there are no log files in Ambari. This is all it says:
==========================
Creating target directory...
==========================
Command start time 2016-08-11 16:42:54
Can someone tell me where I can find out what I did wrong? Or, more importantly, how can I fix this?
Thanks in advance.
Specs:
Ubuntu 14.04.5 LTS
Ambari 2.2.2.0
Java 1.8.0_101
Hostnames are valid FQDNs, all servers can connect to each other over SSH without a password, the firewall is disabled (I ran: sudo ufw disable), IPv6 has been disabled (in /etc/sysctl.conf), THP has been disabled, NTP is enabled, and the Java path (asked for when setting up the Ambari server) points to a custom Java 1.8 installation.
Please note that I'm a complete Linux and Hadoop noob.
Created 08-19-2016 12:58 PM
Problem solved by reinstalling using CentOS 6.
Created 08-11-2016 04:13 PM
@Nicolas De Paepe Can you please post the contents of /var/log/ambari-server/ambari-server.log? It should have enough detail.
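If the full log is too long to post, a small filter can pull out just the ERROR entries. The following is a minimal Python sketch, assuming the log uses timestamped entries of the form "11 Aug 2016 18:25:17,509 ERROR [thread] ..." as in typical ambari-server.log output; the function name error_entries and the context parameter are hypothetical, chosen only for illustration.

```python
import re

# Assumed entry format: "11 Aug 2016 18:25:17,509 ERROR [thread] Class:line - message"
ENTRY = re.compile(r"^\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3}\s+(\w+)\s")


def error_entries(lines, context=2):
    """Yield each ERROR entry as a list of lines.

    Up to `context` following lines that do not start a new timestamped
    entry (typically the exception message) are kept with the entry.
    """
    lines = list(lines)
    for i, line in enumerate(lines):
        match = ENTRY.match(line)
        if match and match.group(1) == "ERROR":
            block = [line.rstrip("\n")]
            for follow in lines[i + 1 : i + 1 + context]:
                if ENTRY.match(follow):
                    break  # next log entry starts here
                block.append(follow.rstrip("\n"))
            yield block


# Example use on the Ambari server host (path taken from the post above):
#   with open("/var/log/ambari-server/ambari-server.log") as f:
#       for block in error_entries(f):
#           print("\n".join(block))
```

This only narrows down what to read; the lines around each ERROR entry may still be needed to understand the cause.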
Created 08-11-2016 05:29 PM
Note that I excluded some lines beginning with "at ..." (these were just lists of services). So this is probably a problem caused by a wrong hostname?
INFO:root:BootStrapping hosts ['ubuntuvm1.com', 'ubuntuvm2.com'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: ubuntu14 with user 'hduser' sshKey File /var/run/ambari-server/bootstrap/6/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/6 ambari: ubuntuvm.com; server_port: 8080; ambari version: 2.2.2.0; user__run_as: root
INFO:root:Executing parallel bootstrap
Bootstrap process timed out. It was destroyed.
11 Aug 2016 18:25:16,717 INFO [pool-15-thread-1] BSHostStatusCollector:55 - Request directory /var/run/ambari-server/bootstrap/6
11 Aug 2016 18:25:16,717 INFO [pool-15-thread-1] BSHostStatusCollector:62 - HostList for polling on [ubuntuvm1.com, ubuntuvm2.com]
11 Aug 2016 18:25:17,509 ERROR [qtp-ambari-client-181] AbstractResourceProvider:280 - Caught AmbariException when creating a resource
org.apache.ambari.server.HostNotFoundException: Host not found, hostname=
at ..
11 Aug 2016 18:25:17,511 ERROR [qtp-ambari-client-181] BaseManagementHandler:57 - Caught a system exception while attempting to create a resource: An internal system exception occurred: Host not found, hostname =
org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Host not found, hostname=
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.createResources(AbstractResourceProvider.java:282)
...
Caused by: org.apache.ambari.server.HostNotFoundException: Host not found, hostname=
at ..
11 Aug 2016 18:25:22,292 INFO [qtp-ambari-agent-314] HeartBeatHandler:309 - HeartBeatHandler.sendCommands: sending ExecutionCommand for host ubuntuvm.com, role check_host, roleCommand ACTIONEXECUTE, and command ID 88-0, task ID 646
Thanks for your quick reply!
Created 08-11-2016 04:41 PM
@Nicolas De Paepe Are you running this on an IaaS provider? Which one? I have seen this happen due to network latency issues. I also agree with @ssharma: please post your log or go through it yourself. It should have the relevant info in it.
Created 08-11-2016 05:29 PM
I am only testing out Hadoop, using 2 separate laptops running virtual machines.
Thanks for your quick reply!
Created 08-11-2016 05:45 PM
@Nicolas De Paepe Yes, it looks like this is an issue with a wrong hostname.
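One way to look at the hostname side on each machine is sketched below in Python. It only checks two things: whether a name contains a domain part, and whether the local resolver (which normally includes /etc/hosts) can turn it into an IP address. The function name describe_host and its resolve parameter are hypothetical, introduced for illustration; this is not how Ambari itself validates hosts.

```python
import socket


def describe_host(name, resolve=socket.gethostbyname):
    """Return (looks_like_fqdn, ip_or_None) for a hostname.

    looks_like_fqdn is a purely syntactic check (the name contains a dot).
    ip_or_None is the address the resolver returns, or None on failure.
    """
    looks_like_fqdn = "." in name.strip(".")
    try:
        ip = resolve(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        ip = None
    return looks_like_fqdn, ip
```

Calling describe_host(socket.getfqdn()) on each machine, and describe_host("ubuntuvm1.com") and describe_host("ubuntuvm2.com") on the Ambari server, would show whether the names from the log resolve consistently everywhere. A None in the second position suggests a missing or mismatched /etc/hosts or DNS entry.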
Can you confirm these:
1) From the Ambari server, you are able to ssh into the new host.
2) From the new host, try to telnet to the Ambari server host on port 8440,
i.e. telnet <ambari_server_host> 8440
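If telnet is not installed on the new host, the second check can be approximated with a few lines of Python. This is a minimal sketch using only the standard library; the function name can_connect and the 5-second default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name could not be resolved
        return False
```

On the new host this would be called as can_connect("<ambari_server_host>", 8440). A True result only shows that the port is reachable over TCP, the same as telnet reporting "Connected"; it says nothing about whether the agent can register with the server.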
Created 08-12-2016 06:49 AM
1) First one confirmed (works on both hosts)
hduser@ubuntuVM:~$ ssh ubuntuvm2.com
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-31-generic x86_64)
 * Documentation: https://help.ubuntu.com/
System information disabled due to load higher than 1.0
New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Fri Aug 12 08:28:46 2016 from ubuntuvm.com
hduser@ubuntuvm2:~$
2) After a while, the connection is automatically interrupted ("closed by foreign host").
hduser@ubuntuVM2:~$ telnet ubuntuvm.com 8440
Trying 192.168.10.131...
Connected to ubuntuVM.com.
Escape character is '^]'.
Connection closed by foreign host.
hduser@ubuntuVM2:~$