Created 02-13-2017 02:05 PM
Hi,
I tried to install an Ambari server on a local machine (SUSE Linux Enterprise Server) with several VMs. I just followed the Quick Start guide. So far I have created two VMs (suse1101 and suse1102) and installed ambari-server on suse1101. But when I try to add the agent to suse1101 and suse1102 via the installation wizard, it fails. There is no error message.
==========================
Creating target directory...
==========================
Command start time 2017-02-13 14:41:25
Connection to suse1101.ambari.apache.org closed.
SSH command execution finished
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Copying common functions script...
==========================
Command start time 2017-02-13 14:41:26
scp /usr/lib/python2.6/site-packages/ambari_commons
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Copying OS type check script...
==========================
Command start time 2017-02-13 14:41:26
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Running OS type check...
==========================
Command start time 2017-02-13 14:41:26
Cluster primary/cluster OS family is suse11 and local/current OS family is suse11
Connection to suse1101.ambari.apache.org closed.
SSH command execution finished
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Checking 'sudo' package on remote host...
==========================
Command start time 2017-02-13 14:41:26
sudo-1.6.9p17-21.3.1
Connection to suse1101.ambari.apache.org closed.
SSH command execution finished
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Copying repo file to 'tmp' folder...
==========================
Command start time 2017-02-13 14:41:26
scp /etc/zypp/repos.d/ambari.repo
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Moving file to repo dir...
==========================
Command start time 2017-02-13 14:41:26
Connection to suse1101.ambari.apache.org closed.
SSH command execution finished
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Changing permissions for ambari.repo...
==========================
Command start time 2017-02-13 14:41:26
Connection to suse1101.ambari.apache.org closed.
SSH command execution finished
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Copying setup script file...
==========================
Command start time 2017-02-13 14:41:26
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=suse1101.ambari.apache.org, exitcode=0
Command end time 2017-02-13 14:41:26

==========================
Running setup agent script...
==========================
Command start time 2017-02-13 14:41:26
I should mention that I've configured a proxy in /etc/sysconfig/proxy and added the IP to the hosts file. I also added the IP addresses of the VMs, etc.
Since I don't know why this happens, I tried to install the agent manually on suse1101 (which is also the VM where ambari-server is running). I configured /etc/ambari-agent/conf/ambari-agent.ini like this:
[server]
hostname=127.0.0.1
url_port=8440
secured_url_port=8441
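To double-check which server address and ports the agent will actually pick up, the `[server]` section can be read back with Python's standard `configparser`. This is only a sketch for inspecting the file: the helper name `read_server_section` and the `SAMPLE_INI` text are illustrative, and it is not how the Ambari agent itself loads its configuration.

```python
import configparser

# Sample text mirroring the [server] section quoted in the post.
SAMPLE_INI = """\
[server]
hostname=127.0.0.1
url_port=8440
secured_url_port=8441
"""


def read_server_section(ini_text):
    """Return (hostname, url_port, secured_url_port) from ini text."""
    # Interpolation is disabled so '%' characters elsewhere in a real
    # ambari-agent.ini cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    server = parser["server"]
    return (
        server["hostname"],
        server.getint("url_port"),
        server.getint("secured_url_port"),
    )


if __name__ == "__main__":
    print(read_server_section(SAMPLE_INI))
```

To inspect the real file, the contents of /etc/ambari-agent/conf/ambari-agent.ini can be read with `open(...).read()` and passed to the same function.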
If I start the agent I can access several logs:
/var/log/ambari-agent/ambari-agent.log
INFO 2017-02-13 14:33:54,507 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 14:33:54,507 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 14:33:54,508 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-02-13 14:33:54,509 DataCleaner.py:120 - Data cleanup started
INFO 2017-02-13 14:33:54,509 DataCleaner.py:122 - Data cleanup finished
INFO 2017-02-13 14:33:54,529 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-02-13 14:33:54,531 main.py:289 - Connecting to Ambari server at https://localhost:8440 (127.0.0.1)
INFO 2017-02-13 14:33:54,531 NetUtil.py:60 - Connecting to https://localhost:8440/ca
INFO 2017-02-13 14:33:54,806 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2017-02-13 14:33:54,806 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2017-02-13 14:33:54,806 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0xe18850>; currently running: False
INFO 2017-02-13 14:33:56,826 hostname.py:89 - Read public hostname 'suse1101.ambari.apache.org' using socket.getfqdn()
INFO 2017-02-13 14:33:56,916 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2017-02-13 14:33:56,916 ExitHelper.py:67 - Cleanup finished, exiting with code:0
INFO 2017-02-13 14:47:29,614 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 14:47:29,614 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 14:47:29,615 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-02-13 14:47:29,616 DataCleaner.py:120 - Data cleanup started
INFO 2017-02-13 14:47:29,616 DataCleaner.py:122 - Data cleanup finished
INFO 2017-02-13 14:47:29,638 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-02-13 14:47:29,640 main.py:289 - Connecting to Ambari server at https://127.0.0.1:8440 (127.0.0.1)
INFO 2017-02-13 14:47:29,640 NetUtil.py:60 - Connecting to https://127.0.0.1:8440/ca
INFO 2017-02-13 14:47:29,727 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2017-02-13 14:47:29,727 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2017-02-13 14:47:29,727 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0xe17890>; currently running: False
INFO 2017-02-13 14:47:31,731 hostname.py:89 - Read public hostname 'suse1101.ambari.apache.org' using socket.getfqdn()
INFO 2017-02-13 14:47:31,831 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2017-02-13 14:47:31,831 ExitHelper.py:67 - Cleanup finished, exiting with code:0
/var/log/ambari-agent/ambari-agent.out
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 374, in run
    self.register = Register(self.config)
  File "/usr/lib/python2.6/site-packages/ambari_agent/Register.py", line 34, in __init__
    self.hardware = Hardware()
  File "/usr/lib/python2.6/site-packages/ambari_agent/Hardware.py", line 43, in __init__
    self.hardware['mounts'] = Hardware.osdisks()
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/ambari_agent/Hardware.py", line 91, in osdisks
    df = subprocess.Popen(command, stdout=subprocess.PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 595, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.6/subprocess.py", line 1106, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
I don't know how to fix this, since I'm quite new to Linux and Ambari.
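The traceback above ends in `OSError: [Errno 2]` raised from `subprocess.Popen`, which Python reports when the executable it was asked to start cannot be found. The traceback does not show the value of `command`, so the sketch below only illustrates how to check whether candidate executables are on the PATH and how the same error arises. The helper names are hypothetical, and the candidates `df` and `timeout` are assumptions (the variable in the traceback is named `df`; `timeout` is a guess at a wrapper command), not something the log confirms. The PATH search is written by hand so the script should also run on older Python versions.

```python
import errno
import os
import subprocess


def is_on_path(cmd):
    """Return True if an executable named cmd exists in a PATH directory."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, cmd)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return True
    return False


def find_missing(commands):
    """Return the commands that are not found on PATH."""
    return [cmd for cmd in commands if not is_on_path(cmd)]


def popen_errno(argv):
    """Start argv; return None on success, or the errno of the OSError."""
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    except OSError as exc:
        return exc.errno
    proc.communicate()
    return None


if __name__ == "__main__":
    # Assumed candidates; the real command is not visible in the traceback.
    print(find_missing(["df", "timeout"]))
    # A missing executable yields errno.ENOENT (2), as in the traceback.
    print(popen_errno(["no-such-command-for-demo"]) == errno.ENOENT)
```

If the first line printed is a non-empty list, the named tools are missing from the agent host's PATH, which would be one possible explanation for the error.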
Created 02-14-2017 02:45 PM
Fixed it by using CentOS 7.0 instead of SUSE VMs. It seems something is broken with the SUSE VMs.
Created 02-13-2017 02:09 PM
Try modifying ambari-agent.ini to use "hostname=<ambari-server hostname/ipaddress>" instead of 127.0.0.1.
Restart the Ambari agent and let me know if it works.
Make sure iptables and SELinux are disabled.
Also make sure you can ssh from the ambari-server host to both ambari-agent nodes without a password.
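One way to verify the hostname and firewall advice above is to test from the agent host whether the server's ports accept TCP connections. The sketch below uses only the standard `socket` module; the function names are illustrative, and the default ports 8440 and 8441 are taken from the ambari-agent.ini quoted in the question.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, timeouts, and failed name lookups.
        return False
    sock.close()
    return True


def check_server(host, ports=(8440, 8441)):
    """Map each port to whether it is reachable on host."""
    return dict((port, can_connect(host, port)) for port in ports)
```

Calling `check_server("suse1101.ambari.apache.org")` on each agent node returns a dictionary keyed by port. A `False` entry suggests a name resolution, firewall, or server startup problem rather than an agent misconfiguration.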
Created 02-13-2017 02:25 PM
I have no firewall on this VM. I configured ambari-agent.ini as follows:
[server]
hostname=suse1101.ambari.apache.org
url_port=8440
secured_url_port=8441
ambari-agent.log
INFO 2017-02-13 15:18:51,558 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 15:18:51,558 main.py:74 - loglevel=logging.INFO
INFO 2017-02-13 15:18:51,559 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-02-13 15:18:51,561 DataCleaner.py:120 - Data cleanup started
INFO 2017-02-13 15:18:51,561 DataCleaner.py:122 - Data cleanup finished
INFO 2017-02-13 15:18:51,579 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-02-13 15:18:51,580 main.py:289 - Connecting to Ambari server at https://suse1101.ambari.apache.org:8440 (192.168.11.101)
INFO 2017-02-13 15:18:51,581 NetUtil.py:60 - Connecting to https://suse1101.ambari.apache.org:8440
INFO 2017-02-13 15:18:51,691 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2017-02-13 15:18:51,691 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2017-02-13 15:18:51,691 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0xe15950>; currently running: False
INFO 2017-02-13 15:18:53,697 hostname.py:89 - Read public hostname 'suse1101.ambari.apache.org' using socket.getfqdn()
INFO 2017-02-13 15:18:53,794 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2017-02-13 15:18:53,795 ExitHelper.py:67 - Cleanup finished, exiting with code:0
ambari-agent.out is still the same as mentioned above.
Note that suse1101.ambari.apache.org runs the ambari-server and an agent.
Created 02-13-2017 05:46 PM
What is your Ambari server system configuration? How many CPUs, and how much RAM? Can you increase the "agent.threadpool.size.max" value to, say, 100, restart ambari-server, and check?
Created 02-14-2017 09:35 AM
I actually destroyed my VMs via Vagrant and am trying to install everything from scratch again. I configured Vagrant to use 4 GB of RAM. I haven't configured the number of CPUs, since it wasn't in the Vagrantfile.