Created 11-16-2018 10:03 AM
1. When I add a new host to the cluster, registration fails with the output below. As you can see, the log contains no errors!
2. There are no errors in ambari-server.log or ambari-agent.log.
3. I tried to add other hosts, and all of them failed.
4. The firewall, iptables, and SELinux are all disabled. I use RHEL 7.3 and Python 2.7.5.
5. When I add this physical machine to a cluster whose Ambari server runs on a virtual machine, it registers successfully. But when I add it to a cluster whose Ambari server runs on a physical machine (configured as shown below), registration fails.
6. The physical machine configuration:
CPU
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 112
On-line CPU(s) list: 0-111
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 4
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E7-4830 v4 @ 2.00GHz
Stepping: 1
CPU MHz: 1995.140
BogoMIPS: 3996.78
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-13,56-69
NUMA node1 CPU(s): 14-27,70-83
NUMA node2 CPU(s): 28-41,84-97
NUMA node3 CPU(s): 42-55,98-111
Mem
free -g
       total  used  free  shared  buff/cache  available
Mem:     503     4   497       0           1        498
Swap:     31     0    31
7. The virtual machine configuration:
CPU
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Model name: Intel Xeon E312xx (Sandy Bridge)
Stepping: 1
CPU MHz: 2199.998
BogoMIPS: 4399.99
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
L3 cache: 16384K
NUMA node0 CPU(s): 0-15
Mem
free -g
       total  used  free  shared  buff/cache  available
Mem:      31     6     9       0          14         23
Swap:      3     0     3
8. ambari-agent.log
==========================
Creating target directory...
==========================
Command start time 2018-11-08 08:54:23
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Copying ambari sudo script...
==========================
Command start time 2018-11-08 08:54:23
scp /var/lib/ambari-server/ambari-sudo.sh
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Copying common functions script...
==========================
Command start time 2018-11-08 08:54:23
scp /usr/lib/python2.6/site-packages/ambari_commons
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Copying OS type check script...
==========================
Command start time 2018-11-08 08:54:23
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Running OS type check...
==========================
Command start time 2018-11-08 08:54:23
Cluster primary/cluster OS family is redhat7 and local/current OS family is redhat7
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Checking 'sudo' package on remote host...
==========================
Command start time 2018-11-08 08:54:23
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Copying repo file to 'tmp' folder...
==========================
Command start time 2018-11-08 08:54:23
scp /etc/yum.repos.d/ambari.repo
host=server1, exitcode=0
Command end time 2018-11-08 08:54:23
==========================
Moving file to repo dir...
==========================
Command start time 2018-11-08 08:54:23
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:24
==========================
Changing permissions for ambari.repo...
==========================
Command start time 2018-11-08 08:54:24
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:24
==========================
Copying setup script file...
==========================
Command start time 2018-11-08 08:54:24
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=server1, exitcode=0
Command end time 2018-11-08 08:54:24
==========================
Running setup agent script...
==========================
Command start time 2018-11-08 08:54:24
('INFO 2018-11-08 08:53:32,858 DataCleaner.py:120 - Data cleanup started
INFO 2018-11-08 08:53:32,860 DataCleaner.py:122 - Data cleanup finished
INFO 2018-11-08 08:53:32,909 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-11-08 08:53:32,911 main.py:349 - Connecting to Ambari server at https://server1:8440 (10.212.155.20)
INFO 2018-11-08 08:53:32,911 NetUtil.py:62 - Connecting to https://server1:8440/ca
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,052 HeartbeatHandlers.py:83 - Ambari-agent received 15 signal, stopping...
INFO 2018-11-08 08:54:56,162 main.py:226 - Agent not going to die gracefully, going to execute kill -9
INFO 2018-11-08 08:54:56,170 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,606 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-11-08 08:54:56,608 DataCleaner.py:120 - Data cleanup started
INFO 2018-11-08 08:54:56,608 DataCleaner.py:122 - Data cleanup finished
INFO 2018-11-08 08:54:56,660 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-11-08 08:54:56,662 main.py:349 - Connecting to Ambari server at https://server1:8440 (10.212.155.20)
INFO 2018-11-08 08:54:56,662 NetUtil.py:62 - Connecting to https://server1:8440/ca
', None)
('INFO 2018-11-08 08:53:32,858 DataCleaner.py:120 - Data cleanup started
INFO 2018-11-08 08:53:32,860 DataCleaner.py:122 - Data cleanup finished
INFO 2018-11-08 08:53:32,909 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-11-08 08:53:32,911 main.py:349 - Connecting to Ambari server at https://server1:8440 (10.212.155.20)
INFO 2018-11-08 08:53:32,911 NetUtil.py:62 - Connecting to https://server1:8440/ca
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,046 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:26,052 HeartbeatHandlers.py:83 - Ambari-agent received 15 signal, stopping...
INFO 2018-11-08 08:54:56,162 main.py:226 - Agent not going to die gracefully, going to execute kill -9
INFO 2018-11-08 08:54:56,170 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,604 main.py:90 - loglevel=logging.INFO
INFO 2018-11-08 08:54:56,606 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-11-08 08:54:56,608 DataCleaner.py:120 - Data cleanup started
INFO 2018-11-08 08:54:56,608 DataCleaner.py:122 - Data cleanup finished
INFO 2018-11-08 08:54:56,660 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-11-08 08:54:56,662 main.py:349 - Connecting to Ambari server at https://server1:8440 (10.212.155.20)
INFO 2018-11-08 08:54:56,662 NetUtil.py:62 - Connecting to https://server1:8440/ca
', None)
Connection to server1 closed.
SSH command execution finished
host=server1, exitcode=0
Command end time 2018-11-08 08:54:59
Registering with the server...
Registration with the server failed.
If anyone has run into the same problem, I hope you can help me. Thank you very much!
Created 11-16-2018 10:07 AM
Your agent log repeatedly shows the following line ("Connecting to"), but it never shows "Connected":
NetUtil.py:62 - Connecting to https://server1:8440/ca
Ideally, the "Connected to Ambari server server1" line should appear immediately after it, as in the following, but that is not happening:
NetUtil.py:62 - Connecting to https://server1:8440/ca
main.py:447 - Connected to Ambari server server1
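The same check can be scripted. The following is a minimal sketch, assuming the agent log uses the two message formats quoted in this thread; the function name `unanswered_connects` and the regular expressions are illustrative and not part of Ambari.

```python
import re

# Patterns modelled on the two log messages quoted in this thread.
CONNECTING = re.compile(r"NetUtil\.py:\d+ - Connecting to (\S+)")
CONNECTED = re.compile(r"main\.py:\d+ - Connected to Ambari server (\S+)")


def unanswered_connects(log_lines):
    """Return the URLs of 'Connecting to' attempts with no later 'Connected to' line."""
    pending = []
    for line in log_lines:
        match = CONNECTING.search(line)
        if match:
            pending.append(match.group(1))
        elif CONNECTED.search(line):
            # A successful connection settles every attempt seen so far.
            del pending[:]
    return pending
```

To use it, pass the lines of the agent log (for example, the result of `open(path).read().splitlines()`); a non-empty result means the agent started connecting but never reported success.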
So please check whether the Ambari agent host is able to resolve "server1".
Please try the following steps to verify access to the Ambari server from the client machine.
1. Check whether port 8440 on the Ambari server host is accessible.
# telnet server1 8440
(OR)
# nc -v server1 8440
2. Check that the firewall is disabled on the Ambari server.
3. From the agent host, check whether you are able to access the Ambari certs:
# openssl s_client -connect server1:8440
4. Verify that the correct SSL protocol is being used in the above output.
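If telnet, nc, or openssl are not installed on the agent host, roughly the same checks can be done from Python. This is only a sketch, assuming Python 3 (the RHEL 7.3 system Python 2.7.5 would need small changes); the helper names are made up for illustration, and certificate verification is turned off on purpose because the goal is just to see whether a TLS handshake completes and which protocol is negotiated.

```python
import socket
import ssl


def can_resolve(host):
    """True if the host name resolves to at least one address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host, port, timeout=5.0):
    """True if a plain TCP connection to host:port succeeds (like telnet/nc)."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True


def tls_version(host, port, timeout=5.0):
    """Return the negotiated TLS protocol name, or None if the handshake fails."""
    context = ssl.create_default_context()
    # Diagnostic only: skip certificate checks so self-signed certs do not abort.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except OSError:  # ssl.SSLError and socket timeouts are OSError subclasses
        return None
```

For the host in this thread, the calls would be `can_resolve("server1")`, `can_connect("server1", 8440)`, and `tls_version("server1", 8440)`. A `None` from the last call while the TCP check succeeds points at the SSL layer rather than at the network.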
If you see any error related to agent SSL, please refer to the following article:
Created 11-19-2018 12:44 AM
1. Actually, the Ambari server and the agent are on the same machine, so it must be able to resolve the name server1.
2. A telnet test shows that port 8840 can be connected to.
3. The firewall service has been disabled; I did that myself.
4. I added the OpenSSL configuration to ambari-agent.ini, but it still failed for the same reason: it never showed "Connected to server1".
Created 12-03-2018 07:40 AM
agent.threadpool.size.max=120
With this setting, it works.
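The post does not say where the property was set. In Ambari it is normally a server-side setting in ambari.properties (commonly /etc/ambari-server/conf/ambari.properties), followed by a server restart; treat that location as an assumption and check your own installation. The physical server has 112 CPUs against 16 on the virtual machine, which may be why a larger thread pool helped, though nothing in this thread confirms that link. A small sketch for checking the current value, with the helper name `read_property` made up for illustration:

```python
def read_property(text, key):
    """Return the value of `key` from Java-properties-style text, or None if absent."""
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()  # if the key is repeated, the last one wins
    return value
```

For example, `read_property(open(path).read(), "agent.threadpool.size.max")` returns the configured value as a string, or `None` when the key is not set and the default applies.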