Created 08-15-2017 04:49 PM
We recently patched our Linux servers (RHEL 7.4). As part of the patching we upgraded the OpenSSL libraries (from 1.0.1e-60.el7_3.1.x86_64 to 1.0.2k-8.el7.x86_64). After the process completed, we saw a Heartbeat Lost message in the Ambari UI. When I ran the ambari-agent restart command, I got this message in the log file:
INFO 2017-08-13 09:04:31,873 NetUtil.py:62 - Connecting to https://servername.com:8440/ca
ERROR 2017-08-13 09:04:31,942 NetUtil.py:88 - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
ERROR 2017-08-13 09:04:31,942 NetUtil.py:89 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2017-08-13 09:04:31,943 NetUtil.py:116 - Server at https://servername.com:8440 is not reachable, sleeping for 10 seconds...
Prior to patching, we had everything configured properly. We are using Apache Ambari 2.4.2.0.
Is there a known compatibility issue between OpenSSL and Ambari?
Thanks,
Darko
Created 08-15-2017 04:52 PM
Your issue looks similar to: https://community.hortonworks.com/questions/120861/ambari-agent-ssl-certificate-verify-failed-certif...
Please check whether you are using Python version "python-2.7.5" or higher. If so, one option is to downgrade Python to a version lower than python-2.7.5, since that version causes this error:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
(OR)
Otherwise, follow the steps in the following doc to fix the "certificate verify failed (_ssl.c" issue on RHEL7: Controlling and troubleshooting certificate verification
https://access.redhat.com/articles/2039753#controlling-certificate-verification-7
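Before picking either option, it may help to record what the affected host is actually running. Below is a minimal Python sketch, not anything shipped with Ambari; the function name `tls_environment` is made up for illustration. It reports the interpreter version, the OpenSSL build that Python's ssl module is linked against, and whether the TLS 1.2 protocol constant is available:

```python
import ssl
import sys


def tls_environment():
    """Collect the version details relevant to SSL handshake errors."""
    return {
        # Interpreter version, e.g. "2.7.5"
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        # OpenSSL build the ssl module is linked against
        "openssl": ssl.OPENSSL_VERSION,
        # True when this build exposes the TLS 1.2 protocol constant
        "has_tlsv1_2": hasattr(ssl, "PROTOCOL_TLSv1_2"),
    }


if __name__ == "__main__":
    for key, value in sorted(tls_environment().items()):
        print("%s: %s" % (key, value))
```

Running it with the same Python interpreter the agent uses, before and after patching, should show whether the OpenSSL build changed underneath it.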
Created 08-16-2017 01:26 PM
Created 08-16-2017 01:23 PM
Thank you for the quick response! That was the issue and we were able to fix it.
Created 09-20-2017 02:18 PM
Thank you so much @Jay SenSharma. The link solved my problem. I'm posting the fix here in the hope that it helps someone, since I was stuck on this for a day.
On all nodes, edit the file:
vi /etc/python/cert-verification.cfg
[https]
verify=disable
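To confirm the change landed on every node, something like the following could be run on each host. It is only a sketch: the helper name `https_verify_setting` is hypothetical, and it assumes the file is plain INI as shown above. Note that `verify=disable` turns off HTTPS certificate verification for Python's standard-library clients on that host, so it is worth knowing which nodes have it set.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2, as shipped with RHEL 7


def https_verify_setting(path="/etc/python/cert-verification.cfg"):
    """Return the [https] verify value from the file, or None if it is not set."""
    parser = configparser.RawConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if parser.has_option("https", "verify"):
        return parser.get("https", "verify").strip()
    return None


if __name__ == "__main__":
    print("verify = %s" % https_verify_setting())
```

A result of `None` means the option is not set in that file, or the file does not exist.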
Created 05-03-2018 05:37 PM
I have tried the following, but the issue is not resolved yet. Any other ideas would be helpful.
==========================
Running setup agent script...
==========================
Command start time 2018-05-03 12:22:38
("INFO 2018-05-03 12:17:42,290 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-05-03 12:17:42,290 NetUtil.py:130 - Stop event received
INFO 2018-05-03 12:17:42,290 ExitHelper.py:56 - Performing cleanup before exiting...
INFO 2018-05-03 12:17:42,290 ExitHelper.py:70 - Cleanup finished, exiting with code:0
INFO 2018-05-03 12:17:42,337 main.py:283 - Agent died gracefully, exiting.
INFO 2018-05-03 12:17:42,338 ExitHelper.py:56 - Performing cleanup before exiting...
INFO 2018-05-03 12:22:40,287 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,288 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,288 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,289 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-03 12:22:40,291 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-03 12:22:40,291 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-03 12:22:40,291 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'ip-172-31-71-94.ec2.internal' using socket.getfqdn().
INFO 2018-05-03 12:22:40,297 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-03 12:22:40,299 main.py:437 - Connecting to Ambari server at https://ip-172-31-xx-94.ec2.internal:8440 (172.31.71.94)
INFO 2018-05-03 12:22:40,299 NetUtil.py:70 - Connecting to https://ip-172-31-xx-94.ec2.internal:8440/ca
ERROR 2018-05-03 12:22:40,313 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)
ERROR 2018-05-03 12:22:40,313 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2018-05-03 12:22:40,313 NetUtil.py:124 - Server at https://ip-172-31-xx-94.ec2.internal:8440 is not reachable, sleeping for 10 seconds...
", None)
Connection to ip-172-31-71-94.ec2.internal closed.
SSH command execution finished
host=ip-172-31-71-94.ec2.internal, exitcode=0
Command end time 2018-05-03 12:22:43
Registering with the server...
Registration with the server failed.
Created 05-08-2018 08:12 AM
We overcame the problem by adding the following option to the security section of ambari-agent.ini on all hosts in the cluster:
[security]
force_https_protocol=PROTOCOL_TLSv1_2
Created 06-05-2018 04:28 PM
Thanks to @bing lv, we overcame this issue by adding the config below to the [security] section of /etc/ambari-agent/conf/ambari-agent.ini:
force_https_protocol=PROTOCOL_TLSv1_2
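For anyone rolling this out across many hosts, here is a rough Python sketch for checking the setting. The helper name `forced_protocol` is made up, and it assumes ambari-agent.ini parses as standard INI. It reads the option and also checks that the local ssl module actually exposes a constant with that name:

```python
import ssl

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2, as shipped with RHEL 7


def forced_protocol(path="/etc/ambari-agent/conf/ambari-agent.ini"):
    """Return (name, supported) for [security] force_https_protocol.

    name is None when the option is not set; supported tells whether the
    local ssl module has a constant with that name.
    """
    parser = configparser.RawConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_option("security", "force_https_protocol"):
        return None, False
    name = parser.get("security", "force_https_protocol").strip()
    return name, hasattr(ssl, name)


if __name__ == "__main__":
    print("force_https_protocol = %s, supported by ssl module: %s" % forced_protocol())
```

If `supported` comes back `False`, the Python build on that host probably does not offer the named protocol. The agent has to be restarted after editing the file for the change to take effect.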
Created 07-04-2018 10:22 AM
You are welcome
Created 07-19-2018 01:40 PM
Thanks. Solved my problem.