Created 08-15-2017 04:49 PM
We recently patched our Linux servers (RHEL 7.4). As part of the patching we upgraded the OpenSSL libraries (from 1.0.1e-60.el7_3.1.x86_64 to 1.0.2k-8.el7.x86_64). After completing the process, we saw a Heartbeat Lost message in the Ambari UI. When I ran the ambari-agent restart command, I got this message in the log file:
INFO 2017-08-13 09:04:31,873 NetUtil.py:62 - Connecting to https://servername.com:8440/ca
ERROR 2017-08-13 09:04:31,942 NetUtil.py:88 - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
ERROR 2017-08-13 09:04:31,942 NetUtil.py:89 - SSLError: Failed to connect. Please check openssl library versions. Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2017-08-13 09:04:31,943 NetUtil.py:116 - Server at https://servername.com:8440 is not reachable, sleeping for 10 seconds...
Prior to patching, we had everything configured properly. We are using Apache Ambari 2.4.2.0.
Is there any compatibility issue with OpenSSL and Ambari?
Thanks,
Darko
Created 08-15-2017 04:52 PM
Your issue looks similar to : https://community.hortonworks.com/questions/120861/ambari-agent-ssl-certificate-verify-failed-certif...
Please check whether you are using Python version "python-2.7.5" or higher. If so, that version is what causes the error below, and one option is to downgrade Python to a version lower than python-2.7.5:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
(OR)
Otherwise, follow the steps in the following doc to fix the "certificate verify failed (_ssl.c" issue on RHEL7: Controlling and troubleshooting certificate verification
https://access.redhat.com/articles/2039753#controlling-certificate-verification-7
Created 08-16-2017 01:26 PM
Created 08-16-2017 01:23 PM
Thank you for the quick response! That was the issue and we were able to fix it.
Created 09-20-2017 02:18 PM
Thank you so much @Jay SenSharma. The link solved my problem. I'm posting the fix here in the hope that it helps someone, since I was stuck on this for a day.
On all nodes, edit the file:
vi /etc/python/cert-verification.cfg
[https]
verify=disable
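To confirm what a node is actually set to after the edit, here is a minimal Python 3 sketch that parses the text of a cert-verification.cfg file and returns the [https] verify value. The function name read_https_verify and the "(not set)" fallback are illustrative assumptions, not part of any Red Hat or Ambari tooling.

```python
import configparser


def read_https_verify(cfg_text, default="(not set)"):
    """Return the [https] verify value found in cert-verification.cfg text.

    cfg_text is the file content as a string; default is returned when the
    section or the key is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("https", "verify", fallback=default).strip()


if __name__ == "__main__":
    # On a real node, read the file this thread refers to instead:
    #   open("/etc/python/cert-verification.cfg").read()
    sample = "[https]\nverify=disable\n"
    print(read_https_verify(sample))
```

Running it on every node is a quick way to spot a host where the change was missed.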
Created 05-03-2018 05:37 PM
I have tried the following, but the issue is not resolved yet. Any other ideas would be helpful.
==========================
Running setup agent script...
==========================
Command start time 2018-05-03 12:22:38
("INFO 2018-05-03 12:17:42,290 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-05-03 12:17:42,290 NetUtil.py:130 - Stop event received
INFO 2018-05-03 12:17:42,290 ExitHelper.py:56 - Performing cleanup before exiting...
INFO 2018-05-03 12:17:42,290 ExitHelper.py:70 - Cleanup finished, exiting with code:0
INFO 2018-05-03 12:17:42,337 main.py:283 - Agent died gracefully, exiting.
INFO 2018-05-03 12:17:42,338 ExitHelper.py:56 - Performing cleanup before exiting...
INFO 2018-05-03 12:22:40,287 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,288 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,288 main.py:145 - loglevel=logging.INFO
INFO 2018-05-03 12:22:40,289 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-03 12:22:40,291 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-03 12:22:40,291 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-03 12:22:40,291 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'ip-172-31-71-94.ec2.internal' using socket.getfqdn().
INFO 2018-05-03 12:22:40,297 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-03 12:22:40,299 main.py:437 - Connecting to Ambari server at https://ip-172-31-xx-94.ec2.internal:8440 (172.31.71.94)
INFO 2018-05-03 12:22:40,299 NetUtil.py:70 - Connecting to https://ip-172-31-xx-94.ec2.internal:8440/ca
ERROR 2018-05-03 12:22:40,313 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)
ERROR 2018-05-03 12:22:40,313 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2018-05-03 12:22:40,313 NetUtil.py:124 - Server at https://ip-172-31-xx-94.ec2.internal:8440 is not reachable, sleeping for 10 seconds...
", None)
Connection to ip-172-31-71-94.ec2.internal closed.
SSH command execution finished
host=ip-172-31-71-94.ec2.internal, exitcode=0
Command end time 2018-05-03 12:22:43
Registering with the server...
Registration with the server failed.
Created 05-08-2018 08:12 AM
We overcame the problem by adding the following option to the [security] section of ambari-agent.ini on all the hosts in the cluster:
[security]
force_https_protocol=PROTOCOL_TLSv1_2
Created 06-05-2018 04:28 PM
Thanks to @bing lv, we overcame this issue by adding the config below to the [security] section of /etc/ambari-agent/conf/ambari-agent.ini:
force_https_protocol=PROTOCOL_TLSv1_2
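The value has the same name as a constant in Python's ssl module (ssl.PROTOCOL_TLSv1_2), so a typo in it is easy to check for. The sketch below is Python 3 and assumes the agent looks the name up in the ssl module, which this thread does not confirm; the function name resolve_forced_protocol is hypothetical.

```python
import configparser
import ssl


def resolve_forced_protocol(ini_text):
    """Return (name, ssl constant) for [security] force_https_protocol.

    Returns None when the option is absent and raises ValueError when the
    value is not the name of a PROTOCOL_* constant in the local ssl module.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    name = parser.get("security", "force_https_protocol", fallback=None)
    if name is None:
        return None
    name = name.strip()
    if not name.startswith("PROTOCOL_") or not hasattr(ssl, name):
        raise ValueError("unknown ssl protocol constant: %r" % name)
    return name, getattr(ssl, name)


if __name__ == "__main__":
    sample = "[security]\nforce_https_protocol=PROTOCOL_TLSv1_2\n"
    print(resolve_forced_protocol(sample)[0])
```

A ValueError here would point at a misspelled value rather than at the server side.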
Created 07-04-2018 10:22 AM
You are welcome
Created 07-19-2018 01:40 PM
Thanks. Solved my problem.
Created 07-17-2018 06:53 AM
[ RESOLVED ]
We went through the same issue, but only when using oVirt virtualization for our cluster deployment.
Only the following solution resolved the problem (thanks to @bing lv and @Deven Fan):
Add the config below to the [security] section of the agent config:
vi /etc/ambari-agent/conf/ambari-agent.ini
force_https_protocol=PROTOCOL_TLSv1_2
and the following to the Python certificate verification config:
vi /etc/python/cert-verification.cfg
[https]
verify=disable
Created 07-28-2018 02:38 PM
I have the same issue on AWS servers. I'm going through the Ambari wizard and I always get a failed status. The error is the usual one:
ERROR 2018-07-28 14:12:35,131 NetUtil.py:88 - EOF occurred in violation of protocol (_ssl.c:579)
ERROR 2018-07-28 14:12:35,131 NetUtil.py:89 - SSLError: Failed to connect. Please check openssl library versions. Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2018-07-28 14:12:35,132 NetUtil.py:116 - Server at https://ip-172-31-0-xx.eu-west-1.compute.internal:8440 is not reachable, sleeping for 10 seconds...
', None)
('WARNING 2018-07-28 14:12:32,307 NetUtil.py:116 - Server at https://ip-172-31-0-xx.eu-west-1.compute.internal:8440 is not reachable, sleeping for 10 seconds...
INFO 2018-07-28 14:12:32,307 HeartbeatHandlers.py:115 - Stop event received
I've tried adding in /etc/python/cert-verification.cfg:
[https]
verify=disable
I've tried adding in /etc/ambari-agent/conf/ambari-agent.ini:
[security]
force_https_protocol=PROTOCOL_TLSv1_2
I've restarted the agents, still the same error 😞 Any ideas? 🙂
Created 07-30-2018 03:19 PM
Hello
I just added the two lines below under the security section and it works:
[security]
ssl_verify_cert=0
force_https_protocol=PROTOCOL_TLSv1_2
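If the same two lines have to go onto many hosts, the edit can be scripted. This is only a sketch in Python 3 using the standard configparser module: the helper name set_security_options is hypothetical, comments in the original file are dropped when it is rewritten, and a file with duplicate keys would be rejected, so keep a backup and try it on one host first.

```python
import configparser
import io


def set_security_options(ini_text, options):
    """Return ini_text with the given keys set in the [security] section.

    The section is created if it does not exist; other sections and keys
    are preserved, but comments are lost in the rewritten text.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("security"):
        parser.add_section("security")
    for key, value in options.items():
        parser.set("security", key, str(value))
    out = io.StringIO()
    # Write key=value without spaces, matching the style used in this thread.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    original = "[security]\nexisting_key=1\n"  # placeholder content
    print(set_security_options(original, {
        "ssl_verify_cert": 0,
        "force_https_protocol": "PROTOCOL_TLSv1_2",
    }))
```

The agent still needs a restart afterwards for the new settings to be read.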
Created 07-31-2018 01:58 PM
Ok for future users 🙂
Check whether the certificate is being served by the Ambari server, from one of the nodes:
openssl s_client -connect server_address:8440
Correct results (similar):
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIFnDCCA4SgAwIBAgIBATANBgkqhkiG9w0BAQsFADBCMQswCQYDVQQGEwJYWDEV
.................
.................
If you are not getting a correct handshake, check ambari.properties on the Ambari server:
vi /etc/ambari-server/conf/ambari.properties
and comment out (#) the line with the TLS ciphers 🙂
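The same handshake check can be done from Python if openssl is not handy on a node. This is a rough Python 3 sketch, not what the agent itself runs: probe_tls is a made-up name, certificate verification is switched off on purpose because only the handshake is being tested, and a recent Python/OpenSSL may refuse old protocol versions on its own, so a failure here is a hint rather than proof.

```python
import socket
import ssl


def probe_tls(host, port=8440, timeout=5.0):
    """Attempt a TLS handshake and return (protocol version, cipher name).

    Raises ssl.SSLError or another OSError when the connection or the
    handshake fails.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Verification is disabled deliberately: only the handshake is tested.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


if __name__ == "__main__":
    try:
        # server_address is a placeholder, as in the openssl command above.
        print(probe_tls("server_address", 8440))
    except OSError as exc:
        print("handshake failed:", exc)
```

If this fails the same way as the agent, the problem is more likely on the server side (for example the cipher line mentioned above) than in the agent config.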