Created 07-27-2018 10:02 AM
Hi all,
Muthu here. I installed HDP 2.6.3 yesterday. When I started the server today, Ambari could not get a heartbeat from the services, and I could not start any of them.
[infa@bdm ~]$ ll /usr/hdp/current/
lrwxrwxrwx. 1 root root 29 Jul 26 20:44 accumulo-client -> /usr/hdp/2.6.3.0-235/accumulo
lrwxrwxrwx. 1 root root 29 Jul 26 20:44 accumulo-gc -> /usr/hdp/2.6.3.0-235/accumulo
When I checked the ambari-agent log, it showed the errors below:
INFO 2018-07-27 12:44:34,974 NetUtil.py:70 - Connecting to https://bdm.localdomain:8440/ca
ERROR 2018-07-27 12:44:34,977 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)
ERROR 2018-07-27 12:44:34,977 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions.
According to the HDP 2.6 installation guide, OpenSSL should be v1.0.1 or later.
But the version I have on RHEL 7.2 is:
[infa@bdm ~]$ openssl version
OpenSSL 1.0.2k-fips  26 Jan 2017
[infa@bdm ~]$
As I understand it, the server has a newer version than what is required. Any clue about what I am missing here?
Thanks in advance
Muthu
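The version comparison in the question can be checked mechanically. The following is a minimal sketch, not something from the HDP guide: version_at_least is a hypothetical helper name, and it assumes GNU sort with the -V (version sort) option. Because the agent is a Python process (the errors come from NetUtil.py), the commented commands also show how to print the OpenSSL library that Python itself is linked against and how to attempt a TLS 1.2 handshake against the port from the log.

```shell
# version_at_least HAVE WANT
# Succeeds when HAVE sorts at or above WANT under GNU "sort -V".
# Hypothetical helper; assumes GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (commented out so that sourcing this file touches nothing):
#   have=$(openssl version | awk '{print $2}')    # e.g. 1.0.2k-fips
#   version_at_least "$have" 1.0.1 && echo "meets the documented minimum"
#
#   # OpenSSL library that the Python interpreter is built against:
#   python -c 'import ssl; print(ssl.OPENSSL_VERSION)'
#
#   # Try a TLS 1.2 handshake with the server port seen in the agent log:
#   openssl s_client -connect bdm.localdomain:8440 -tls1_2 </dev/null
```

If the version check passes but the handshake still fails, the protocol negotiated between agent and server is a more likely suspect than the OpenSSL release itself.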
Created 07-27-2018 10:12 AM
Try adding force_https_protocol=PROTOCOL_TLSv1_2 to /etc/ambari-agent/conf/ambari-agent.ini
This KB covers the likely root cause of your error: https://community.hortonworks.com/articles/188269/javapython-updates-and-ambari-agent-tls-settings.h...
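A minimal sketch of that edit, under two assumptions that are not stated in the reply: that the setting belongs in the [security] section of the file, and that GNU sed is available. set_force_tls is a hypothetical helper name. It updates the line if it already exists, adds it under [security] if the section exists, and otherwise appends the section.

```shell
# set_force_tls INI_FILE
# Ensures force_https_protocol=PROTOCOL_TLSv1_2 is present exactly once.
# Assumption: the key lives in the [security] section. Requires GNU sed.
set_force_tls() {
    ini="$1"
    setting='force_https_protocol=PROTOCOL_TLSv1_2'
    if grep -q '^force_https_protocol=' "$ini"; then
        # Key already present: overwrite its value in place.
        sed -i "s/^force_https_protocol=.*/$setting/" "$ini"
    elif grep -q '^\[security\]' "$ini"; then
        # Section present: append the key right after the section header.
        sed -i "/^\[security\]/a $setting" "$ini"
    else
        # No section at all: add one at the end of the file.
        printf '\n[security]\n%s\n' "$setting" >> "$ini"
    fi
}

# Example usage (commented out; back up the file first):
#   cp -p /etc/ambari-agent/conf/ambari-agent.ini /etc/ambari-agent/conf/ambari-agent.ini.bak
#   set_force_tls /etc/ambari-agent/conf/ambari-agent.ini
#   ambari-agent restart
```

The agent reads its configuration at startup, so the change should only take effect after the agent is restarted.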
Created 07-31-2018 10:08 AM
Hi Jonathan,
As suggested, I made the change in ambari-agent.ini:
force_https_protocol=PROTOCOL_TLSv1_2
The NameNode service is not coming up and throws the error below.
Any suggestions? Thanks in advance.
tail -f /var/lib/ambari-agent/data/output-266.txt
stdout:
2018-07-31 15:27:33,705 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-07-31 15:27:46,236 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-07-31 15:27:59,073 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
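The retries in this log are Ambari running the safe mode query and grepping for 'Safe mode is OFF'. The same check can be run by hand. This is a minimal sketch with hypothetical helper names (safemode_is_off, wait_for_safemode_off); the attempt count and delay in the usage line are arbitrary illustration values.

```shell
# Reads "hdfs dfsadmin -safemode get" output on stdin.
# Succeeds only when the output reports that safe mode is off.
safemode_is_off() {
    grep -q 'Safe mode is OFF'
}

# wait_for_safemode_off ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS between tries,
# and succeeds as soon as its output reports safe mode off.
wait_for_safemode_off() {
    attempts="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" 2>/dev/null | safemode_is_off; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (commented out; needs a running NameNode):
#   wait_for_safemode_off 30 10 \
#       hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get
```

hdfs dfsadmin also accepts "-safemode wait", which blocks until the NameNode leaves safe mode on its own; the loop above is only useful when a bounded wait is wanted.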
Created 07-31-2018 01:40 PM
The latest update from my end:
What I observed in my cluster is that the start fails partway through the process. The core issues are:
1. The NN comes up after 25 minutes, in safe mode.
2. Since the NN comes up in safe mode, the YARN timeline server is not able to come up, and the startup process aborts.
3. Once I manually issue "hdfs dfsadmin -safemode leave", I am able to start the other services manually, including the YARN timeline server, except the HBase master.
Any suggestions?
Thx
Muthu
Created 08-03-2018 07:59 AM
Hi Jonathan,
The issue is finally resolved. Initially I tried HDP 2.6 on RHEL 7.2, which gave me the error, so I tried RHEL 7.4 instead.
I assume it was due to compatibility issues. Everything is fine on RHEL 7.4.
One quick question for you: is the compatibility matrix link available for partners on the community site? When I checked, it asked for a customer login or an employee login. Any suggestions?
Thx
Muthu
Created 07-27-2018 10:17 AM
You can easily downgrade the OpenSSL version by using the following steps.
cd /usr/local/src/
wget https://www.openssl.org/source/old/1.0.1/openssl-1.0.1k.tar.gz
tar -xvf /usr/local/src/openssl-1.0.1k.tar.gz
cd /usr/local/src/openssl-1.0.1k
./config --prefix=/usr/local/ --openssldir=/usr/local/openssl
make
make test
make install
mv /usr/bin/openssl /usr/bin/openssl-bak
cp -p /usr/local/openssl/bin/openssl /usr/bin/openssl
or
cp -p /usr/local/ssl/bin/openssl /usr/bin/openssl
ll -ld /usr/bin/openssl
openssl version
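The final swap in these steps can be guarded so that the system binary is replaced only when the newly built one actually runs and reports the expected version. This is a minimal sketch, not part of the original steps; swap_if_version_matches is a hypothetical helper name, and the paths in the usage line are the ones from the commands above.

```shell
# swap_if_version_matches CANDIDATE TARGET EXPECTED_VERSION
# Backs up TARGET to TARGET-bak and copies CANDIDATE over it, but only if
# "CANDIDATE version" prints a line containing EXPECTED_VERSION.
swap_if_version_matches() {
    candidate="$1"
    target="$2"
    expected="$3"
    if "$candidate" version 2>/dev/null | grep -qF -- "$expected"; then
        cp -p "$target" "$target-bak" && cp -p "$candidate" "$target"
    else
        echo "candidate does not report $expected; leaving $target untouched" >&2
        return 1
    fi
}

# Example usage (commented out; run as root):
#   swap_if_version_matches /usr/local/openssl/bin/openssl /usr/bin/openssl 1.0.1k
#   openssl version
```

Using cp for the backup instead of mv keeps /usr/bin/openssl in place if the second copy fails.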
Created 07-31-2018 11:45 AM
Hi Geoffrey,
Before I try this on my RHEL server, I tested the same steps on my CentOS VM.
The make test step failed with the error below. Am I missing anything here?
thx
Muthu
MS consistency test
/bin/perl cms-test.pl
CMS => PKCS#7 compatibility tests
signed content DER format, RSA key: verify error
make[1]: *** [test_cms] Error 1
make[1]: Leaving directory `/usr/local/src/openssl-1.0.1k/test'
make: *** [tests] Error 2
[root@spark openssl-1.0.1k]#