HDP Startup Issue

Hi all,

Muthu here. I installed HDP 2.6.3 yesterday. When I started the server today, Ambari could not get a heartbeat from the services, and I could not start any of them.

[infa@bdm ~]$ ll /usr/hdp/current/

lrwxrwxrwx. 1 root root 29 Jul 26 20:44 accumulo-client -> /usr/hdp/2.6.3.0-235/accumulo

lrwxrwxrwx. 1 root root 29 Jul 26 20:44 accumulo-gc -> /usr/hdp/2.6.3.0-235/accumulo

When I checked the ambari-agent log, it showed the error below:

INFO 2018-07-27 12:44:34,974 NetUtil.py:70 - Connecting to https://bdm.localdomain:8440/ca

ERROR 2018-07-27 12:44:34,977 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)

ERROR 2018-07-27 12:44:34,977 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions.
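An "EOF occurred in violation of protocol" error often points to a protocol version mismatch rather than a missing library. One way to narrow it down is to probe which TLS versions the server port accepts. The sketch below is only an illustration: `probe_tls` and the `OPENSSL_CMD` override are hypothetical names, and it assumes the local `openssl s_client` supports the `-tls1`, `-tls1_1` and `-tls1_2` flags and exits non-zero when the handshake fails.

```shell
# probe_tls HOST:PORT
# Hypothetical helper: tries one handshake per TLS version and reports the result.
# OPENSSL_CMD is an assumed override so a different openssl binary can be used.
probe_tls() {
    hostport="$1"
    for flag in tls1 tls1_1 tls1_2; do
        # stdin is closed so s_client exits right after the handshake attempt
        if ${OPENSSL_CMD:-openssl} s_client -connect "$hostport" "-$flag" \
            </dev/null >/dev/null 2>&1; then
            echo "$flag: handshake ok"
        else
            echo "$flag: handshake failed"
        fi
    done
}
```

For the host in this thread it could be called as `probe_tls bdm.localdomain:8440`. If only the `tls1_2` line reports success, the agent probably needs to be told to use TLS 1.2.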

As per the HDP 2.6 installation guide, OpenSSL should be v1.0.1 or later.

But the version I have on RHEL 7.2 is:

[infa@bdm ~]$ openssl version

OpenSSL 1.0.2k-fips  26 Jan 2017

[infa@bdm ~]$

As I understand it, the server has a later version than what is required. Any clue what I am missing here?
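The version comparison above can be checked mechanically. This is a minimal sketch, assuming GNU `sort -V` is available (it is on RHEL 7); `openssl_at_least` is a hypothetical helper name, and the optional second argument exists only so that a saved `openssl version` line can be checked without running the binary.

```shell
# openssl_at_least MIN [VERSION_LINE]
# Hypothetical helper: succeeds if the OpenSSL version is MIN or later.
openssl_at_least() {
    min="$1"
    line="${2:-$(openssl version)}"
    # Field 2 is the version, e.g. "1.0.2k-fips"; drop any "-suffix"
    have=$(printf '%s\n' "$line" | awk '{print $2}' | sed 's/-.*//')
    # With version sort, MIN comes first (or ties) when the requirement is met
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}
```

Running `openssl_at_least 1.0.1 && echo ok` on the host would confirm that the installed version satisfies the documented minimum, which suggests the error has another cause.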

Thanks in advance

Muthu


1 ACCEPTED SOLUTION

Hi @Muthukumar Somasundaram

Try adding force_https_protocol=PROTOCOL_TLSv1_2 to the [security] section of /etc/ambari-agent/conf/ambari-agent.ini, then restart the Ambari agent.

This KB covers the likely root cause of your error: https://community.hortonworks.com/articles/188269/javapython-updates-and-ambari-agent-tls-settings.h...
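The change can also be scripted. The sketch below assumes the setting lives under the `[security]` section of the agent's ini file; `set_agent_tls` is a hypothetical helper, not an Ambari command. It removes any existing `force_https_protocol` line and writes the new value right after the `[security]` header, creating the section if it is missing.

```shell
# set_agent_tls INI_FILE [PROTOCOL]
# Hypothetical helper: sets force_https_protocol under [security] in INI_FILE.
set_agent_tls() {
    ini="$1"
    proto="${2:-PROTOCOL_TLSv1_2}"
    tmp="$(mktemp)" || return 1
    awk -v proto="$proto" '
        /^force_https_protocol=/ { next }   # drop any previous value
        { print }
        /^\[security\]/ { print "force_https_protocol=" proto; done = 1 }
        END { if (!done) { print "[security]"; print "force_https_protocol=" proto } }
    ' "$ini" > "$tmp" && cat "$tmp" > "$ini"   # cat keeps the file owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}
```

A typical use would be `set_agent_tls /etc/ambari-agent/conf/ambari-agent.ini` as root, followed by `ambari-agent restart`. Back up the ini file first.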

6 REPLIES


Hi Jonathan

As suggested, I made the change in ambari-agent.ini:

force_https_protocol=PROTOCOL_TLSv1_2

The NameNode service is not coming up and is throwing the error below.

Any suggestions? Thanks in advance.

tail -f /var/lib/ambari-agent/data/output-266.txt

stdout:

2018-07-31 15:27:33,705 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-07-31 15:27:46,236 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-07-31 15:27:59,073 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
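The retries in this log are Ambari polling the NameNode until it reports "Safe mode is OFF". The same check can be run by hand to see how long the NameNode stays in safe mode. This is a rough sketch: `wait_for_safemode_off` and the `HDFS_CMD` override are hypothetical names, and it assumes `hdfs` is on the PATH of a user allowed to run `dfsadmin`.

```shell
# wait_for_safemode_off [TRIES] [DELAY_SECONDS]
# Hypothetical helper: polls "hdfs dfsadmin -safemode get" until safe mode is off.
# HDFS_CMD is an assumed override for the hdfs launcher.
wait_for_safemode_off() {
    tries="${1:-30}"
    delay="${2:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if ${HDFS_CMD:-hdfs} dfsadmin -safemode get 2>/dev/null \
            | grep -q 'Safe mode is OFF'; then
            return 0            # NameNode has left safe mode
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1                    # still in safe mode after all tries
}
```

For example, `wait_for_safemode_off 180 10` would wait up to about 30 minutes. If it times out, the NameNode log usually says which block threshold has not been reached yet.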

The latest update from my end:

What I observed in my cluster is that the start fails partway through. The core issues are:

1. The NameNode comes up after 25 minutes, in safe mode.

2. Because the NameNode comes up in safe mode, the YARN Timeline Server cannot come up, and the startup process is aborted.

3. Once I manually issue "hdfs dfsadmin -safemode leave", I am able to start the other services manually, including the YARN Timeline Server, but not the HBase Master.

Any suggestions?

Thx

Muthu

Hi Jonathan

The issue is finally resolved. Initially I tried HDP 2.6 on RHEL 7.2, and it gave me the error, so I tried RHEL 7.4 instead.

I assume it was due to compatibility issues. Everything is fine on RHEL 7.4.

One quick question for you: is the compatibility matrix link available to partners on the community site? When I checked, it asked for a customer or employee login. Any suggestions?

Thx

Muthu

@Muthukumar Somasundaram

You can easily downgrade the OpenSSL version by using the following steps.

cd /usr/local/src/
wget https://www.openssl.org/source/old/1.0.1/openssl-1.0.1k.tar.gz
tar -xvf /usr/local/src/openssl-1.0.1k.tar.gz
cd /usr/local/src/openssl-1.0.1k
./config --prefix=/usr/local/ --openssldir=/usr/local/openssl
make
make test
make install
mv /usr/bin/openssl /usr/bin/openssl-bak
cp -p /usr/local/openssl/bin/openssl /usr/bin/openssl
# or, depending on where the build placed the binary:
cp -p /usr/local/ssl/bin/openssl /usr/bin/openssl
ll -ld /usr/bin/openssl
openssl version


Hi Geoffrey

Before I try this on my RHEL machine, I tested the same steps on my CentOS VM.

The make test step failed with the error below. Am I missing anything here?

thx

Muthu

CMS consistency test
/bin/perl cms-test.pl
CMS => PKCS#7 compatibility tests
signed content DER format, RSA key: verify error
make[1]: *** [test_cms] Error 1
make[1]: Leaving directory `/usr/local/src/openssl-1.0.1k/test'
make: *** [tests] Error 2
[root@spark openssl-1.0.1k]#