SmartSense HST Agents Fail To Come Up

Explorer

Hello All,

I have a 10-node HDP 2.6.1.0 cluster with Hortonworks SmartSense installed: HST Server on the master and HST Agents on all nodes (master and slave1 through slave9). It is a fresh HDP install on RHEL 7. The SmartSense installation completed without errors, but afterwards HST Agents were running on only 3 slaves (slaves 1, 2, and 3). Even after I restart all HST Agents, they are live only on those same 3 slaves. Can anyone please help me troubleshoot this issue?

Logs captured from /var/log/hst/hst-server.log during the entire restart operation:

  INFO [main] CertificateManager:70 - Initialization of root certificate
  INFO [main] CertificateManager:72 - Certificate exists:true
  INFO [main] Configuration:562 - Reading password from existing file
  WARN [main] ConfigChangeListener:155 - Creating a patch
  INFO [main] ConfigChangeListener:236 - Patch created : /var/lib/smartsense/hst-server/updates/upload/config-update.tgz
  INFO [main] SupportToolServer:572 - Bundle Purge Scheduler enabled at :Thu Oct 19 13:30:21 EDT 2017. Bundle Purge job will run every 24 hrs.
  INFO [main] Server:266 - jetty-7.6.7.v20120910
  INFO [main] ContextHandler:744 - started o.e.j.s.ServletContextHandler{/,file:/usr/hdp/share/hst/hst-server/web/}
  INFO [main] AbstractConnector:338 - Started SelectChannelConnector@0.0.0.0:9000
  INFO [main] Server:266 - jetty-7.6.7.v20120910
  INFO [main] ContextHandler:744 - started o.e.j.s.ServletContextHandler{/,null}
  INFO [main] SslContextFactory:300 - Enabled Protocols [SSLv2Hello, TLSv1, TLSv1.1, TLSv1.2] of [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2]
  INFO [main] AbstractConnector:338 - Started SslSelectChannelConnector@0.0.0.0:9440
  INFO [main] SslContextFactory:300 - Enabled Protocols [SSLv2Hello, TLSv1, TLSv1.1, TLSv1.2] of [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2]
  INFO [main] AbstractConnector:338 - Started SslSelectChannelConnector@0.0.0.0:9441
  WARN [qtp943219925-85] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-86] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [qtp943219925-85] SupportToolResource:142 - Unregistering agent slave3
  INFO [qtp943219925-86] SupportToolResource:142 - Unregistering agent slave2
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [qtp943219925-85] CertificateManager:189 - Signing of agent certificate
  INFO [qtp943219925-85] CertificateManager:190 - Verifying passphrase
  INFO [qtp943219925-85] Configuration:562 - Reading password from existing file
  INFO [qtp943219925-85] CertificateManager:214 - Revoking of slave3 certificate.
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [qtp943219925-85] CertificateManager:265 - Command openssl ca -config /var/lib/smartsense/hst-server/keys/ca.config -keyfile /var/lib/smartsense/hst-server/keys/ca.key -revoke /var/lib/smartsense/hst-server/keys/slave3.crt -batch -passin pass:**** -cert /var/lib/smartsense/hst-server/keys/ca.crt was finished with exit code: 0 - the operation was completely successfully.
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-88] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [qtp943219925-85] CertificateManager:265 - Command openssl ca -config /var/lib/smartsense/hst-server/keys/ca.config -in /var/lib/smartsense/hst-server/keys/slave3.csr -out /var/lib/smartsense/hst-server/keys/slave3.crt -batch -md sha256 -passin pass:**** -keyfile /var/lib/smartsense/hst-server/keys/ca.key -cert /var/lib/smartsense/hst-server/keys/ca.crt was finished with exit code: 0 - the operation was completely successfully.
  INFO [qtp943219925-89] SupportToolResource:142 - Unregistering agent slave1
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-90] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-85] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [qtp943219925-90] SupportToolResource:115 - Registering agent, id=slave3, version=1.4.0.2.5.0.3-7
  WARN [qtp943219925-90] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-88] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-90] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-86] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  INFO [Thread-3] FileWatcher:232 - Watcher configuration has been changed. re-initializing watcher.
  INFO [Thread-3] ConfigChangeListener:131 - listner configuration has been changed. re-initializing listner.
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-87] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
  WARN [qtp943219925-88] nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca
1 ACCEPTED SOLUTION

Master Mentor

@Bharath A

Please check the output of the following command:

  rpm -qa python-devel


If the installed version of this package is 2.7.5-58, you should downgrade it to 2.7.5-48.
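The version check above can be scripted as a rough sketch. The helper name is_affected_build is made up for illustration, and the assumption that the release string may carry a dist suffix such as .el7 is mine, not something stated in this thread:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given VERSION-RELEASE string is the
# 2.7.5-58 build mentioned above, with or without a dist suffix (assumed: .el7).
is_affected_build() {
  case "$1" in
    2.7.5-58|2.7.5-58.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query rpm when it is available on this host.
if command -v rpm >/dev/null 2>&1; then
  ver="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' python-devel 2>/dev/null)" || ver=""
  if is_affected_build "$ver"; then
    echo "python-devel $ver is installed: consider the downgrade described above"
  fi
fi
```

If the package is not installed, nothing is printed.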

As an alternative, please refer to: https://access.redhat.com/articles/2039753#controlling-certificate-verification-7

Or try to do the following as suggested in https://community.hortonworks.com/questions/120861/ambari-agent-ssl-certificate-verify-failed-certif...

sed -i 's/verify=platform_default/verify=disable/' /etc/python/cert-verification.cfg
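Before editing the real file, it may help to preview what the substitution would change. The following is only a sketch: the function name preview_verify_change is hypothetical, and it assumes the config file contains a verify=platform_default line, as the sed expression implies.

```shell
#!/bin/sh
# Hypothetical helper: apply the same substitution to a temporary copy and
# show the difference, leaving the original file untouched.
preview_verify_change() {
  cfg="$1"
  tmp="$(mktemp)" || return 1
  # Write the substituted content to the temp file instead of editing in place.
  sed 's/verify=platform_default/verify=disable/' "$cfg" > "$tmp" || { rm -f "$tmp"; return 1; }
  # diff exits with 1 when the files differ, 0 when nothing would change.
  diff "$cfg" "$tmp"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Usage: preview_verify_change /etc/python/cert-verification.cfg. An empty diff (exit status 0) would mean the pattern is not present and the sed command above would change nothing. Note that disabling verification turns off HTTPS certificate checks for Python on that host, so it is worth keeping a copy of the original file.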

3 REPLIES

Explorer

Could you please give me some direction or input, @Artem Ervits @Jay SenSharma @Shu @Aditya Sirna @Abdelkrim Hadjidj @Dinesh Chitlangia? I would really appreciate your help.

Master Mentor

@Bharath A

If the above does not solve the issue, please try the following:

1. Check hostname resolution on the problematic hosts:

# cat /etc/hosts
# hostname -

2. Delete or rename the "/usr/hdp/share/hst/hst-agent/keys" directory so that the HST agent can be re-registered:

# cd /usr/hdp/share/hst/hst-agent/
# mv keys keys_OLD


3. Re-register the agent from the Ambari UI:

Ambari --> Hosts (Tab) --> Click on the problematic hostname --> SmartSense HST Agent -->  Register


4. Verify that the keys directory has been regenerated:

# ls /usr/hdp/share/hst/hst-agent/


5. Now start the SmartSense HST Agent from Ambari.
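If step 2 has to be repeated on a host, a second "mv keys keys_OLD" would move the new keys directory inside the existing keys_OLD instead of renaming it. A minimal sketch of a timestamped rename that avoids this is shown below. The function name backup_agent_keys and the keys_OLD_<timestamp> naming are assumptions for illustration, not something SmartSense provides.

```shell
#!/bin/sh
# Hypothetical helper: rename <agent_dir>/keys to a timestamped backup
# directory and print the backup path. Defaults to the agent path used above.
backup_agent_keys() {
  agent_dir="${1:-/usr/hdp/share/hst/hst-agent}"
  if [ ! -d "$agent_dir/keys" ]; then
    echo "no keys directory in $agent_dir" >&2
    return 1
  fi
  backup="$agent_dir/keys_OLD_$(date +%Y%m%d%H%M%S)"
  mv "$agent_dir/keys" "$backup" && echo "$backup"
}
```

Usage (as root on the problematic host): backup_agent_keys, then continue with the Register action in step 3.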
