Member since: 02-08-2016
Posts: 12
Kudos Received: 10
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 2147 | 02-09-2016 09:57 AM |
09-23-2021 07:26 AM
Make sure that the Ambari Server trusts the certificate that the LDAP server is using. One quick way to get that certificate is to use openssl to retrieve it directly from the LDAP server and then explicitly add it to a new keystore:

$ openssl s_client -showcerts -connect ldapserver.domain.com:636

You'll see the certificate printed on STDOUT; look for BEGIN CERTIFICATE. You will need to grab the entire certificate, including the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines, and save it to a file. In this case we'll call it ldap.cert. Once this is done, follow steps 1.2.1-1.2.3 in the doc to create a new JKS keystore and import that certificate so that it's trusted by Ambari: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Ambari_Security_Guide/content/_configure_...

Now that you have a JKS keystore with that certificate in it, you can tell Ambari to use it when connecting to your LDAP server over SSL by re-running ambari-server setup-ldap. Just make sure you answer correctly for:

Use SSL=true
TrustStore type=jks
Path to TrustStore file=/etc/ambari-server/keys/ldaps-keystore.jks
Password for TrustStore={{ what you typed in step 1.2.3 }}
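For anyone who prefers a compact recipe, here is a rough sketch of what those steps amount to on the Ambari host (the hostname, alias, and keystore password are placeholders; the linked doc is the authoritative procedure):

# Retrieve the LDAP server's certificate and save it as PEM (hostname/port are examples)
$ openssl s_client -showcerts -connect ldapserver.domain.com:636 </dev/null | openssl x509 -outform PEM > ldap.cert

# Import it into a new JKS truststore for Ambari to use
$ mkdir -p /etc/ambari-server/keys
$ keytool -import -trustcacerts -noprompt -alias ldaps -file ldap.cert -keystore /etc/ambari-server/keys/ldaps-keystore.jks -storepass MyTrustStorePassword

# Re-run LDAP setup and point Ambari at the truststore created above
$ ambari-server setup-ldap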
10-13-2017 10:30 PM
@Dr. Jason Breitweg, it will not be deleted automatically, and there may be block files under that directory that you still need. If the cluster has any important data, I'd recommend running 'hdfs fsck' to ensure there are no missing/corrupt blocks before you delete /var/hadoop/hdfs/data/current/BP-*. Even then, I'd first move the directory to a different location, restart the DataNodes, and re-run fsck to make sure you don't cause data loss.
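As an illustration only (the paths are examples and the block pool ID is left as a placeholder), the cautious order of operations would be something like:

# 1. Verify block health before touching anything
$ hdfs fsck / -files -blocks | tail -n 20

# 2. Move (don't delete) the old block pool directory aside
$ mv /var/hadoop/hdfs/data/current/BP-<id> /var/hadoop/hdfs/data/BP-<id>.bak

# 3. Restart the DataNodes, re-run fsck, and only remove the backup once it reports HEALTHY
$ hdfs fsck /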
06-23-2017 09:40 AM
Yikes... it appears that I had an error in the JAAS config that I posted. It was a typo on my part: I accidentally had useKeyTab=false where the value was supposed to be useKeyTab=true. I'm glad you found the issue and fixed it.
My apologies.
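For anyone hitting the same problem later, the corrected entry would look roughly like this (the section name, keytab path, and principal are placeholders, since they depend on which service the JAAS file is for):

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/service.keytab"
  storeKey=true
  useTicketCache=false
  principal="service/host.example.com@EXAMPLE.COM";
};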
07-02-2019 10:07 AM
1 Kudo
It works in HDP-3.1.0.0 with Python 3.7. Thanks, you've saved my day!
02-09-2016 12:05 PM
5 Kudos
The cluster has 1 management node (Bright Cluster Manager and Ambari server), 2 NameNodes (1 active, 1 passive), and 17 DataNodes, and is running Hortonworks HDP 2.3.2 and Ambari 2.1.2. Each node has two 10GbE NICs which are bonded together, and jumbo frames (MTU=9000) are enabled on the interfaces.

There are sporadic NodeManager Web UI alerts in Ambari: for all 17 DataNodes we get connection timeouts throughout the day. These timeouts are not correlated with any sort of load on the system; they happen no matter what. When the connection to port 8042 is successful the connect time is around 5-7 ms, but when the connection fails I get response times of 5 seconds. Never 3 seconds or 6 seconds, always 5 seconds. For example:

[root@XXXX ~]# python2.7 YARN_response.py
Testing response time at http://XXXX:8042
Output is written if http response is > 1 second. Press Ctrl-C to exit!
2016-02-08 07:19:17.877947 Host: XX23:8042 conntime - 5.0073 seconds, HTTP response - 200
2016-02-08 07:19:22.889430 Host: XX25:8042 conntime - 5.0078 seconds, HTTP response - 200
2016-02-08 07:19:48.466520 Host: XX15:8042 conntime - 5.0071 seconds, HTTP response - 200
2016-02-08 07:20:24.423817 Host: XX15:8042 conntime - 5.0073 seconds, HTTP response - 200
2016-02-08 07:20:29.449196 Host: XX23:8042 conntime - 5.0073 seconds, HTTP response - 200
2016-02-08 07:21:00.190991 Host: XX19:8042 conntime - 5.0077 seconds, HTTP response - 200
2016-02-08 07:21:05.210073 Host: XX24:8042 conntime - 5.0073 seconds, HTTP response - 200
2016-02-08 07:21:28.738996 Host: XX17:8042 conntime - 5.0078 seconds, HTTP response - 200
2016-02-08 07:21:33.747728 Host: XX18:8042 conntime - 5.0086 seconds, HTTP response - 200
2016-02-08 07:21:38.764546 Host: XX22:8042 conntime - 5.0075 seconds, HTTP response - 200

If I let the script run long enough then every DataNode will eventually turn up. It turns out that this is a DNS issue, and the solution is to put "options single-request" in /etc/resolv.conf on all nodes. This option is described in the resolv.conf man page as follows:

single-request (since glibc 2.10)
Sets RES_SNGLKUP in _res.options. By default, glibc performs IPv4 and IPv6 lookups in parallel since version 2.9. Some appliance DNS servers cannot handle these queries properly and make the requests time out. This option disables the behavior and makes glibc perform the IPv6 and IPv4 requests sequentially (at the cost of some slowdown of the resolving process).

Cluster performance is now as expected.
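For reference, the fix itself is just one extra line in /etc/resolv.conf on every node. A minimal way to apply it is sketched below, assuming resolv.conf is not being regenerated by DHCP or a configuration-management tool (in which case the change belongs in that tool instead):

# Append the option on each node; existing nameserver/search lines stay unchanged
$ echo "options single-request" >> /etc/resolv.conf

# Confirm it took effect
$ grep options /etc/resolv.conf
options single-request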
02-09-2016 11:26 AM
@Jason Breitweg I have accepted your answers. Could you publish an article based on this? This is really helpful.