HDP-2.5.3: Error in failure to connect to Ambari server during install attempt

In a couple of attempts to create an HDP 2.5.3 cluster on Power Linux RHEL 7.2, we've seen an error when the Ambari agents connect to the Ambari server. We have succeeded in creating other clusters with the same versions, including the same prerequisites such as openssl and openjdk versions. Yet, while many installs work fine, some scenario prevents success and we have been unable to pin it down.

The error is very similar to another reported question, here:

https://community.hortonworks.com/questions/24208/help-ambari-agent-registe-fail-with-netutilpy77-er...

However, that question discusses the use of Oracle JDK on an x86 platform. Our case is the same error but on ppc64le. Oracle JDK isn't supported on this platform, but OpenJDK 1.8 works (at least on other clusters).

Here is some package version output of interest:

[root@st03d ambari-agent]# rpm -qa |grep ambari
ambari-agent-2.4.2.4-3.ppc64le
ambari-server-2.4.2.4-3.ppc64le
[root@st03d ambari-agent]# rpm -qa |grep openjdk |grep 1.8
java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.ppc64le
java-1.8.0-openjdk-1.8.0.65-3.b17.el7.ppc64le
java-1.8.0-openjdk-devel-1.8.0.65-3.b17.el7.ppc64le
[root@st03d ambari-agent]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk
[root@st03d ambari-agent]# rpm -qa |grep openssl
openssl-1.0.1e-51.el7_2.7.ppc64le
openssl-libs-1.0.1e-51.el7_2.7.ppc64le

The top of the log looks like this, through the point of the error:

[root@st03d ambari-agent]# pwd
/var/log/ambari-agent
[root@st03d ambari-agent]# head -15 ambari-agent.log
INFO 2017-01-09 20:59:48,961 main.py:90 - loglevel=logging.INFO
INFO 2017-01-09 20:59:48,961 main.py:90 - loglevel=logging.INFO
INFO 2017-01-09 20:59:48,961 main.py:90 - loglevel=logging.INFO
INFO 2017-01-09 20:59:48,963 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-01-09 20:59:48,965 DataCleaner.py:120 - Data cleanup started
INFO 2017-01-09 20:59:48,966 DataCleaner.py:122 - Data cleanup finished
INFO 2017-01-09 20:59:49,069 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2017-01-09 20:59:49,071 main.py:349 - Connecting to Ambari server at https://st03d.pbm.ihost.com:8440 (129.40.28.71)
INFO 2017-01-09 20:59:49,071 NetUtil.py:62 - Connecting to https://st03d.pbm.ihost.com:8440/ca
ERROR 2017-01-09 21:35:22,777 NetUtil.py:88 - EOF occurred in violation of protocol (_ssl.c:765)
ERROR 2017-01-09 21:35:22,777 NetUtil.py:89 - SSLError: Failed to connect. Please check openssl library versions. Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2017-01-09 21:35:22,777 NetUtil.py:116 - Server at https://st03d.pbm.ihost.com:8440 is not reachable, sleeping for 10 seconds...
INFO 2017-01-09 21:35:32,777 NetUtil.py:62 - Connecting to https://st03d.pbm.ihost.com:8440/ca
WARNING 2017-01-09 21:35:32,778 NetUtil.py:93 - Failed to connect to https://st03d.pbm.ihost.com:8440/ca due to [Errno 111] Connection refused
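One detail worth noting in the log above is the gap of roughly 35 minutes between the first "Connecting to" line (20:59:49) and the SSL error (21:35:22), which suggests the connection attempt hung rather than failing immediately. For anyone triaging a similar ambari-agent.log, here is a minimal Python sketch that flags such stalls. The function names, the regular expression, and the 60-second threshold are illustrative assumptions based only on the line format shown in this excerpt; they are not part of Ambari.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt in this post:
#   LEVEL YYYY-MM-DD HH:MM:SS,mmm Source.py:NN - message
LINE_RE = re.compile(
    r"^(INFO|WARNING|ERROR)\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\s+(.*)$"
)


def parse_line(line):
    """Return (level, timestamp, rest) or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    level, stamp, millis, rest = match.groups()
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    ts = ts.replace(microsecond=int(millis) * 1000)
    return level, ts, rest


def find_connect_stalls(lines, threshold_seconds=60.0):
    """Find NetUtil connection attempts that took a long time to resolve.

    Returns a list of (started, ended, seconds_waited, next_message) tuples,
    one for each "Connecting to https://" line whose following log line
    arrived at least threshold_seconds later.
    """
    stalls = []
    pending = None  # timestamp of the most recent unresolved attempt
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _level, ts, rest = parsed
        if "NetUtil.py" in rest and "Connecting to https://" in rest:
            pending = ts
        elif pending is not None:
            waited = (ts - pending).total_seconds()
            if waited >= threshold_seconds:
                stalls.append((pending, ts, waited, rest))
            pending = None
    return stalls


if __name__ == "__main__":
    # Path as shown in the post; adjust for your own host.
    with open("/var/log/ambari-agent/ambari-agent.log") as handle:
        for started, ended, waited, message in find_connect_stalls(handle):
            print(f"{started} -> {ended}: waited {waited:.0f}s before: {message}")
```

A long wait followed by an SSL EOF error points toward the server side not servicing the connection, rather than an immediate certificate or library mismatch, though that reading is only a heuristic.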

This happens when trying to set up 2-3 nodes, but it can also be reproduced on a single node. Other cluster installs have succeeded, both multi-node and single-node, so the issue is likely tied to a specific configuration difference that we can't pin down. At the moment we are stumped.

What might be wrong here? What else should we look for?

1 ACCEPTED SOLUTION

It appears we've discovered the issue. The server has a large number of cores, 160.

We have edited /etc/ambari-server/conf/ambari.properties with:

agent.threadpool.size.max=128
client.threadpool.size.max=128

(We simply raised both limits to 128, the nearest power of two to 160.) After restarting the ambari-server, the agents now appear to register successfully. We should now be able to proceed with the installation.
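For other hosts with many cores, the same choice can be sketched in Python: take the core count, round down to a power of two, and print the two property lines to put in /etc/ambari-server/conf/ambari.properties. This only mirrors what was done here for 160 cores. It is a rule of thumb rather than an official Ambari sizing formula, and the function names are made up for illustration.

```python
import os


def largest_power_of_two_at_most(n: int) -> int:
    """Return the largest power of two that does not exceed n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # bit_length() gives the position of the highest set bit.
    return 1 << (n.bit_length() - 1)


def threadpool_properties(cores: int) -> str:
    """Build the two ambari.properties lines used in this post.

    Assumption: the pool size is the core count rounded down to a power
    of two, as was done here (160 cores -> 128).
    """
    size = largest_power_of_two_at_most(cores)
    return (
        f"agent.threadpool.size.max={size}\n"
        f"client.threadpool.size.max={size}\n"
    )


if __name__ == "__main__":
    # os.cpu_count() can return None on some platforms; fall back to 1.
    cores = os.cpu_count() or 1
    print(threadpool_properties(cores), end="")
```

The ambari-server still needs a restart after the properties file is edited, as described above.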
