Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Explorer

Hello,
I installed Ambari Server and Ambari Agent, but I have been trying for ten days to get the agent to connect. I receive this error in ambari-agent.log:

INFO 2019-11-06 09:46:39,682 NetUtil.py:62 - Connecting to https://localhost:8440/ca
WARNING 2019-11-06 09:46:39,683 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused
WARNING 2019-11-06 09:46:39,683 NetUtil.py:116 - Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...
INFO 2019-11-06 09:46:49,683 NetUtil.py:62 - Connecting to https://localhost:8440/ca
WARNING 2019-11-06 09:46:49,684 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused
WARNING 2019-11-06 09:46:49,685 NetUtil.py:116 - Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

And in the Ambari Server log I have:

05 Nov 2019 11:28:49,329 INFO [ambari-hearbeat-monitor] TopologyManager:487 - Hearbeat for host localhost.localdomain lost thus removing it from available hosts.
05 Nov 2019 11:39:58,408 WARN [qtp-ambari-agent-106] SecurityFilter:103 - Request https://localhost:8440/ca doesn't match any pattern.
05 Nov 2019 11:39:58,409 WARN [qtp-ambari-agent-106] SecurityFilter:62 - This request is not allowed on this port: https://localhost:8440/ca
05 Nov 2019 11:40:00,783 INFO [qtp-ambari-agent-106] HeartBeatHandler:425 - agentOsType = centos7
05 Nov 2019 11:40:00,787 INFO [qtp-ambari-agent-106] HostImpl:294 - Received host registration, host=[hostname=sqlcls,fqdn=sqlcls.yacine.lab,domain=yacine.lab,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=7.7.1908,osfamily=redhat,memory=16266600,uptime_hours=79,mounts=(available=8121532,mountpoint=/dev,used=0,percent=0%,size=8121532,device=devtmpfs,type=devtmpfs)(available=8133300,mountpoint=/dev/shm,used=0,percent=0%,size=8133300,device=tmpfs,type=tmpfs)(available=8124108,mountpoint=/run,used=9192,percent=1%,size=8133300,device=tmpfs,type=tmpfs)(available=47411672,mountpoint=/,used=4991528,percent=10%,size=52403200,device=/dev/mapper/centos-root,type=xfs)(available=786336,mountpoint=/boot,used=252000,percent=25%,size=1038336,device=/dev/sda1,type=xfs)(available=147866840,mountpoint=/home,used=33004,percent=1%,size=147899844,device=/dev/mapper/centos-home,type=xfs)(available=1626660,mountpoint=/run/user/0,used=0,percent=0%,size=1626660,device=tmpfs,type=tmpfs)]
, registrationTime=1572950400783, agentVersion=2.4.0.1
05 Nov 2019 11:40:00,799 INFO [qtp-ambari-agent-106] TopologyManager:408 - TopologyManager.onHostRegistered: Entering
05 Nov 2019 11:40:00,800 INFO [qtp-ambari-agent-106] TopologyManager:469 - TopologyManager: Queueing available host sqlcls.yacine.lab
05 Nov 2019 13:17:32,311 WARN [Thread-2] QueuedThreadPool:145 - 1 threads could not be stopped
05 Nov 2019 13:17:32,311 WARN [Thread-2] QueuedThreadPool:145 - 1 threads could not be stopped
05 Nov 2019 13:17:32,311 INFO [main] AmbariServer:631 - Joined the Server

Any ideas, please? I have tried many solutions, but the problem persists.

 

Thank you

 

Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Super Mentor

@Kou_Bou 

 

The agent log shows the following message:

WARNING 2019-11-06 09:46:39,683 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused

This indicates that the Ambari Agent is trying to connect to an Ambari Server on the local machine ("localhost"). It should use the FQDN of the Ambari Server instead of "localhost:8440".

So please try the following:

1. Edit "/etc/ambari-agent/conf/ambari-agent.ini" so that hostname points to the FQDN (fully qualified domain name) of the Ambari Server instead of localhost.

Replace "ambariserverhost.example.com" with the actual FQDN of your Ambari Server.

Example:

 

# grep -A1 '\[server\]' /etc/ambari-agent/conf/ambari-agent.ini
[server]
hostname=ambariserverhost.example.com

 

2. Then restart Ambari Agent.

 

# ambari-agent restart

# cat /etc/hosts

 


3. Now verify that you can connect to the Ambari Server FQDN and port from the agent machine, as follows:

 

# telnet ambariserverhost.example.com 8440
(OR)
# nc -v ambariserverhost.example.com 8440

 


4. Also make sure that iptables/firewall is disabled on the Ambari Server host and that port 8440 is listening.

Example:

 

# netstat -tnlpa | grep `cat /var/run/ambari-server/ambari-server.pid`
tcp6 0 0 :::8080 :::* LISTEN 7463/java 
tcp6 0 0 :::8440 :::* LISTEN 7463/java
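
Steps 3 and 4 can also be scripted. The following is a minimal Python sketch, not part of Ambari: the function name check_port and the wording of the returned reasons are assumptions made for illustration. It attempts a plain TCP connect, roughly what "nc -v" does, and separates "connection refused" (Errno 111 on Linux, usually meaning nothing is listening on that address and port) from a timeout or a name-resolution failure.

```python
import errno
import socket


def check_port(host, port, timeout=5.0):
    """Try a plain TCP connect, roughly what `nc -v host port` does.

    Returns (True, "connected") on success, otherwise (False, reason).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer at all; a firewall silently dropping packets is one possible cause.
        return False, "timed out"
    except socket.gaierror as exc:
        # The name could not be resolved (check DNS and /etc/hosts).
        return False, "name resolution failed: %s" % exc
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # This is the [Errno 111] case from the agent log.
            return False, "connection refused"
        return False, "error: %s" % exc
```

Run it from the agent machine, for example check_port("ambariserverhost.example.com", 8440), replacing the placeholder with your Ambari Server FQDN. A successful connect only shows that the port is reachable; it says nothing about the SSL handshake.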

 

 


 

Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Explorer

Thank you for your answer, @jsensharma.

Here are the results of the steps you mentioned:

1-

[root@localhost /]# grep -A1 '\[server\]' /etc/ambari-agent/conf/ambari-agent.ini

[server]

hostname=localhost

2- 

[root@localhost /]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.253.46 pocserver2 pocserver2

192.168.253.47 pocserver3 pocserver3

192.168.253.45 localhost localhost

3- 

[root@localhost /]# nc -v localhost 8440

Ncat: Version 7.50 ( https://nmap.org/ncat )

Ncat: Connected to ::1:8440.

It hangs here with no further output.

4- 

[root@localhost /]#  netstat -tnlpa | grep `cat /var/run/ambari-server/ambari-server.pid`

tcp6       0      0 :::8440                 :::*                    LISTEN      32030/java

tcp6       0      0 :::8441                 :::*                    LISTEN      32030/java

tcp6       0      0 :::8443                 :::*                    LISTEN      32030/java

Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Super Mentor

@Kou_Bou 

 

Your "/etc/hosts" file should map the Ambari Server's IP address to its FQDN (do not use localhost for it).

 

Also change "ambari-agent.ini" to point to the Ambari Server FQDN instead of "localhost".
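
To illustrate the /etc/hosts advice, here is a small Python sketch that flags entries where a non-loopback address is given a "localhost" name, as in the hosts file posted earlier in this thread. The function name localhost_conflicts is hypothetical, and the rule it applies (any name starting with "localhost" on a non-loopback address is suspicious) is an assumption, not an Ambari check.

```python
import ipaddress


def localhost_conflicts(hosts_text):
    """Return (ip, names) for non-loopback entries that claim a 'localhost' name."""
    conflicts = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if not addr.is_loopback and any(n.startswith("localhost") for n in names):
            conflicts.append((ip, names))
    return conflicts
```

Calling it on the text of /etc/hosts from this thread would report the "192.168.253.45 localhost localhost" line, which is the entry to replace with the server's FQDN.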

 

https://docs.cloudera.com/HDPDocuments/Ambari-2.7.4.0/administering-ambari/content/amb_install_the_a...

 

 

On each Agent node, stop the Agent.
# ambari-agent stop

Using a text editor, edit /etc/ambari-agent/conf/ambari-agent.ini to point to the new host.

[server]
hostname=$NEW FULLY.QUALIFIED.DOMAIN.NAME
url_port=8440
secured_url_port=8441
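
A quick way to confirm the edit took effect is to read the value back. This is a minimal sketch using Python's standard configparser; the helper names server_hostname and points_at_localhost, and the set of names treated as "localhost", are assumptions made for illustration.

```python
import configparser


def server_hostname(ini_text):
    """Return the [server] hostname value from ambari-agent.ini text."""
    # interpolation=None keeps '%' characters literal;
    # strict=False tolerates duplicate sections or keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.get("server", "hostname")


def points_at_localhost(hostname):
    """True if the configured server name is a loopback-style name."""
    loopback_names = {"localhost", "localhost.localdomain", "127.0.0.1", "::1"}
    return hostname.strip().lower() in loopback_names
```

For example, pass the contents of /etc/ambari-agent/conf/ambari-agent.ini to server_hostname and check the result with points_at_localhost. If it returns True, the agent is still pointed at the local machine.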

 

 

 

 


Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Explorer

@jsensharma I get the same output for the steps above after changing the hostname to ambari.server, but the contents of ambari-server.log and ambari-agent.log have changed. See below:

 ambari-agent.log: 

Directory: '/etc/resource_overrides' does not exist - it won't be used for gathering system resources.
INFO 2019-11-06 12:02:52,787 Controller.py:160 - Registering with ambari.server (192.168.253.45) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "server", "physicalprocessorcount": 4, "kernelrelease": "3.10.0-1062.1.2.el7.x86_64", "uptime_days": "0", "memorytotal": 16266600, "swapfree": "7.87 GB", "memorysize": 16266600, "osfamily": "redhat", "swapsize": "7.87 GB", "processorcount": 4, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 14903648, "operatingsystem": "centos", "kernelmajversion": "3.10", "kernelversion": "3.10.0", "macaddress": "00:50:56:90:FB:2D", "operatingsystemrelease": "7.7.1908", "ipaddress": "192.168.253.45", "hostname": "ambari", "uptime_hours": "22", "fqdn": "ambari.server", "id": "root", "architecture": "x86_64", "selinux": false, "mounts": [{"available": "8121532", "used": "0", "percent": "0%", "device": "devtmpfs", "mountpoint": "/dev", "type": "devtmpfs", "size": "8121532"}, {"available": "8133300", "used": "0", "percent": "0%", "device": "tmpfs", "mountpoint": "/dev/shm", "type": "tmpfs", "size": "8133300"}, {"available": "8124172", "used": "9128", "percent": "1%", "device": "tmpfs", "mountpoint": "/run", "type": "tmpfs", "size": "8133300"}, {"available": "47410156", "used": "4993044", "percent": "10%", "device": "/dev/mapper/centos-root", "mountpoint": "/", "type": "xfs", "size": "52403200"}, {"available": "786336", "used": "252000", "percent": "25%", "device": "/dev/sda1", "mountpoint": "/boot", "type": "xfs", "size": "1038336"}, {"available": "147866840", "used": "33004", "percent": "1%", "device": "/dev/mapper/centos-home", "mountpoint": "/home", "type": "xfs", "size": "147899844"}, {"available": "1626660", "used": "0", "percent": "0%", "device": "tmpfs", "mountpoint": "/run/user/0", "type": "tmpfs", "size": "1626660"}], "hardwaremodel": "x86_64", "uptime_seconds": "81903", "interfaces": "ens192,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": 
"2.4.0.1", "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1573038172784, "activeJavaProcs": [], "liveServices": [{"status": "Healthy", "name": "ntpd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": false}, "timestamp": 1573038172724, "hostname": "ambari.server", "responseId": -1, "publicHostname": "ambari.server"}')
INFO 2019-11-06 12:02:52,788 NetUtil.py:62 - Connecting to https://ambari.server:8440/connection_info
INFO 2019-11-06 12:02:52,876 security.py:100 - SSL Connect being called.. connecting to the server
INFO 2019-11-06 12:02:52,961 security.py:61 - SSL connection established. Two-way SSL authentication is turned off on the server.
INFO 2019-11-06 12:02:52,980 Controller.py:186 - Registration Successful (response id = 0)
INFO 2019-11-06 12:02:52,980 AmbariConfig.py:273 - Updating config property (agent.check.remote.mounts) with value (false)
INFO 2019-11-06 12:02:52,981 AmbariConfig.py:273 - Updating config property (agent.auto.cache.update) with value (true)
INFO 2019-11-06 12:02:52,981 AmbariConfig.py:273 - Updating config property (agent.check.mounts.timeout) with value (0)
WARNING 2019-11-06 12:02:52,981 AlertSchedulerHandler.py:104 - There are no alert definition commands in the heartbeat; unable to update definitions
INFO 2019-11-06 12:02:52,981 Controller.py:463 - Registration response from ambari.server was OK
INFO 2019-11-06 12:02:52,981 Controller.py:468 - Resetting ActionQueue...
INFO 2019-11-06 12:03:02,992 Controller.py:277 - Heartbeat with server is running...
INFO 2019-11-06 12:03:02,993 Heartbeat.py:90 - Adding host info/state to heartbeat message.
INFO 2019-11-06 12:03:03,059 logger.py:71 - call[['test', '-w', '/dev']] {'sudo': True, 'timeout': 5}
INFO 2019-11-06 12:03:03,069 logger.py:71 - call returned (0, '')
INFO 2019-11-06 12:03:03,069 logger.py:71 - call[['test', '-w', '/dev/shm']] {'sudo': True, 'timeout': 5}

 

ambari-server.log:

TopologyManager:462 - Host ambari.server re-registered, will not be added to the available hosts list
06 Nov 2019 12:03:25,895 INFO [Thread-22] AbstractPoolBackedDataSource:212 - Initializing c3p0 pool... com.mchange.v2.c3p0.ComboPooledDataSource [ acquireIncrement -> 3, acquireRetryAttempts -> 30, acquireRetryDelay -> 1000, autoCommitOnClose -> false, automaticTestTable -> null, breakAfterAcquireFailure -> false, checkoutTimeout -> 0, connectionCustomizerClassName -> null, connectionTesterClassName -> com.mchange.v2.c3p0.impl.DefaultConnectionTester, contextClassLoaderSource -> caller, dataSourceName -> 1hgfewda6hv1c2v1pcxadq|64b242b3, debugUnreturnedConnectionStackTraces -> false, description -> null, driverClass -> org.postgresql.Driver, extensions -> {}, factoryClassLocation -> null, forceIgnoreUnresolvedTransactions -> false, forceSynchronousCheckins -> false, forceUseNamedDriverClass -> false, identityToken -> 1hgfewda6hv1c2v1pcxadq|64b242b3, idleConnectionTestPeriod -> 50, initialPoolSize -> 3, jdbcUrl -> jdbc:postgresql://localhost/datalake, maxAdministrativeTaskTime -> 0, maxConnectionAge -> 0, maxIdleTime -> 0, maxIdleTimeExcessConnections -> 0, maxPoolSize -> 5, maxStatements -> 0, maxStatementsPerConnection -> 120, minPoolSize -> 1, numHelperThreads -> 3, preferredTestQuery -> select 0, privilegeSpawnedThreads -> false, properties -> {user=******, password=******}, propertyCycle -> 0, statementCacheNumDeferredCloseThreads -> 0, testConnectionOnCheckin -> true, testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, userOverrides -> {}, usesTraditionalReflectiveProxies -> false ]
06 Nov 2019 12:03:25,966 INFO [Thread-22] JobStoreTX:861 - Freed 0 triggers from 'acquired' / 'blocked' state.
06 Nov 2019 12:03:25,977 INFO [Thread-22] JobStoreTX:871 - Recovering 0 jobs that were in-progress at the time of the last shut-down.
06 Nov 2019 12:03:25,977 INFO [Thread-22] JobStoreTX:884 - Recovery complete.
06 Nov 2019 12:03:25,978 INFO [Thread-22] JobStoreTX:891 - Removed 0 'complete' triggers.
06 Nov 2019 12:03:25,978 INFO [Thread-22] JobStoreTX:896 - Removed 0 stale fired job entries.
06 Nov 2019 12:03:25,979 INFO [Thread-22] QuartzScheduler:575 - Scheduler ExecutionScheduler_$_NON_CLUSTERED started.
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-66] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-66] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,904 WARN [qtp-ambari-agent-67] SecurityFilter:103 - Request https://192.168.253.45:8440/ doesn't match any patter

Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Super Mentor

@Kou_Bou 

The Ambari Server log shows a "certificate_unknown" error:

06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,904 WARN [qtp-ambari-agent-67] SecurityFilter:103 - Request https://192.168.253.45:8440/ doesn't match any patter

You should therefore delete the old certificates from the agent machine and try again, as the old agent certificates may carry incorrect names.

# rm /var/lib/ambari-agent/keys/*
# ambari-agent restart

On restart, the agent should fetch the correct certificates from the Ambari Server.
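
Before running the rm command, it can help to see what is actually in the keys directory. The sketch below is a hypothetical helper (the name stale_agent_keys and the delete flag are assumptions); it lists regular files in the directory named in the reply and removes them only when asked, which mirrors "rm /var/lib/ambari-agent/keys/*" without touching subdirectories.

```python
import os


def stale_agent_keys(keys_dir="/var/lib/ambari-agent/keys", delete=False):
    """List regular files in keys_dir; remove them only when delete=True."""
    if not os.path.isdir(keys_dir):
        return []
    found = []
    for name in sorted(os.listdir(keys_dir)):
        path = os.path.join(keys_dir, name)
        if os.path.isfile(path):  # skip subdirectories, as plain `rm` would
            found.append(path)
            if delete:
                os.remove(path)
    return found
```

An empty list means there are no cached key or certificate files to remove. After a deletion, restart the agent with "ambari-agent restart" as described above.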



Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Explorer

The /var/lib/ambari-agent/keys directory is empty; there are no files in it. What is the problem?

 

Re: Subject: Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused, Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...

Contributor

There is no issue with ambari agent.
