Hello,
I installed Ambari server and Ambari agent, but for ten days I have been unable to get the agent to connect. I receive this error in ambari-agent.log:
INFO 2019-11-06 09:46:39,682 NetUtil.py:62 - Connecting to https://localhost:8440/ca
WARNING 2019-11-06 09:46:39,683 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused
WARNING 2019-11-06 09:46:39,683 NetUtil.py:116 - Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...
INFO 2019-11-06 09:46:49,683 NetUtil.py:62 - Connecting to https://localhost:8440/ca
WARNING 2019-11-06 09:46:49,684 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused
WARNING 2019-11-06 09:46:49,685 NetUtil.py:116 - Server at https://localhost:8440 is not reachable, sleeping for 10 seconds...
And in ambari-server.log I have:
05 Nov 2019 11:28:49,329 INFO [ambari-hearbeat-monitor] TopologyManager:487 - Hearbeat for host localhost.localdomain lost thus removing it from available hosts.
05 Nov 2019 11:39:58,408 WARN [qtp-ambari-agent-106] SecurityFilter:103 - Request https://localhost:8440/ca doesn't match any pattern.
05 Nov 2019 11:39:58,409 WARN [qtp-ambari-agent-106] SecurityFilter:62 - This request is not allowed on this port: https://localhost:8440/ca
05 Nov 2019 11:40:00,783 INFO [qtp-ambari-agent-106] HeartBeatHandler:425 - agentOsType = centos7
05 Nov 2019 11:40:00,787 INFO [qtp-ambari-agent-106] HostImpl:294 - Received host registration, host=[hostname=sqlcls,fqdn=sqlcls.yacine.lab,domain=yacine.lab,architecture=x86_64,processorcount=4,physicalprocessorcount=4,osname=centos,osversion=7.7.1908,osfamily=redhat,memory=16266600,uptime_hours=79,mounts=(available=8121532,mountpoint=/dev,used=0,percent=0%,size=8121532,device=devtmpfs,type=devtmpfs)(available=8133300,mountpoint=/dev/shm,used=0,percent=0%,size=8133300,device=tmpfs,type=tmpfs)(available=8124108,mountpoint=/run,used=9192,percent=1%,size=8133300,device=tmpfs,type=tmpfs)(available=47411672,mountpoint=/,used=4991528,percent=10%,size=52403200,device=/dev/mapper/centos-root,type=xfs)(available=786336,mountpoint=/boot,used=252000,percent=25%,size=1038336,device=/dev/sda1,type=xfs)(available=147866840,mountpoint=/home,used=33004,percent=1%,size=147899844,device=/dev/mapper/centos-home,type=xfs)(available=1626660,mountpoint=/run/user/0,used=0,percent=0%,size=1626660,device=tmpfs,type=tmpfs)] , registrationTime=1572950400783, agentVersion=2.4.0.1
05 Nov 2019 11:40:00,799 INFO [qtp-ambari-agent-106] TopologyManager:408 - TopologyManager.onHostRegistered: Entering
05 Nov 2019 11:40:00,800 INFO [qtp-ambari-agent-106] TopologyManager:469 - TopologyManager: Queueing available host sqlcls.yacine.lab
05 Nov 2019 13:17:32,311 WARN [Thread-2] QueuedThreadPool:145 - 1 threads could not be stopped
05 Nov 2019 13:17:32,311 WARN [Thread-2] QueuedThreadPool:145 - 1 threads could not be stopped
05 Nov 2019 13:17:32,311 INFO [main] AmbariServer:631 - Joined the Server
Any ideas, please? I have tried a lot of solutions, but the problem still persists.
Thank you
Created on 11-06-2019 01:35 AM - edited 11-06-2019 01:35 AM
We can see the following message:
WARNING 2019-11-06 09:46:39,683 NetUtil.py:93 - Failed to connect to https://localhost:8440/ca due to [Errno 111] Connection refused
This indicates that the Ambari Agent is trying to connect to an AmbariServer running on the local machine ("localhost"). Ideally it should use the FQDN of the Ambari server instead of "localhost:8440".
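To see at a glance which server address the agent keeps failing to reach, the refused targets can be pulled out of the agent log. This is a minimal sketch that assumes the log lines have the format quoted above; the function name `refused_targets` is hypothetical, and the log path in the usage comment is the usual default, which may differ on your install.

```shell
# List each distinct URL that the agent failed to reach with
# "[Errno 111] Connection refused" in the given log file.
refused_targets() {
    sed -n 's/.*Failed to connect to \([^ ]*\) due to \[Errno 111\] Connection refused.*/\1/p' "$1" | sort -u
}

# Usage (the path is an assumption):
#   refused_targets /var/log/ambari-agent/ambari-agent.log
```

If the only target listed is a localhost URL, the agent is still configured to look for the server on its own machine.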
So can you please try this:
1. Please edit the hostname in "/etc/ambari-agent/conf/ambari-agent.ini" to point to the FQDN (fully qualified domain name) of the AmbariServer instead of localhost.
Make sure to replace "ambariserverhost.example.com" with the actual FQDN of your Ambari server.
Example:
# grep -A1 '\[server\]' /etc/ambari-agent/conf/ambari-agent.ini
[server]
hostname=ambariserverhost.example.com
2. Then restart the Ambari Agent and check the contents of /etc/hosts.
# ambari-agent restart
# cat /etc/hosts
3. Now verify that you can connect to the AmbariServer FQDN and port from the agent machine, as follows:
# telnet ambariserverhost.example.com 8440
(OR)
# nc -v ambariserverhost.example.com 8440
4. Also, please make sure that iptables/firewall is disabled on the AmbariServer host and that port 8440 is listening.
Example:
# netstat -tnlpa | grep `cat /var/run/ambari-server/ambari-server.pid`
tcp6 0 0 :::8080 :::* LISTEN 7463/java
tcp6 0 0 :::8440 :::* LISTEN 7463/java
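Steps 1 and 3 above can be combined into a small check that reads the server hostname from the agent config and probes the port. This is only a sketch: the function names `server_hostname` and `port_open` are hypothetical, it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command being available, and the 5-second limit is an arbitrary choice.

```shell
# Print the hostname configured in the [server] section of an agent ini file.
server_hostname() {
    awk -F= '
        /^\[/ { in_server = ($0 ~ /^\[server\]/) }   # track the current section
        in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "", $2)          # trim surrounding whitespace
            print $2
            exit
        }
    ' "$1"
}

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within 5 seconds.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage on an agent node:
#   host=$(server_hostname /etc/ambari-agent/conf/ambari-agent.ini)
#   if port_open "$host" 8440; then echo "$host:8440 reachable"; else echo "$host:8440 NOT reachable"; fi
```

A successful TCP connect only shows that something is listening on the port; it does not confirm that the agent's SSL handshake with the server will succeed.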
Created 11-06-2019 02:38 AM
Thank you for your answer, @jsensharma.
Here are the results of the steps that you mentioned:
1-
[root@localhost /]# grep -A1 '\[server\]' /etc/ambari-agent/conf/ambari-agent.ini
[server]
hostname=localhost
2-
[root@localhost /]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.253.46 pocserver2 pocserver2
192.168.253.47 pocserver3 pocserver3
192.168.253.45 localhost localhost
3-
[root@localhost /]# nc -v localhost 8440
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to ::1:8440.
It hangs here with no further output.
4-
[root@localhost /]# netstat -tnlpa | grep `cat /var/run/ambari-server/ambari-server.pid`
tcp6 0 0 :::8440 :::* LISTEN 32030/java
tcp6 0 0 :::8441 :::* LISTEN 32030/java
tcp6 0 0 :::8443 :::* LISTEN 32030/java
Created on 11-06-2019 02:43 AM - edited 11-06-2019 02:45 AM
Your "/etc/hosts" file should map the FQDN of the AmbariServer (not "localhost") to the Ambari server's IP address.
Also, please change "ambari-agent.ini" to point to the AmbariServer FQDN instead of "localhost":
On each Agent node, stop the Agent.
# ambari-agent stop
Using a text editor, edit /etc/ambari-agent/conf/ambari-agent.ini to point to the new host.
[server]
hostname=$NEW FULLY.QUALIFIED.DOMAIN.NAME
url_port=8440
secured_url_port=8441
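The /etc/hosts output posted earlier in the thread contains the line "192.168.253.45 localhost localhost", which binds the name localhost to a non-loopback address. A rough way to spot such entries is sketched below; the function name `misplaced_localhost` is hypothetical, and the check only looks for the exact name "localhost" on lines whose address is neither 127.x.x.x nor ::1.

```shell
# Print hosts-file lines that map "localhost" to a non-loopback address.
misplaced_localhost() {
    awk '
        /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
        $1 ~ /^127\./ || $1 == "::1" { next }    # loopback entries are expected
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break             # ignore trailing comments
                if ($i == "localhost") { print; next }
            }
        }
    ' "$1"
}

# Usage:
#   misplaced_localhost /etc/hosts
```

Any line it prints is a candidate for replacement with the server's real FQDN and short name.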
Created 11-06-2019 03:35 AM
@jsensharma I get the same output as in the steps above, now with hostname=ambari.server (I changed the hostname), but the contents of ambari-server.log and ambari-agent.log have changed. You can find them below:
ambari-agent.log:
Directory: '/etc/resource_overrides' does not exist - it won't be used for gathering system resources.
INFO 2019-11-06 12:02:52,787 Controller.py:160 - Registering with ambari.server (192.168.253.45) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "server", "physicalprocessorcount": 4, "kernelrelease": "3.10.0-1062.1.2.el7.x86_64", "uptime_days": "0", "memorytotal": 16266600, "swapfree": "7.87 GB", "memorysize": 16266600, "osfamily": "redhat", "swapsize": "7.87 GB", "processorcount": 4, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 14903648, "operatingsystem": "centos", "kernelmajversion": "3.10", "kernelversion": "3.10.0", "macaddress": "00:50:56:90:FB:2D", "operatingsystemrelease": "7.7.1908", "ipaddress": "192.168.253.45", "hostname": "ambari", "uptime_hours": "22", "fqdn": "ambari.server", "id": "root", "architecture": "x86_64", "selinux": false, "mounts": [{"available": "8121532", "used": "0", "percent": "0%", "device": "devtmpfs", "mountpoint": "/dev", "type": "devtmpfs", "size": "8121532"}, {"available": "8133300", "used": "0", "percent": "0%", "device": "tmpfs", "mountpoint": "/dev/shm", "type": "tmpfs", "size": "8133300"}, {"available": "8124172", "used": "9128", "percent": "1%", "device": "tmpfs", "mountpoint": "/run", "type": "tmpfs", "size": "8133300"}, {"available": "47410156", "used": "4993044", "percent": "10%", "device": "/dev/mapper/centos-root", "mountpoint": "/", "type": "xfs", "size": "52403200"}, {"available": "786336", "used": "252000", "percent": "25%", "device": "/dev/sda1", "mountpoint": "/boot", "type": "xfs", "size": "1038336"}, {"available": "147866840", "used": "33004", "percent": "1%", "device": "/dev/mapper/centos-home", "mountpoint": "/home", "type": "xfs", "size": "147899844"}, {"available": "1626660", "used": "0", "percent": "0%", "device": "tmpfs", "mountpoint": "/run/user/0", "type": "tmpfs", "size": "1626660"}], "hardwaremodel": "x86_64", "uptime_seconds": "81903", "interfaces": "ens192,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": "2.4.0.1", "agentEnv": {"transparentHugePage": "", "hostHealth": {"agentTimeStampAtReporting": 1573038172784, "activeJavaProcs": [], "liveServices": [{"status": "Healthy", "name": "ntpd", "desc": ""}]}, "reverseLookup": true, "alternatives": [], "umask": "18", "firewallName": "iptables", "stackFoldersAndFiles": [], "existingUsers": [], "firewallRunning": false}, "timestamp": 1573038172724, "hostname": "ambari.server", "responseId": -1, "publicHostname": "ambari.server"}')
INFO 2019-11-06 12:02:52,788 NetUtil.py:62 - Connecting to https://ambari.server:8440/connection_info
INFO 2019-11-06 12:02:52,876 security.py:100 - SSL Connect being called.. connecting to the server
INFO 2019-11-06 12:02:52,961 security.py:61 - SSL connection established. Two-way SSL authentication is turned off on the server.
INFO 2019-11-06 12:02:52,980 Controller.py:186 - Registration Successful (response id = 0)
INFO 2019-11-06 12:02:52,980 AmbariConfig.py:273 - Updating config property (agent.check.remote.mounts) with value (false)
INFO 2019-11-06 12:02:52,981 AmbariConfig.py:273 - Updating config property (agent.auto.cache.update) with value (true)
INFO 2019-11-06 12:02:52,981 AmbariConfig.py:273 - Updating config property (agent.check.mounts.timeout) with value (0)
WARNING 2019-11-06 12:02:52,981 AlertSchedulerHandler.py:104 - There are no alert definition commands in the heartbeat; unable to update definitions
INFO 2019-11-06 12:02:52,981 Controller.py:463 - Registration response from ambari.server was OK
INFO 2019-11-06 12:02:52,981 Controller.py:468 - Resetting ActionQueue...
INFO 2019-11-06 12:03:02,992 Controller.py:277 - Heartbeat with server is running...
INFO 2019-11-06 12:03:02,993 Heartbeat.py:90 - Adding host info/state to heartbeat message.
INFO 2019-11-06 12:03:03,059 logger.py:71 - call[['test', '-w', '/dev']] {'sudo': True, 'timeout': 5}
INFO 2019-11-06 12:03:03,069 logger.py:71 - call returned (0, '')
INFO 2019-11-06 12:03:03,069 logger.py:71 - call[['test', '-w', '/dev/shm']] {'sudo': True, 'timeout': 5}
ambari-server.log:
TopologyManager:462 - Host ambari.server re-registered, will not be added to the available hosts list
06 Nov 2019 12:03:25,895 INFO [Thread-22] AbstractPoolBackedDataSource:212 - Initializing c3p0 pool... com.mchange.v2.c3p0.ComboPooledDataSource [ acquireIncrement -> 3, acquireRetryAttempts -> 30, acquireRetryDelay -> 1000, autoCommitOnClose -> false, automaticTestTable -> null, breakAfterAcquireFailure -> false, checkoutTimeout -> 0, connectionCustomizerClassName -> null, connectionTesterClassName -> com.mchange.v2.c3p0.impl.DefaultConnectionTester, contextClassLoaderSource -> caller, dataSourceName -> 1hgfewda6hv1c2v1pcxadq|64b242b3, debugUnreturnedConnectionStackTraces -> false, description -> null, driverClass -> org.postgresql.Driver, extensions -> {}, factoryClassLocation -> null, forceIgnoreUnresolvedTransactions -> false, forceSynchronousCheckins -> false, forceUseNamedDriverClass -> false, identityToken -> 1hgfewda6hv1c2v1pcxadq|64b242b3, idleConnectionTestPeriod -> 50, initialPoolSize -> 3, jdbcUrl -> jdbc:postgresql://localhost/datalake, maxAdministrativeTaskTime -> 0, maxConnectionAge -> 0, maxIdleTime -> 0, maxIdleTimeExcessConnections -> 0, maxPoolSize -> 5, maxStatements -> 0, maxStatementsPerConnection -> 120, minPoolSize -> 1, numHelperThreads -> 3, preferredTestQuery -> select 0, privilegeSpawnedThreads -> false, properties -> {user=******, password=******}, propertyCycle -> 0, statementCacheNumDeferredCloseThreads -> 0, testConnectionOnCheckin -> true, testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, userOverrides -> {}, usesTraditionalReflectiveProxies -> false ]
06 Nov 2019 12:03:25,966 INFO [Thread-22] JobStoreTX:861 - Freed 0 triggers from 'acquired' / 'blocked' state.
06 Nov 2019 12:03:25,977 INFO [Thread-22] JobStoreTX:871 - Recovering 0 jobs that were in-progress at the time of the last shut-down.
06 Nov 2019 12:03:25,977 INFO [Thread-22] JobStoreTX:884 - Recovery complete.
06 Nov 2019 12:03:25,978 INFO [Thread-22] JobStoreTX:891 - Removed 0 'complete' triggers.
06 Nov 2019 12:03:25,978 INFO [Thread-22] JobStoreTX:896 - Removed 0 stale fired job entries.
06 Nov 2019 12:03:25,979 INFO [Thread-22] QuartzScheduler:575 - Scheduler ExecutionScheduler_$_NON_CLUSTERED started.
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-66] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-66] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,904 WARN [qtp-ambari-agent-67] SecurityFilter:103 - Request https://192.168.253.45:8440/ doesn't match any patter
Created 11-06-2019 03:43 AM
We can see that you are getting the following "certificate_unknown" error in the AmbariServer logs:
06 Nov 2019 12:06:35,805 WARN [qtp-ambari-agent-68] nio:720 - javax.net.ssl.SSLException: Received fatal alert: certificate_unknown
06 Nov 2019 12:06:35,904 WARN [qtp-ambari-agent-67] SecurityFilter:103 - Request https://192.168.253.45:8440/ doesn't match any patter
Hence you should delete the old certificates from the agent machine and then try again, since the old agent certificates might contain incorrect names.
# rm /var/lib/ambari-agent/keys/*
# ambari-agent restart
While restarting, the agent should fetch the correct certificates from the Ambari server.
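If the keys directory happens to be empty, `rm /var/lib/ambari-agent/keys/*` fails with a "No such file or directory" error because the glob matches nothing. A slightly more defensive variant of the step above is sketched here; the function name `clear_agent_keys` is hypothetical, and the default path is the one used in the command above.

```shell
# Remove regular files directly inside the agent keys directory and report
# how many were removed. Does nothing harmful if the directory is empty.
clear_agent_keys() {
    local dir=${1:-/var/lib/ambari-agent/keys}
    local count
    count=$(( $(find "$dir" -maxdepth 1 -type f | wc -l) ))
    find "$dir" -maxdepth 1 -type f -delete
    echo "removed $count file(s) from $dir"
}

# Usage on the agent node, followed by the restart from the step above:
#   clear_agent_keys
#   ambari-agent restart
```

A count of 0 simply means there were no old certificates to clear, so the restart alone is what makes the agent request new ones.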
Created 11-06-2019 03:48 AM
The /var/lib/ambari-agent/keys directory is empty; there are no files in it. What is the problem?
Created 11-06-2019 02:25 PM
There is no issue with ambari agent.