How can I resolve this problem: "is not sending heartbeat"?

Explorer
 
1 ACCEPTED SOLUTION


Hi @yadir Aguilar,

Your last comment was informative.

Can you please try adding the following option to the [security] section of "/etc/ambari-agent/conf/ambari-agent.ini" and then restart ambari-agent:

[security] 
force_https_protocol=PROTOCOL_TLSv1_2
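
If you would rather script the change than edit the file by hand, here is a minimal sketch using Python's standard configparser. The function name set_force_https_protocol is made up for illustration, and it assumes the file parses as a regular INI file. Note that configparser drops comments when it writes the file back, so keep a backup of the original.

```python
import configparser
import io


def set_force_https_protocol(ini_text: str, protocol: str = "PROTOCOL_TLSv1_2") -> str:
    """Return ini_text with force_https_protocol set in the [security] section.

    Hypothetical helper: it only rewrites text and never touches the real file.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("security"):
        parser.add_section("security")
    parser.set("security", "force_https_protocol", protocol)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Sample input for illustration only, not a complete ambari-agent.ini.
sample = "[server]\nhostname=maestro.hdp.com\n\n[security]\nkeysdir=/var/lib/ambari-agent/keys\n"
print(set_force_https_protocol(sample))
```

After writing the result back to the file, restart ambari-agent so the setting takes effect.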

I think you are hitting this bug: https://issues.apache.org/jira/browse/AMBARI-17666

reference : https://community.hortonworks.com/content/supportkb/208283/error-2018-07-16-005228887-netutilpy96-eo...

Hope this helps. Please mark the answer as accepted if it did 🙂


11 REPLIES


@yadir Aguilar

Please check the ambari-agent logs to find out why the agent is not sending heartbeats. The logs are at /var/log/ambari-agent/ambari-agent.log

Explorer

This is the output from ambari-agent.log:


INFO 2018-08-20 16:25:11,790 main.py:147 - loglevel=logging.INFO
INFO 2018-08-20 16:25:11,790 main.py:147 - loglevel=logging.INFO
INFO 2018-08-20 16:25:11,790 main.py:147 - loglevel=logging.INFO
INFO 2018-08-20 16:25:11,792 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-08-20 16:25:11,793 DataCleaner.py:120 - Data cleanup started
INFO 2018-08-20 16:25:11,794 DataCleaner.py:122 - Data cleanup finished
INFO 2018-08-20 16:25:11,794 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'esclavo.hdp.com' using socket.getfqdn().
INFO 2018-08-20 16:25:11,799 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-08-20 16:25:11,802 main.py:439 - Connecting to Ambari server at https://maestro.hdp.com:8440 (10.137.44.53)
INFO 2018-08-20 16:25:11,802 NetUtil.py:70 - Connecting to https://maestro.hdp.com:8440/ca
INFO 2018-08-20 16:25:11,874 main.py:449 - Connected to Ambari server maestro.hdp.com
INFO 2018-08-20 16:25:11,875 threadpool.py:58 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2018-08-20 16:25:11,876 AlertSchedulerHandler.py:280 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2018-08-20 16:25:11,876 AlertSchedulerHandler.py:175 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x7fa4efe427d0>; currently running: False
INFO 2018-08-20 16:25:13,926 hostname.py:106 - Read public hostname 'esclavo.hdp.com' using socket.getfqdn()
INFO 2018-08-20 16:25:13,930 Hardware.py:68 - Initializing host system information.
INFO 2018-08-20 16:25:14,015 Hardware.py:188 - Some mount points were ignored: /dev/shm, /run, /sys/fs/cgroup, /run/user/0
INFO 2018-08-20 16:25:14,074 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'esclavo.hdp.com' using socket.getfqdn().
INFO 2018-08-20 16:25:14,079 Facter.py:202 - Directory: '/etc/resource_overrides' does not exist - won't be used for gathering system resources.


I'm not sure what the problem is.


Hi @yadir Aguilar,

From the logs:

Connecting to https://maestro.hdp.com:8440/ca
INFO 2018-08-20 16:25:11,874 main.py:449 - Connected to Ambari server maestro.hdp.com

It looks like your ambari-agent is trying to connect to maestro.hdp.com, and the connection succeeds.

Can you try restarting ambari-agent once and see if that helps:

ambari-agent restart

I don't see any specific error in the ambari-agent log posted here. Look for ERROR entries in the log; what you have attached contains only informational and warning messages. Also, please attach logs in code format, like this:

i am code format

Hope this helps you.
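
To pick out ERROR entries in a long ambari-agent.log, a small filter can help. This is a rough sketch, not an Ambari tool: the names filter_by_level and LogEntry are made up, and the line layout (LEVEL, timestamp, source file and line, " - ", message) is assumed from the entries quoted in this thread.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed layout, inferred from the entries quoted in this thread:
#   LEVEL YYYY-MM-DD HH:MM:SS,mmm source.py:LINE - message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+) "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<source>\S+) - (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    level: str
    timestamp: str
    source: str
    message: str


def filter_by_level(lines: Iterable[str], levels=("ERROR",)) -> List[LogEntry]:
    """Return the parsed entries whose level is in `levels`; other lines are skipped."""
    wanted = set(levels)
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in wanted:
            entries.append(LogEntry(**match.groupdict()))
    return entries


# Two entries taken from the log posted earlier in this thread.
sample = [
    "INFO 2018-08-20 16:25:11,874 main.py:449 - Connected to Ambari server maestro.hdp.com",
    "WARNING 2018-08-20 16:25:11,876 AlertSchedulerHandler.py:280 - [AlertScheduler] "
    "/var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid.",
]
for entry in filter_by_level(sample, levels=("WARNING", "ERROR")):
    print(entry.timestamp, entry.source, entry.message)
```

To run it on the real log, pass an open file instead of the sample list, for example filter_by_level(open("/var/log/ambari-agent/ambari-agent.log")).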

Explorer

I found the error:

WARNING 2018-08-20 16:32:40,475 base_alert.py:138 - [Alert][smartsense_gateway_status] Unable to execute alert. [Alert][smartsense_gateway_status] Unable to extract JSON from JMX response
ERROR 2018-08-20 16:32:40,481 script_alert.py:123 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://esclavo.hdp.com:8042/ws/v1/node/info (Traceback (most recent call last):\n File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n url_response = urllib2.urlopen(query, timeout=connection_timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen\n return opener.open(url, data, timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 431, in open\n response = self._open(req, data)\n File "/usr/lib64/python2.7/urllib2.py", line 449, in _open\n \'_open\', req)\n File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain\n result = func(*args)\n File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open\n return self.do_open(httplib.HTTPConnection, req)\n File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open\n raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n)']
ERROR 2018-08-20 16:32:40,481 script_alert.py:123 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://esclavo.hdp.com:8042/ws/v1/node/info (Traceback (most recent call last):\n File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n url_response = urllib2.urlopen(query, timeout=connection_timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen\n return opener.open(url, data, timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 431, in open\n response = self._open(req, data)\n File "/usr/lib64/python2.7/urllib2.py", line 449, in _open\n \'_open\', req)\n File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain\n result = func(*args)\n File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open\n return self.do_open(httplib.HTTPConnection, req)\n File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open\n raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n)']

But I don't know how to resolve it.
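
The "[Errno 111] Connection refused" in the traceback above means nothing accepted the TCP connection on esclavo.hdp.com port 8042 when the alert ran. A quick way to re-check that from the agent host is a plain socket connect. This is only a sketch; port_is_open is a made-up helper name, and it tests reachability of the port, not the health of the service behind it.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

For example, port_is_open("esclavo.hdp.com", 8042) should return False for as long as the connection is refused, and True once something is listening on that port again.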


Hi @yadir Aguilar ,

It looks like a SmartSense component is not responding.

Can you perform the following steps and see if they help:

1) Execute:

ambari-agent restart

2) Check the output of this command:

/usr/sbin/hst agent-status

3) If the command in step 2 hangs, try restarting hst-server from the Ambari UI and see if the heartbeat comes back.

Hope this troubleshooting helps you

Explorer

When I run /usr/sbin/hst agent-status I get "registered". What is the command to restart the hst server?


Hi @yadir Aguilar,

Looks like your hst-agent is ok.

What do you see when you run this command?

[root@asnaikh ~]# cat /var/log/ambari-agent/ambari-agent.log |grep -i heartbeat
INFO 2018-08-21 14:36:13,697 Controller.py:311 - Building heartbeat message
INFO 2018-08-21 14:36:13,699 Heartbeat.py:87 - Adding host info/state to heartbeat message.
INFO 2018-08-21 14:36:14,001 Controller.py:320 - Sending Heartbeat (id = 2841)
INFO 2018-08-21 14:36:14,005 Controller.py:333 - Heartbeat response received (id = 2842)
INFO 2018-08-21 14:36:14,005 Controller.py:342 - Heartbeat interval is 1 seconds
INFO 2018-08-21 14:36:14,005 Controller.py:380 - Updating configurations from heartbeat
INFO 2018-08-21 14:36:14,006 Controller.py:475 - Waiting 0.9 for next heartbeat

This is just to figure out whether it is an ambari-agent issue or an ambari-server issue.

Can you try restarting ambari-server:

ambari-server restart

and see if it helps.

Also grep the ambari-server log for heartbeat entries from the problem node:

[root@anaikhdf1 ~]# cat /var/log/ambari-server/ambari-server.log |grep -i heartbeat|grep -i <problem_nodeFQDN>
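
If the grep output is long, the heartbeat lines can be summarized instead of read by eye. The sketch below is not part of Ambari; summarize_heartbeats and HeartbeatSummary are made-up names, and the two message patterns are copied from the agent log lines shown above. A healthy agent should show both counts growing and the ids advancing.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Message texts copied from the agent log lines quoted above.
SENT = re.compile(r"Sending Heartbeat \(id = (\d+)\)")
RECEIVED = re.compile(r"Heartbeat response received \(id = (\d+)\)")


class HeartbeatSummary(NamedTuple):
    sent: int
    received: int
    last_sent_id: Optional[int]
    last_response_id: Optional[int]


def summarize_heartbeats(lines: Iterable[str]) -> HeartbeatSummary:
    """Count heartbeat sends and responses and remember the last id of each."""
    sent = received = 0
    last_sent_id = last_response_id = None
    for line in lines:
        match = SENT.search(line)
        if match:
            sent += 1
            last_sent_id = int(match.group(1))
            continue
        match = RECEIVED.search(line)
        if match:
            received += 1
            last_response_id = int(match.group(1))
    return HeartbeatSummary(sent, received, last_sent_id, last_response_id)


# Two of the lines from the healthy agent log shown above.
sample = [
    "INFO 2018-08-21 14:36:14,001 Controller.py:320 - Sending Heartbeat (id = 2841)",
    "INFO 2018-08-21 14:36:14,005 Controller.py:333 - Heartbeat response received (id = 2842)",
]
print(summarize_heartbeats(sample))
```

A summary with zero sends means the agent never got as far as heartbeating, which points at agent startup or registration rather than at the server.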

Explorer

When I run cat /var/log/ambari-agent/ambari-agent.log | grep -i heartbeat, I get:

INFO 2018-08-21 08:58:29,074 HeartbeatHandlers.py:84 - Ambari-agent received 15 signal, stopping...
INFO 2018-08-21 08:58:29,890 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-08-21 08:58:29,890 Controller.py:503 - Finished heartbeating and registering cycle
INFO 2018-08-21 09:44:03,002 HeartbeatHandlers.py:84 - Ambari-agent received 15 signal, stopping...
INFO 2018-08-21 09:44:12,704 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-08-21 09:53:12,898 HeartbeatHandlers.py:84 - Ambari-agent received 15 signal, stopping...
INFO 2018-08-21 09:53:15,992 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-08-21 10:42:50,625 HeartbeatHandlers.py:84 - Ambari-agent received 15 signal, stopping...
INFO 2018-08-21 10:42:52,086 HeartbeatHandlers.py:116 - Stop event received
INFO 2018-08-21 10:53:41,375 HeartbeatHandlers.py:84 - Ambari-agent received 15 signal, stopping...
INFO 2018-08-21 10:53:44,646 HeartbeatHandlers.py:116 - Stop event received

Explorer

Hi @Akhil S Naik

I just found this:

[root@maestro ~]# systemctl status ambari-server

● ambari-server.service - LSB: ambari-server daemon
   Loaded: loaded (/etc/rc.d/init.d/ambari-server; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2018-08-21 10:48:01 EDT; 1min 22s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 24652 ExecStart=/etc/rc.d/init.d/ambari-server start (code=exited, status=1/FAILURE)

Aug 21 10:48:01 maestro.hdp.com systemd[1]: Starting LSB: ambari-server daem....
Aug 21 10:48:01 maestro.hdp.com ambari-server[24652]: Using python /usr/bin/...
Aug 21 10:48:01 maestro.hdp.com ambari-server[24652]: Starting ambari-server
Aug 21 10:48:01 maestro.hdp.com ambari-server[24652]: ERROR: Exiting with exi...
Aug 21 10:48:01 maestro.hdp.com ambari-server[24652]: REASON: Ambari Server i...
Aug 21 10:48:01 maestro.hdp.com systemd[1]: ambari-server.service: control p...1
Aug 21 10:48:01 maestro.hdp.com systemd[1]: Failed to start LSB: ambari-serv....
Aug 21 10:48:01 maestro.hdp.com systemd[1]: Unit ambari-server.service enter....
Aug 21 10:48:01 maestro.hdp.com systemd[1]: ambari-server.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

The same happens with ambari-agent.