Hello my friends,
I was wondering if you could help me with something. We're upgrading our environment to HDP 3.0.1, and for that we had to upgrade Ambari to 220.127.116.11. After the Ambari upgrade, the HDP 3.0.1 installation is not working due to critical errors with the hosts.
Aside from the server host, the other 3 hosts in our test environment are not sending heartbeats. I've tried many of the solutions already posted here in the community, but nothing seems to work. I've already restarted the server, checked the hosts' IPs, and upgraded the other services (Metrics and SmartSense) to the latest version. Before the upgrade, everything was working fine, so I'm not sure if this is a known issue or if I'm just missing something. Any ideas?
Thanks in advance.
Can you please share the following info:
1. The output of the following command from the non-working agent host.
# rpm -qa | grep ambari
2. Please try restarting the agent once again, and then attach the following log file from the non-working agent host to this HCC thread
# ls -l /var/log/ambari-agent/ambari-agent.log
It would also be great if you could share "/var/log/ambari-server/ambari-server.log" so that we can match the heartbeat request/response times with the agent log.
3. Please share the output of the following command from the Agent host.
# telnet $AMBARI_HOST 8440
(OR)
# nc -v $AMBARI_HOST 8440
4. Please verify that the FQDN of the Ambari agent host is correct and matches the hostname that Ambari is showing on its Hosts page.
Run the following command on the Ambari Agent host to find the FQDN
# hostname -f
5. Check the Ambari UI to verify that the above hostname matches. Or did the FQDN of the host somehow change recently?
The following API call can also give the host details with their registered hostnames
# curl -u admin:admin -H "X-Requested-By: ambari" -X GET http://$AMBARI_HOST:8080/api/v1/hosts
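If it is easier, the checks from steps 3 to 5 can be scripted on the agent host. This is only a rough sketch, not an official Ambari tool: the function names (check_port, extract_host_names, is_registered) are made up for illustration, and it assumes the default admin:admin credentials, the ports 8440 and 8080 used above, and that the API response lists each host under a "host_name" field.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Ambari.

# Return 0 if a TCP connection to host:port opens within 5 seconds.
# Uses bash's /dev/tcp feature, so bash must be installed.
check_port() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Read the JSON from /api/v1/hosts on stdin and print one host_name per line.
# Assumption: each host appears as "host_name" : "<name>" in the response.
extract_host_names() {
  grep -o '"host_name"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Return 0 if the name given as $1 exactly matches one of the lines on stdin.
is_registered() {
  grep -qxF -- "$1"
}

# The live checks run only when AMBARI_HOST is set in the environment.
if [ -n "${AMBARI_HOST:-}" ]; then
  fqdn="$(hostname -f)"

  if check_port "$AMBARI_HOST" 8440; then
    echo "OK: $AMBARI_HOST:8440 is reachable"
  else
    echo "FAIL: cannot connect to $AMBARI_HOST:8440"
  fi

  if curl -s -u admin:admin -H "X-Requested-By: ambari" \
       "http://$AMBARI_HOST:8080/api/v1/hosts" \
     | extract_host_names | is_registered "$fqdn"; then
    echo "OK: $fqdn is registered in Ambari"
  else
    echo "FAIL: $fqdn is not among the hostnames Ambari has registered"
  fi
fi
```

Save it to a file of your choice and run it on each non-working agent host with AMBARI_HOST set to the server hostname. A FAIL on the second check would point to a hostname mismatch rather than a network problem.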
Thanks for your reply, @Jay Kumar SenSharma
> rpm -qa | grep ambari
ambari-metrics-hadoop-sink-18.104.22.168-169.x86_64
ambari-agent-22.214.171.124-169.x86_64
ambari-metrics-grafana-126.96.36.199-169.x86_64
ambari-metrics-monitor-188.8.131.52-169.x86_64
ambari-server-184.108.40.206-169.x86_64
ambari-metrics-collector-220.127.116.11-169.x86_64
2. Attached (agent and server logs)
> telnet hadooptest1 8440
Trying 10.165.0.11...
Connected to hadooptest1.
Escape character is '^]'.
I double-checked here, and the agent has the correct server IP.
I uploaded the log files here, if you need to take a look:
Thanks again for your help!