Unable to join Ambari server to cluster

Re: Unable to join Ambari server to cluster

Super Mentor

@Tom Burke

It looks like the Ambari agent is already installed on your Ambari server host. Try starting it, then check the log to see whether it starts without errors.

# ambari-agent start;  tail -f /var/log/ambari-agent/ambari-agent.log

Re: Unable to join Ambari server to cluster

New Contributor

Hello Jay, here is the log file from compute1:

INFO 2019-02-05 16:14:41,688 security.py:135 - Event to server at /reports/host_status (correlation_id=74643): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412081679, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752436', 'used': '28126224', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:14:41,690 __init__.py:82 - Event from server at /user/ (correlation_id=74643): {u'status': u'OK'}
INFO 2019-02-05 16:14:50,711 security.py:135 - Event to server at /heartbeat (correlation_id=74644): {'id': 59869}
INFO 2019-02-05 16:14:50,713 __init__.py:82 - Event from server at /user/ (correlation_id=74644): {u'status': u'OK', u'id': 59870}
INFO 2019-02-05 16:15:00,715 security.py:135 - Event to server at /heartbeat (correlation_id=74645): {'id': 59870}
INFO 2019-02-05 16:15:00,718 __init__.py:82 - Event from server at /user/ (correlation_id=74645): {u'status': u'OK', u'id': 59871}
INFO 2019-02-05 16:15:10,719 security.py:135 - Event to server at /heartbeat (correlation_id=74646): {'id': 59871}
INFO 2019-02-05 16:15:10,724 __init__.py:82 - Event from server at /user/ (correlation_id=74646): {u'status': u'OK', u'id': 59872}
INFO 2019-02-05 16:15:20,729 security.py:135 - Event to server at /heartbeat (correlation_id=74647): {'id': 59872}
INFO 2019-02-05 16:15:20,731 __init__.py:82 - Event from server at /user/ (correlation_id=74647): {u'status': u'OK', u'id': 59873}
INFO 2019-02-05 16:15:30,733 security.py:135 - Event to server at /heartbeat (correlation_id=74648): {'id': 59873}
INFO 2019-02-05 16:15:30,734 __init__.py:82 - Event from server at /user/ (correlation_id=74648): {u'status': u'OK', u'id': 59874}
INFO 2019-02-05 16:15:40,735 security.py:135 - Event to server at /heartbeat (correlation_id=74649): {'id': 59874}
INFO 2019-02-05 16:15:40,736 __init__.py:82 - Event from server at /user/ (correlation_id=74649): {u'status': u'OK', u'id': 59875}
INFO 2019-02-05 16:15:41,938 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:15:41,938 security.py:135 - Event to server at /reports/host_status (correlation_id=74650): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412141929, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752424', 'used': '28126236', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:15:41,940 __init__.py:82 - Event from server at /user/ (correlation_id=74650): {u'status': u'OK'}
INFO 2019-02-05 16:15:50,737 security.py:135 - Event to server at /heartbeat (correlation_id=74651): {'id': 59875}
INFO 2019-02-05 16:15:50,738 __init__.py:82 - Event from server at /user/ (correlation_id=74651): {u'status': u'OK', u'id': 59876}
INFO 2019-02-05 16:16:00,739 security.py:135 - Event to server at /heartbeat (correlation_id=74652): {'id': 59876}
INFO 2019-02-05 16:16:00,740 __init__.py:82 - Event from server at /user/ (correlation_id=74652): {u'status': u'OK', u'id': 59877}
INFO 2019-02-05 16:16:08,628 security.py:135 - Event to server at /reports/alerts_status (correlation_id=74653): [{'name': u'datanode_storage', 'timestamp': 1549412167525L, 'clusterId': '2', 'definitionId': 19, 'state': 'OK', 'text': '...'}, {'name': u'datanode_heap_usage', 'timestamp': 1549412167519L, 'clusterId': '2', 'definitionId': 10, 'state': 'OK', 'text': '...'}]
INFO 2019-02-05 16:16:08,630 __init__.py:82 - Event from server at /user/ (correlation_id=74653): {u'status': u'OK'}
INFO 2019-02-05 16:16:10,743 security.py:135 - Event to server at /heartbeat (correlation_id=74654): {'id': 59877}
INFO 2019-02-05 16:16:10,744 __init__.py:82 - Event from server at /user/ (correlation_id=74654): {u'status': u'OK', u'id': 59878}
INFO 2019-02-05 16:16:20,745 security.py:135 - Event to server at /heartbeat (correlation_id=74655): {'id': 59878}
INFO 2019-02-05 16:16:20,746 __init__.py:82 - Event from server at /user/ (correlation_id=74655): {u'status': u'OK', u'id': 59879}
INFO 2019-02-05 16:16:30,747 security.py:135 - Event to server at /heartbeat (correlation_id=74656): {'id': 59879}
INFO 2019-02-05 16:16:30,748 __init__.py:82 - Event from server at /user/ (correlation_id=74656): {u'status': u'OK', u'id': 59880}
INFO 2019-02-05 16:16:40,751 security.py:135 - Event to server at /heartbeat (correlation_id=74657): {'id': 59880}
INFO 2019-02-05 16:16:40,752 __init__.py:82 - Event from server at /user/ (correlation_id=74657): {u'status': u'OK', u'id': 59881}
INFO 2019-02-05 16:16:42,184 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:16:42,184 security.py:135 - Event to server at /reports/host_status (correlation_id=74658): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412202175, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752404', 'used': '28126256', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:16:42,186 __init__.py:82 - Event from server at /user/ (correlation_id=74658): {u'status': u'OK'}
INFO 2019-02-05 16:16:50,753 security.py:135 - Event to server at /heartbeat (correlation_id=74659): {'id': 59881}
INFO 2019-02-05 16:16:50,755 __init__.py:82 - Event from server at /user/ (correlation_id=74659): {u'status': u'OK', u'id': 59882}
INFO 2019-02-05 16:17:00,757 security.py:135 - Event to server at /heartbeat (correlation_id=74660): {'id': 59882}
INFO 2019-02-05 16:17:00,758 __init__.py:82 - Event from server at /user/ (correlation_id=74660): {u'status': u'OK', u'id': 59883}
INFO 2019-02-05 16:17:10,759 security.py:135 - Event to server at /heartbeat (correlation_id=74661): {'id': 59883}
INFO 2019-02-05 16:17:10,761 __init__.py:82 - Event from server at /user/ (correlation_id=74661): {u'status': u'OK', u'id': 59884}
INFO 2019-02-05 16:17:20,763 security.py:135 - Event to server at /heartbeat (correlation_id=74662): {'id': 59884}
INFO 2019-02-05 16:17:20,765 __init__.py:82 - Event from server at /user/ (correlation_id=74662): {u'status': u'OK', u'id': 59885}
INFO 2019-02-05 16:17:30,767 security.py:135 - Event to server at /heartbeat (correlation_id=74663): {'id': 59885}
INFO 2019-02-05 16:17:30,769 __init__.py:82 - Event from server at /user/ (correlation_id=74663): {u'status': u'OK', u'id': 59886}
INFO 2019-02-05 16:17:40,769 security.py:135 - Event to server at /heartbeat (correlation_id=74664): {'id': 59886}
INFO 2019-02-05 16:17:40,771 __init__.py:82 - Event from server at /user/ (correlation_id=74664): {u'status': u'OK', u'id': 59887}
INFO 2019-02-05 16:17:42,427 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:17:42,427 security.py:135 - Event to server at /reports/host_status (correlation_id=74665): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412262418, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752380', 'used': '28126280', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:17:42,429 __init__.py:82 - Event from server at /user/ (correlation_id=74665): {u'status': u'OK'}
INFO 2019-02-05 16:17:50,774 security.py:135 - Event to server at /heartbeat (correlation_id=74666): {'id': 59887}
INFO 2019-02-05 16:17:50,775 __init__.py:82 - Event from server at /user/ (correlation_id=74666): {u'status': u'OK', u'id': 59888}
INFO 2019-02-05 16:18:00,778 security.py:135 - Event to server at /heartbeat (correlation_id=74667): {'id': 59888}
INFO 2019-02-05 16:18:00,781 __init__.py:82 - Event from server at /user/ (correlation_id=74667): {u'status': u'OK', u'id': 59889}
INFO 2019-02-05 16:18:08,633 security.py:135 - Event to server at /reports/alerts_status (correlation_id=74668): [{'name': u'datanode_heap_usage', 'timestamp': 1549412287525L, 'clusterId': '2', 'definitionId': 10, 'state': 'OK', 'text': '...'}, {'name': u'datanode_storage', 'timestamp': 1549412287526L, 'clusterId': '2', 'definitionId': 19, 'state': 'OK', 'text': '...'}]
INFO 2019-02-05 16:18:08,635 __init__.py:82 - Event from server at /user/ (correlation_id=74668): {u'status': u'OK'}
INFO 2019-02-05 16:18:10,782 security.py:135 - Event to server at /heartbeat (correlation_id=74669): {'id': 59889}
INFO 2019-02-05 16:18:10,783 __init__.py:82 - Event from server at /user/ (correlation_id=74669): {u'status': u'OK', u'id': 59890}
INFO 2019-02-05 16:18:20,784 security.py:135 - Event to server at /heartbeat (correlation_id=74670): {'id': 59890}
INFO 2019-02-05 16:18:20,785 __init__.py:82 - Event from server at /user/ (correlation_id=74670): {u'status': u'OK', u'id': 59891}
INFO 2019-02-05 16:18:30,788 security.py:135 - Event to server at /heartbeat (correlation_id=74671): {'id': 59891}
INFO 2019-02-05 16:18:30,790 __init__.py:82 - Event from server at /user/ (correlation_id=74671): {u'status': u'OK', u'id': 59892}
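
A log this long is easier to judge with a quick scan than by eye. The sketch below is only an illustration, not part of Ambari: it assumes every line has the `LEVEL date time,ms` prefix seen above, counts lines per level, and checks that the heartbeat ids the agent sends increase by one each time. The names `summarize` and `heartbeats_are_consecutive` are made up for this example.

```python
import re
from collections import Counter

# Prefix of an ambari-agent log line as it appears in this thread,
# e.g. "INFO 2019-02-05 16:14:50,711 security.py:135 - ..."
LINE_RE = re.compile(r"^(?P<level>[A-Z]+) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Heartbeat the agent sends, e.g. "... /heartbeat (correlation_id=74644): {'id': 59869}"
HEARTBEAT_RE = re.compile(
    r"Event to server at /heartbeat \(correlation_id=\d+\): \{'id': (\d+)\}"
)

# Three lines copied from the log posted above.
SAMPLE = [
    "INFO 2019-02-05 16:14:50,711 security.py:135 - Event to server at /heartbeat (correlation_id=74644): {'id': 59869}",
    "INFO 2019-02-05 16:14:50,713 __init__.py:82 - Event from server at /user/ (correlation_id=74644): {u'status': u'OK', u'id': 59870}",
    "INFO 2019-02-05 16:15:00,715 security.py:135 - Event to server at /heartbeat (correlation_id=74645): {'id': 59870}",
]


def summarize(lines):
    """Return (Counter of log levels, list of heartbeat ids sent, in order)."""
    levels = Counter()
    heartbeat_ids = []
    for line in lines:
        prefix = LINE_RE.match(line)
        if not prefix:
            continue  # skip lines that do not look like log records
        levels[prefix.group("level")] += 1
        heartbeat = HEARTBEAT_RE.search(line)
        if heartbeat:
            heartbeat_ids.append(int(heartbeat.group(1)))
    return levels, heartbeat_ids


def heartbeats_are_consecutive(ids):
    """True if each heartbeat id is exactly one more than the previous one."""
    return all(later == earlier + 1 for earlier, later in zip(ids, ids[1:]))


levels, ids = summarize(SAMPLE)
print(dict(levels), ids, heartbeats_are_consecutive(ids))
```

To run it against a real log, pass the open file instead of `SAMPLE`, for example `summarize(open("/var/log/ambari-agent/ambari-agent.log"))`. Only INFO lines and an unbroken id sequence would match what the log above shows.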

Re: Unable to join Ambari server to cluster

Super Mentor

@Tom Burke

I think everything is fine: I no longer see any errors in the UI operational logs or in the ambari-agent logs.

So it looks good to me.

Do you still see any issue?

Re: Unable to join Ambari server to cluster

New Contributor

Actually, I cannot find the Ambari agent on the Ambari server host. The log I posted is from another member server.

To back up a second: when I try to enable Kerberos, the first error I get is the hostname failure, and it says this happens on the Ambari server. That is what led me to conclude that I need to add the Ambari server to the cluster, which is why I opened that other question. Maybe this picture will help: horton-err-obfu-1.pdf

Re: Unable to join Ambari server to cluster

New Contributor

I sort of expected to see the Ambari server listed in Hosts (see no-ambari-in-hosts.png).

Incorrect assumption? Anyhow, I suppose it does not matter, since you said it is not needed to enable Kerberos. We can put this to bed if you like. Thanks!

Re: Unable to join Ambari server to cluster

Super Mentor

@Tom Burke

It looks like the FQDN is not set correctly on your failing node.

Please run the following commands to check whether the FQDN is set up correctly (that is, whether the hostname and the FQDN differ):

# python -c "import socket; print(socket.getfqdn())"
(OR)
# hostname -f
# hostname

If you find a difference in the FQDN then please set the FQDN of your host correctly.
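
The same comparison can be scripted. This is a minimal sketch, not an Ambari tool: `looks_like_fqdn` and `compare_names` are hypothetical helpers, and the "has a domain part" test is only a heuristic. It compares what `hostname` and `hostname -f` would return, using `socket.gethostname()` and `socket.getfqdn()`.

```python
import socket


def looks_like_fqdn(name):
    """Heuristic: a fully qualified name has at least two non-empty labels."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def compare_names(short_name, fqdn):
    """Return a list of problems found; an empty list means nothing obvious."""
    problems = []
    if not looks_like_fqdn(fqdn):
        problems.append("FQDN %r has no domain part" % fqdn)
    # The first label of both names should be the same host label.
    if fqdn.split(".")[0].lower() != short_name.split(".")[0].lower():
        problems.append(
            "hostname %r and FQDN %r disagree on the host label" % (short_name, fqdn)
        )
    return problems


if __name__ == "__main__":
    # Check the machine this script runs on.
    print(compare_names(socket.gethostname(), socket.getfqdn()))
```

An empty list only means the two names are consistent with each other. It does not prove that DNS or `/etc/hosts` resolves the name the way Ambari or Kerberos expects.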

You can find the details here about hostname and public_hostname: https://community.hortonworks.com/content/kbentry/42872/why-ambari-host-might-have-different-public-...

Re: Unable to join Ambari server to cluster

Super Mentor

@Tom Burke Regarding the SSL-related issue:

You need to configure a truststore in Ambari Server and import the LDAP/AD certificate into that truststore to fix the following message:

Failed to connect to KDC - Failed to communicate with the Active Directory at ldaps://xxx.yyy.com:636: simple bind failed: xxx.yyy.com:636

Please see: https://community.hortonworks.com/content/supportkb/148572/failed-to-connect-to-kdc-make-sure-the-se...

Re: Unable to join Ambari server to cluster

New Contributor

Hi Jay, sorry, but all three hostname commands return the same value:

ambari.example.com

Re: Unable to join Ambari server to cluster

New Contributor

Yes, the import into the truststore went fine. It just does not work.

Re: Unable to join Ambari server to cluster

Super Mentor

@Tom Burke

You can also open the following URL in the browser where you are logged in to the Ambari UI:

http://ambari.example.com:8443/api/v1/clusters/cluster-name/hosts?fields=Hosts/ip,Hosts/host_name

(OR) You can use the following curl call to write the JSON output to a file such as "/tmp/hosts.json". Please make sure to use the correct cluster name in the URL.

# curl -k -H "X-Requested-By: ambari" -u admin:admin "http://ambari.example.com:8443/api/v1/clusters/cluster-name/hosts?fields=Hosts/ip,Hosts/host_name" -o "/tmp/hosts.json"
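
Once the JSON is saved, the host names and IPs Ambari has registered can be listed with a few lines of Python. This is a sketch under an assumption: the response is taken to have a top-level "items" list whose entries each hold a "Hosts" object with the requested fields. The helper name `host_ip_pairs` and the sample data are invented for illustration and are not output from this cluster.

```python
import json


def host_ip_pairs(response):
    """Extract (host_name, ip) pairs from a parsed hosts response.

    Assumes a top-level "items" list whose entries each carry a "Hosts"
    object. Fields that are missing come back as None.
    """
    pairs = []
    for item in response.get("items", []):
        hosts = item.get("Hosts", {})
        pairs.append((hosts.get("host_name"), hosts.get("ip")))
    return pairs


# Hypothetical sample shaped like the assumed response; not real output.
SAMPLE = json.loads("""
{
  "items": [
    {"Hosts": {"host_name": "compute1.example.com", "ip": "192.0.2.11"}},
    {"Hosts": {"host_name": "compute2.example.com", "ip": "192.0.2.12"}}
  ]
}
""")

for name, ip in host_ip_pairs(SAMPLE):
    print(name, ip)
```

For the real file, replace `SAMPLE` with `json.load(open("/tmp/hosts.json"))`. If the name Ambari reports for a host is not the one the Kerberos wizard complains about, that mismatch is the thing to chase.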