Unable to join Ambari server to cluster

Explorer

Step 1: Get the RSA private key from the Ambari host.

	# cat /root/.ssh/id_rsa

Step 2: In the Ambari web UI, choose Hosts, then Action > Add Host. Enter ambari.example.com in the Target Hosts field.

Paste the key from step 1 into the field that asks for the SSH private key.

Step 3: Click the Register and Confirm button.

It returns Failed:

==========================
Creating target directory...
==========================
Command start time 2019-02-05 14:23:08
root@ambari.example.com: Permission denied (publickey,password).

SSH command execution finished
host=ambari.example.com, exitcode=255

Command end time 2019-02-05 14:23:08

ERROR: Bootstrap of host ambari.example.com fails because previous action finished with non-zero exit code (255)

ERROR MESSAGE: root@ambari.example.com: Permission denied (publickey,password).

STDOUT: 
root@ambari.example.com: Permission denied (publickey,password).
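
For anyone reading later: "Permission denied (publickey,password)" generally means sshd on the target rejected the offered key. One common cause is that the public half of the pasted private key is not listed in root's authorized_keys on the target host. The sketch below is only an illustration of that check in Python. The helper names are hypothetical, the file locations mentioned in the comments (/root/.ssh/id_rsa.pub and /root/.ssh/authorized_keys) are assumptions, and the option parsing is deliberately simplified.

```python
def key_blob(public_key_line):
    """Return (key_type, base64_blob) from an OpenSSH public key line, or None.

    Simplified: options containing quoted spaces are not handled.
    """
    parts = public_key_line.strip().split()
    # An authorized_keys entry may begin with options, so scan for the key type.
    for i, token in enumerate(parts[:-1]):
        if token.startswith("ssh-") or token.startswith("ecdsa-"):
            return token, parts[i + 1]
    return None


def is_authorized(public_key_line, authorized_keys_text):
    """True if the public key appears in the given authorized_keys content."""
    wanted = key_blob(public_key_line)
    if wanted is None:
        return False
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if key_blob(line) == wanted:
            return True
    return False


# Usage sketch (assumed paths, run as root on the target host):
#   pub = open("/root/.ssh/id_rsa.pub").read()
#   auth = open("/root/.ssh/authorized_keys").read()
#   print(is_authorized(pub, auth))
```

If this returns False, appending the public key to authorized_keys on the target is the usual next step. It does not cover other causes, such as root login being disabled in the sshd configuration.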
	
1 ACCEPTED SOLUTION

Master Mentor

@Tom Burke

Are you checking which command you are executing with curl, and what each curl argument means?

If you had checked the content of "hosts4.json" before attaching it, you would have seen that the credentials you entered in the curl command are the wrong Ambari admin credentials.

# cat hosts4.json 
{
  "status": 403,
  "message": "Unable to sign in. Invalid username/password combination."
}

Since I do not know your Ambari admin credentials, I gave you a dummy curl command and expected you to change the values to match your cluster.
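
A quick way to catch this before attaching a file is to check whether the saved output is an error payload rather than real data. The following is a minimal Python sketch, assuming error responses carry "status" and "message" keys as in the hosts4.json content above. The function name api_error is hypothetical.

```python
import json


def api_error(response_text):
    """Return a short error description if the saved response looks like
    an error payload, or None if it does not."""
    try:
        body = json.loads(response_text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        return "not valid JSON"
    # Assumption: error payloads are objects with "status" and "message".
    if isinstance(body, dict) and "status" in body and "message" in body:
        return "HTTP {0}: {1}".format(body["status"], body["message"])
    return None


# Usage sketch:
#   print(api_error(open("hosts4.json").read()))
```

For the file shown above, this would report the 403 and the "Invalid username/password combination" message, which points at the -u argument of the curl command.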

29 REPLIES

Master Mentor

@Tom Burke

It looks like the Ambari agent is already installed on your Ambari server host. Try starting it and check whether it starts without any errors.

# ambari-agent start;  tail -f /var/log/ambari-agent/ambari-agent.log
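
If the log is long, a small script can pull out only the lines worth reading. This is a rough Python sketch, assuming each log record starts with its level (INFO, WARNING, ERROR, ...) as the agent log lines in this thread do. The function name summarize_agent_log is hypothetical.

```python
from collections import Counter

# Assumption: records begin with a standard Python logging level name.
LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def summarize_agent_log(log_text):
    """Count log records per level and collect WARNING-or-worse lines."""
    counts = Counter()
    problems = []
    for line in log_text.splitlines():
        level = line.split(" ", 1)[0]
        if level not in LEVELS:
            continue  # blank line or continuation (for example a traceback)
        counts[level] += 1
        if level in ("WARNING", "ERROR", "CRITICAL"):
            problems.append(line)
    return counts, problems


# Usage sketch:
#   text = open("/var/log/ambari-agent/ambari-agent.log").read()
#   counts, problems = summarize_agent_log(text)
#   print(dict(counts))
#   for line in problems:
#       print(line)
```

An empty problems list only means no WARNING or ERROR records were logged; it does not prove the agent registered with the server.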

Explorer

Hello Jay, here is the log file from compute1:

INFO 2019-02-05 16:14:41,688 security.py:135 - Event to server at /reports/host_status (correlation_id=74643): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412081679, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752436', 'used': '28126224', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:14:41,690 __init__.py:82 - Event from server at /user/ (correlation_id=74643): {u'status': u'OK'}
INFO 2019-02-05 16:14:50,711 security.py:135 - Event to server at /heartbeat (correlation_id=74644): {'id': 59869}
INFO 2019-02-05 16:14:50,713 __init__.py:82 - Event from server at /user/ (correlation_id=74644): {u'status': u'OK', u'id': 59870}
INFO 2019-02-05 16:15:00,715 security.py:135 - Event to server at /heartbeat (correlation_id=74645): {'id': 59870}
INFO 2019-02-05 16:15:00,718 __init__.py:82 - Event from server at /user/ (correlation_id=74645): {u'status': u'OK', u'id': 59871}
INFO 2019-02-05 16:15:10,719 security.py:135 - Event to server at /heartbeat (correlation_id=74646): {'id': 59871}
INFO 2019-02-05 16:15:10,724 __init__.py:82 - Event from server at /user/ (correlation_id=74646): {u'status': u'OK', u'id': 59872}
INFO 2019-02-05 16:15:20,729 security.py:135 - Event to server at /heartbeat (correlation_id=74647): {'id': 59872}
INFO 2019-02-05 16:15:20,731 __init__.py:82 - Event from server at /user/ (correlation_id=74647): {u'status': u'OK', u'id': 59873}
INFO 2019-02-05 16:15:30,733 security.py:135 - Event to server at /heartbeat (correlation_id=74648): {'id': 59873}
INFO 2019-02-05 16:15:30,734 __init__.py:82 - Event from server at /user/ (correlation_id=74648): {u'status': u'OK', u'id': 59874}
INFO 2019-02-05 16:15:40,735 security.py:135 - Event to server at /heartbeat (correlation_id=74649): {'id': 59874}
INFO 2019-02-05 16:15:40,736 __init__.py:82 - Event from server at /user/ (correlation_id=74649): {u'status': u'OK', u'id': 59875}
INFO 2019-02-05 16:15:41,938 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:15:41,938 security.py:135 - Event to server at /reports/host_status (correlation_id=74650): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412141929, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752424', 'used': '28126236', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:15:41,940 __init__.py:82 - Event from server at /user/ (correlation_id=74650): {u'status': u'OK'}
INFO 2019-02-05 16:15:50,737 security.py:135 - Event to server at /heartbeat (correlation_id=74651): {'id': 59875}
INFO 2019-02-05 16:15:50,738 __init__.py:82 - Event from server at /user/ (correlation_id=74651): {u'status': u'OK', u'id': 59876}
INFO 2019-02-05 16:16:00,739 security.py:135 - Event to server at /heartbeat (correlation_id=74652): {'id': 59876}
INFO 2019-02-05 16:16:00,740 __init__.py:82 - Event from server at /user/ (correlation_id=74652): {u'status': u'OK', u'id': 59877}
INFO 2019-02-05 16:16:08,628 security.py:135 - Event to server at /reports/alerts_status (correlation_id=74653): [{'name': u'datanode_storage', 'timestamp': 1549412167525L, 'clusterId': '2', 'definitionId': 19, 'state': 'OK', 'text': '...'}, {'name': u'datanode_heap_usage', 'timestamp': 1549412167519L, 'clusterId': '2', 'definitionId': 10, 'state': 'OK', 'text': '...'}]
INFO 2019-02-05 16:16:08,630 __init__.py:82 - Event from server at /user/ (correlation_id=74653): {u'status': u'OK'}
INFO 2019-02-05 16:16:10,743 security.py:135 - Event to server at /heartbeat (correlation_id=74654): {'id': 59877}
INFO 2019-02-05 16:16:10,744 __init__.py:82 - Event from server at /user/ (correlation_id=74654): {u'status': u'OK', u'id': 59878}
INFO 2019-02-05 16:16:20,745 security.py:135 - Event to server at /heartbeat (correlation_id=74655): {'id': 59878}
INFO 2019-02-05 16:16:20,746 __init__.py:82 - Event from server at /user/ (correlation_id=74655): {u'status': u'OK', u'id': 59879}
INFO 2019-02-05 16:16:30,747 security.py:135 - Event to server at /heartbeat (correlation_id=74656): {'id': 59879}
INFO 2019-02-05 16:16:30,748 __init__.py:82 - Event from server at /user/ (correlation_id=74656): {u'status': u'OK', u'id': 59880}
INFO 2019-02-05 16:16:40,751 security.py:135 - Event to server at /heartbeat (correlation_id=74657): {'id': 59880}
INFO 2019-02-05 16:16:40,752 __init__.py:82 - Event from server at /user/ (correlation_id=74657): {u'status': u'OK', u'id': 59881}
INFO 2019-02-05 16:16:42,184 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:16:42,184 security.py:135 - Event to server at /reports/host_status (correlation_id=74658): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412202175, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752404', 'used': '28126256', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:16:42,186 __init__.py:82 - Event from server at /user/ (correlation_id=74658): {u'status': u'OK'}
INFO 2019-02-05 16:16:50,753 security.py:135 - Event to server at /heartbeat (correlation_id=74659): {'id': 59881}
INFO 2019-02-05 16:16:50,755 __init__.py:82 - Event from server at /user/ (correlation_id=74659): {u'status': u'OK', u'id': 59882}
INFO 2019-02-05 16:17:00,757 security.py:135 - Event to server at /heartbeat (correlation_id=74660): {'id': 59882}
INFO 2019-02-05 16:17:00,758 __init__.py:82 - Event from server at /user/ (correlation_id=74660): {u'status': u'OK', u'id': 59883}
INFO 2019-02-05 16:17:10,759 security.py:135 - Event to server at /heartbeat (correlation_id=74661): {'id': 59883}
INFO 2019-02-05 16:17:10,761 __init__.py:82 - Event from server at /user/ (correlation_id=74661): {u'status': u'OK', u'id': 59884}
INFO 2019-02-05 16:17:20,763 security.py:135 - Event to server at /heartbeat (correlation_id=74662): {'id': 59884}
INFO 2019-02-05 16:17:20,765 __init__.py:82 - Event from server at /user/ (correlation_id=74662): {u'status': u'OK', u'id': 59885}
INFO 2019-02-05 16:17:30,767 security.py:135 - Event to server at /heartbeat (correlation_id=74663): {'id': 59885}
INFO 2019-02-05 16:17:30,769 __init__.py:82 - Event from server at /user/ (correlation_id=74663): {u'status': u'OK', u'id': 59886}
INFO 2019-02-05 16:17:40,769 security.py:135 - Event to server at /heartbeat (correlation_id=74664): {'id': 59886}
INFO 2019-02-05 16:17:40,771 __init__.py:82 - Event from server at /user/ (correlation_id=74664): {u'status': u'OK', u'id': 59887}
INFO 2019-02-05 16:17:42,427 Hardware.py:188 - Some mount points were ignored: /dev, /run, /dev/shm, /run/lock, /sys/fs/cgroup, /snap/core/6130, /run/user/1003, /snap/core/6259, /snap/core/6350, /run/user/1008, /run/user/1019, /run/user/1013, /run/user/1021, /run/user/1015, /run/user/1023, /run/user/0
INFO 2019-02-05 16:17:42,427 security.py:135 - Event to server at /reports/host_status (correlation_id=74665): {'agentEnv': {'transparentHugePage': 'madvise', 'hostHealth': {'agentTimeStampAtReporting': 1549412262418, 'liveServices': [{'status': 'Healthy', 'name': 'ntp or chrony', 'desc': ''}]}, 'reverseLookup': True, 'umask': '18', 'hasUnlimitedJcePolicy': True, 'alternatives': [], 'firewallName': 'ufw', 'stackFoldersAndFiles': [], 'existingUsers': [], 'firewallRunning': False}, 'mounts': [{'available': '5028752380', 'used': '28126280', 'percent': '1%', 'device': '/dev/sda2', 'mountpoint': '/', 'type': 'ext4', 'size': '5325330344'}]}
INFO 2019-02-05 16:17:42,429 __init__.py:82 - Event from server at /user/ (correlation_id=74665): {u'status': u'OK'}
INFO 2019-02-05 16:17:50,774 security.py:135 - Event to server at /heartbeat (correlation_id=74666): {'id': 59887}
INFO 2019-02-05 16:17:50,775 __init__.py:82 - Event from server at /user/ (correlation_id=74666): {u'status': u'OK', u'id': 59888}
INFO 2019-02-05 16:18:00,778 security.py:135 - Event to server at /heartbeat (correlation_id=74667): {'id': 59888}
INFO 2019-02-05 16:18:00,781 __init__.py:82 - Event from server at /user/ (correlation_id=74667): {u'status': u'OK', u'id': 59889}
INFO 2019-02-05 16:18:08,633 security.py:135 - Event to server at /reports/alerts_status (correlation_id=74668): [{'name': u'datanode_heap_usage', 'timestamp': 1549412287525L, 'clusterId': '2', 'definitionId': 10, 'state': 'OK', 'text': '...'}, {'name': u'datanode_storage', 'timestamp': 1549412287526L, 'clusterId': '2', 'definitionId': 19, 'state': 'OK', 'text': '...'}]
INFO 2019-02-05 16:18:08,635 __init__.py:82 - Event from server at /user/ (correlation_id=74668): {u'status': u'OK'}
INFO 2019-02-05 16:18:10,782 security.py:135 - Event to server at /heartbeat (correlation_id=74669): {'id': 59889}
INFO 2019-02-05 16:18:10,783 __init__.py:82 - Event from server at /user/ (correlation_id=74669): {u'status': u'OK', u'id': 59890}
INFO 2019-02-05 16:18:20,784 security.py:135 - Event to server at /heartbeat (correlation_id=74670): {'id': 59890}
INFO 2019-02-05 16:18:20,785 __init__.py:82 - Event from server at /user/ (correlation_id=74670): {u'status': u'OK', u'id': 59891}
INFO 2019-02-05 16:18:30,788 security.py:135 - Event to server at /heartbeat (correlation_id=74671): {'id': 59891}
INFO 2019-02-05 16:18:30,790 __init__.py:82 - Event from server at /user/ (correlation_id=74671): {u'status': u'OK', u'id': 59892}

Master Mentor

@Tom Burke

I think everything is fine. I no longer see any errors in the UI operational logs or in the ambari-agent logs.

So it looks good to me.

Do you still see any issue?

Explorer

Actually, I cannot find the Ambari agent on the Ambari server host. The log I showed is from another member server.

So, just to back up a second: when I try to enable Kerberos, the first error I get is a hostname failure, and it says this happens on the Ambari server. This is what led me to conclude that I need to add the Ambari server to the cluster, which is why I opened that other question. Maybe this picture will help: horton-err-obfu-1.pdf

Explorer

I sort of expected to see the Ambari server listed in Hosts (see no-ambari-in-hosts.png).

Incorrect assumption? Anyhow, I suppose it does not matter, since you said it is not needed to enable Kerberos. We can put this to bed if you like. Thanks!

Master Mentor

@Tom Burke

It looks like the FQDN is not set correctly on your failing node.

Please run the following commands to verify whether the FQDN is set up correctly (the hostname and the FQDN may not be the same):

# python<<<"import socket;print(socket.getfqdn())"
(OR)
# hostname -f
# hostname

If the outputs differ, please set the FQDN of your host correctly.
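
The same comparison can be scripted. Below is a minimal Python sketch that compares what socket.gethostname() and socket.getfqdn() report. The helper names and the two rules it applies (the FQDN should contain a domain part, and the hostname should either equal the FQDN or be its first label) are my own simplifications, not an Ambari requirement.

```python
import socket


def looks_fully_qualified(name):
    """True if the name has at least two non-empty dot-separated labels."""
    labels = name.strip().rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def check_names(hostname, fqdn):
    """Return a list of human-readable issues; an empty list means none found."""
    hostname, fqdn = hostname.lower(), fqdn.lower()
    issues = []
    if not looks_fully_qualified(fqdn):
        issues.append("FQDN %r has no domain part" % fqdn)
    # The hostname may be the short name or the full name, but should agree.
    if hostname != fqdn and not fqdn.startswith(hostname + "."):
        issues.append("hostname %r does not match FQDN %r" % (hostname, fqdn))
    return issues


if __name__ == "__main__":
    found = check_names(socket.gethostname(), socket.getfqdn())
    for issue in found or ["no obvious hostname/FQDN mismatch"]:
        print(issue)
```

This only inspects the local resolver's view; it does not check what name Ambari has recorded for the host.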

You can find the details here about hostname and public_hostname: https://community.hortonworks.com/content/kbentry/42872/why-ambari-host-might-have-different-public-...

Master Mentor

@Tom Burke Regarding the SSL-related issue:

You will need to configure a truststore in Ambari Server and import the LDAP/AD certificate into that truststore to fix the following message:

Failed to connect to KDC - Failed to communicate with the Active Directory at ldaps://xxx.yyy.com:636: simple bind failed: xxx.yyy.com:636

Please see: https://community.hortonworks.com/content/supportkb/148572/failed-to-connect-to-kdc-make-sure-the-se...

Explorer

Hi Jay, sorry, but all 3 hostname commands return the same value:

ambari.example.com

Explorer

Yes, I imported the certificate into the truststore without problems. It just does not work.

Master Mentor

@Tom Burke

You can also open the same URL in the browser where you are logged in to the Ambari UI.

http://ambari.example.com:8443/api/v1/clusters/cluster-name/hosts?fields=Hosts/ip,Hosts/host_name

(OR) You can use the following curl call to write the JSON output to a file such as "/tmp/hosts.json". Please make sure to use the correct cluster name in the URL.

# curl -k -H "X-Requested-By: ambari" -u admin:admin "http://ambari.example.com:8443/api/v1/clusters/cluster-name/hosts?fields=Hosts/ip,Hosts/host_name" -o "/tmp/hosts.json"
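
Once the file is saved, the host names and IPs can be read out with a few lines of Python. This is a sketch only: it assumes the response is a JSON object with an "items" list whose entries each contain a "Hosts" object with "host_name" and "ip", which matches the fields requested in the URL above but should be checked against your actual output. The function name list_hosts is hypothetical.

```python
import json


def list_hosts(response_text):
    """Return (host_name, ip) pairs from a saved hosts response.

    Assumed shape: {"items": [{"Hosts": {"host_name": ..., "ip": ...}}, ...]}
    """
    body = json.loads(response_text)
    if not isinstance(body, dict):
        return []
    hosts = []
    for item in body.get("items", []):
        info = item.get("Hosts", {})
        hosts.append((info.get("host_name"), info.get("ip")))
    return hosts


# Usage sketch:
#   for name, ip in list_hosts(open("/tmp/hosts.json").read()):
#       print(name, ip)
```

An empty result means the file has no "items", for example because it holds an error message instead of the host list.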
