Ambari-Agent registration step fails on RHEL 7.4 EC2 while trying to create HDF 3.2 cluster

New Member

I've installed Ambari Server and followed all prerequisite steps. When trying to create an HDF 3.2 cluster via the Ambari wizard, the Ambari Agent is installed and started, but the registration step fails:

Creating target directory...

==========================

Command start time 2018-09-10 23:54:57

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:58

==========================

Copying ambari sudo script...

==========================

Command start time 2018-09-10 23:54:58

scp /var/lib/ambari-server/ambari-sudo.sh

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:58

==========================

Copying common functions script...

==========================

Command start time 2018-09-10 23:54:58

scp /usr/lib/ambari-server/lib/ambari_commons

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:58

==========================

Copying create-python-wrap script...

==========================

Command start time 2018-09-10 23:54:58

scp /var/lib/ambari-server/create-python-wrap.sh

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:59

==========================

Copying OS type check script...

==========================

Command start time 2018-09-10 23:54:59

scp /usr/lib/ambari-server/lib/ambari_server/os_check_type.py

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:59

==========================

Running create-python-wrap script...

==========================

Command start time 2018-09-10 23:54:59

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:59

==========================

Running OS type check...

==========================

Command start time 2018-09-10 23:54:59

Cluster primary/cluster OS family is redhat7 and local/current OS family is redhat7

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:54:59

==========================

Checking 'sudo' package on remote host...

==========================

Command start time 2018-09-10 23:54:59

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:00

==========================

Copying repo file to 'tmp' folder...

==========================

Command start time 2018-09-10 23:55:00

scp /etc/yum.repos.d/ambari.repo

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:00

==========================

Moving file to repo dir...

==========================

Command start time 2018-09-10 23:55:00

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:00

==========================

Changing permissions for ambari.repo...

==========================

Command start time 2018-09-10 23:55:00

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:01

==========================

Copying setup script file...

==========================

Command start time 2018-09-10 23:55:01

scp /usr/lib/ambari-server/lib/ambari_server/setupAgent.py

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:01

==========================

Running setup agent script...

==========================

Command start time 2018-09-10 23:55:01

("WARNING 2018-09-10 23:55:35,248 shell.py:822 - can not switch user for RUN_COMMAND.

WARNING 2018-09-10 23:55:35,352 shell.py:822 - can not switch user for RUN_COMMAND.

INFO 2018-09-10 23:55:35,456 main.py:311 - Agent not going to die gracefully, going to execute kill -9

WARNING 2018-09-10 23:55:35,456 shell.py:822 - can not switch user for RUN_COMMAND.

INFO 2018-09-10 23:55:35,460 main.py:322 - Agent stopped successfully by kill -9, exiting.

INFO 2018-09-10 23:55:35,460 ExitHelper.py:57 - Performing cleanup before exiting...

INFO 2018-09-10 23:55:35,461 AlertSchedulerHandler.py:159 - [AlertScheduler] Stopped the alert scheduler.

INFO 2018-09-10 23:55:35,461 AlertSchedulerHandler.py:159 - [AlertScheduler] Stopped the alert scheduler.

INFO 2018-09-10 23:55:35,740 main.py:155 - loglevel=logging.INFO

INFO 2018-09-10 23:55:35,742 Hardware.py:68 - Initializing host system information.

INFO 2018-09-10 23:55:35,746 Hardware.py:188 - Some mount points were ignored: /dev, /dev/shm, /run, /sys/fs/cgroup, /run/user/1000, /run/user/0

INFO 2018-09-10 23:55:35,762 Facter.py:202 - Directory: '/etc/resource_overrides' does not exist - it won't be used for gathering system resources.

INFO 2018-09-10 23:55:35,765 Hardware.py:73 - Host system information: {'kernel': 'Linux', 'domain': 'ec2.internal', 'physicalprocessorcount': 8, 'kernelrelease': '3.10.0-693.el7.x86_64', 'uptime_days': '0', 'memorytotal': 31962140, 'swapfree': '0.00 GB', 'memorysize': 31962140, 'osfamily': 'redhat', 'swapsize': '0.00 GB', 'processorcount': 8, 'netmask': '255.255.255.128', 'timezone': 'UTC', 'hardwareisa': 'x86_64', 'memoryfree': 31314048, 'operatingsystem': 'redhat', 'kernelmajversion': '3.10', 'kernelversion': '3.10.0', 'macaddress': '0A:97:71:30:53:26', 'operatingsystemrelease': '7.4', 'ipaddress': '10.40.145.105', 'hostname': 'ip-10-40-145-105', 'uptime_hours': '0', 'fqdn': 'ip-10-40-145-105.ec2.internal', 'id': 'root', 'architecture': 'x86_64', 'selinux': True, 'mounts': [{'available': '19599720', 'used': '1359492', 'percent': '7%', 'device': '/dev/nvme0n1p2', 'mountpoint': '/', 'type': 'xfs', 'size': '20959212'}, {'available': '927944', 'used': '2564', 'percent': '1%', 'device': '/dev/nvme3n1', 'mountpoint': '/db-repo', 'type': 'ext4', 'size': '999320'}, {'available': '24299724', 'used': '45080', 'percent': '1%', 'device': '/dev/nvme2n1', 'mountpoint': '/provenance-repo', 'type': 'ext4', 'size': '25671908'}, {'available': '48783816', 'used': '53272', 'percent': '1%', 'device': '/dev/nvme1n1', 'mountpoint': '/nifi-logs', 'type': 'ext4', 'size': '51474912'}, {'available': '97760160', 'used': '61464', 'percent': '1%', 'device': '/dev/nvme4n1', 'mountpoint': '/content-repo', 'type': 'ext4', 'size': '103080888'}, {'available': '48783816', 'used': '53272', 'percent': '1%', 'device': '/dev/nvme5n1', 'mountpoint': '/flowfile-repo', 'type': 'ext4', 'size': '51474912'}], 'hardwaremodel': 'x86_64', 'uptime_seconds': '1176', 'interfaces': 'eth0,lo'}

INFO 2018-09-10 23:55:35,767 DataCleaner.py:39 - Data cleanup thread started

INFO 2018-09-10 23:55:35,768 DataCleaner.py:120 - Data cleanup started

INFO 2018-09-10 23:55:35,768 DataCleaner.py:122 - Data cleanup finished

INFO 2018-09-10 23:55:35,798 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'ip-10-40-145-105.ec2.internal' using socket.getfqdn().

INFO 2018-09-10 23:55:35,803 PingPortListener.py:50 - Ping port listener started on port: 8670

INFO 2018-09-10 23:55:35,805 main.py:481 - Connecting to Ambari server at https://ip-10-40-145-25.ec2.internal:8440 (10.40.145.25)

INFO 2018-09-10 23:55:35,806 NetUtil.py:61 - Connecting to https://ip-10-40-145-25.ec2.internal:8440/ca

", None)

Connection to ip-10-40-145-105.ec2.internal closed.

SSH command execution finished

host=ip-10-40-145-105.ec2.internal, exitcode=0

Command end time 2018-09-10 23:55:38

Registering with the server...

Registration with the server failed.

1 ACCEPTED SOLUTION

Hi @Alex M,

Judging by your error log, agent registration seems to be failing because of the following messages:

("WARNING 2018-09-10 23:55:35,248 shell.py:822 - can not switch user for RUN_COMMAND.
WARNING 2018-09-10 23:55:35,352 shell.py:822 - can not switch user for RUN_COMMAND.
INFO 2018-09-10 23:55:35,456 main.py:311 - Agent not going to die gracefully, going to execute kill -9
WARNING 2018-09-10 23:55:35,456 shell.py:822 - can not switch user for RUN_COMMAND.

Are you installing Ambari as a non-root user? If so, have you granted the necessary permissions?

Also, can you please try to install and register ambari-agent manually: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-administration/content/install_th...

and then proceed using the Add Host wizard.

Also, I assume your node can do passwordless SSH to ip-10-40-145-25.ec2.internal, which is your Ambari server host. (Even if it is the same host, its public key must be added to authorized_keys.)
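
As a rough sketch of the manual route, assuming a RHEL 7 host where ambari.repo is already in place (the bootstrap log above shows it being copied) and assuming the agent config sits at the default /etc/ambari-agent/conf/ambari-agent.ini: install the package, point the agent at the server, then start it. The helper name set_agent_server is made up for illustration and is not part of Ambari.

```shell
#!/usr/bin/env bash
# Sketch only: set the Ambari server hostname in an agent config file.
# set_agent_server is a hypothetical helper name, not part of Ambari.
set_agent_server() {
  local ini_file="$1" server_host="$2"
  # Rewrite the "hostname=" entry; in a default ambari-agent.ini this key
  # lives in the [server] section. Uses GNU sed in-place editing.
  sed -i "s/^hostname=.*/hostname=${server_host}/" "$ini_file"
}

# Typical usage on the agent host, run as root (left commented out so that
# sourcing this file has no side effects):
#   yum install -y ambari-agent
#   set_agent_server /etc/ambari-agent/conf/ambari-agent.ini ip-10-40-145-25.ec2.internal
#   ambari-agent start
```

Once the agent is running and pointed at the server, the host can be added through the wizard without the SSH bootstrap step.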

cat id_rsa.pub >> authorized_keys

Refer to: https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.0.0/bk_ambari-installation/content/set_up_passw...

Please see if this helps. If it did, please log in and accept this answer.
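
For context, here is a minimal sketch of the key setup, assuming the default ~/.ssh layout and an RSA key pair. The helper name authorize_key is hypothetical. It appends the public key only when it is not already present and tightens permissions, since sshd may refuse keys when the directory or file is writable by other users.

```shell
#!/usr/bin/env bash
# Sketch only: idempotently add a public key to an authorized_keys file.
# authorize_key is a hypothetical helper name.
authorize_key() {
  local pub_key_file="$1" auth_keys_file="$2"
  local ssh_dir
  ssh_dir="$(dirname "$auth_keys_file")"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  touch "$auth_keys_file"
  # -F: fixed strings, -x: whole-line match, -f: read patterns from file.
  # Append only when the key line is not already there.
  if ! grep -qxF -f "$pub_key_file" "$auth_keys_file"; then
    cat "$pub_key_file" >> "$auth_keys_file"
  fi
  chmod 600 "$auth_keys_file"
}

# Typical usage on the Ambari server, as the user the wizard connects as
# (left commented out so that sourcing this file has no side effects):
#   [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh -o BatchMode=yes "$(hostname -f)" true && echo "passwordless SSH works"
```

The BatchMode check fails instead of prompting for a password, which makes it a quick way to confirm the key is actually being accepted.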

3 REPLIES

Master Mentor

@Alex M

Have you tried running the command with sudo? Are you running the installation as a user other than root?

New Member
@Akhil S Naik

Thank you. Running "cat id_rsa.pub >> authorized_keys" on the Ambari server did the trick.