
Ambari agent registration for an HDF cluster fails despite exit code 0 (RHEL setup on MS Azure)

Explorer

==========================
Creating target directory...
==========================

Command start time 2018-05-16 06:08:52
chmod: cannot access ‘/var/lib/ambari-agent/data’: No such file or directory

Warning: Permanently added 'mtvm6.eastus.cloudapp.azure.com,40.117.251.23' (ECDSA) to the list of known hosts.
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:52

==========================
Copying ambari sudo script...
==========================

Command start time 2018-05-16 06:08:52

scp /var/lib/ambari-server/ambari-sudo.sh
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:53

==========================
Copying common functions script...
==========================

Command start time 2018-05-16 06:08:53

scp /usr/lib/python2.6/site-packages/ambari_commons
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:53

==========================
Copying create-python-wrap script...
==========================

Command start time 2018-05-16 06:08:53

scp /var/lib/ambari-server/create-python-wrap.sh
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:54

==========================
Copying OS type check script...
==========================

Command start time 2018-05-16 06:08:54

scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:54

==========================
Running create-python-wrap script...
==========================

Command start time 2018-05-16 06:08:54

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:55

==========================
Running OS type check...
==========================

Command start time 2018-05-16 06:08:55
Cluster primary/cluster OS family is redhat7 and local/current OS family is redhat7

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:55

==========================
Checking 'sudo' package on remote host...
==========================

Command start time 2018-05-16 06:08:55
sudo-1.8.19p2-11.el7_4.x86_64

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:56

==========================
Copying repo file to 'tmp' folder...
==========================

Command start time 2018-05-16 06:08:56

scp /etc/yum.repos.d/ambari.repo
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Moving file to repo dir...
==========================

Command start time 2018-05-16 06:08:57

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Changing permissions for ambari.repo...
==========================

Command start time 2018-05-16 06:08:57

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Copying setup script file...
==========================

Command start time 2018-05-16 06:08:57

scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:58

==========================
Running setup agent script...
==========================

Command start time 2018-05-16 06:08:58
("INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,025 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-16 06:09:18,028 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'mtvm6.eastus.cloudapp.azure.com' using socket.getfqdn().
INFO 2018-05-16 06:09:18,035 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-16 06:09:18,038 main.py:437 - Connecting to Ambari server at https://myhdf.eastus.cloudapp.azure.com:8440 (104.211.60.99)
INFO 2018-05-16 06:09:18,038 NetUtil.py:70 - Connecting to https://myhdf.eastus.cloudapp.azure.com:8440/ca
", None)
("INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,025 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-16 06:09:18,028 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'mtvm6.eastus.cloudapp.azure.com' using socket.getfqdn().
INFO 2018-05-16 06:09:18,035 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-16 06:09:18,038 main.py:437 - Connecting to Ambari server at https://myhdf.eastus.cloudapp.azure.com:8440 (104.211.60.99)
INFO 2018-05-16 06:09:18,038 NetUtil.py:70 - Connecting to https://myhdf.eastus.cloudapp.azure.com:8440/ca
", None)

Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:09:20

Registering with the server...
Registration with the server failed.
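A note on reading this output: every exitcode=0 above reports only the SSH/scp step that just ran, not the agent registration, which is reported separately at the very end. A minimal sketch (assuming bootstrap output in the format shown above) that separates the two signals:

```python
import re

def summarize_bootstrap(log_text):
    """Return ([per-step SSH exit codes], registration_succeeded)."""
    exit_codes = [int(code) for code in re.findall(r"exitcode=(\d+)", log_text)]
    registered = "Registration with the server failed." not in log_text
    return exit_codes, registered

sample = (
    "host=mtvm6.eastus.cloudapp.azure.com, exitcode=0\n"
    "Registering with the server...\n"
    "Registration with the server failed.\n"
)
print(summarize_bootstrap(sample))  # → ([0], False)
```

Exit codes of 0 with registered=False is exactly the situation in this thread: the bootstrap steps succeeded, but the agent never registered, so the real clue is in the agent log on the target host.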
ACCEPTED SOLUTION

Master Mentor

@Matthias Tewordt

I am happy you have succeeded. Next time you can help someone else with an HDF setup on Azure 🙂
Yes, the databases could be set up on any node, but since you already have Postgres installed for Ambari, it's easier to keep the other databases on the same host for simpler management.

CAUTION:

When you go to production, consider setting up database replication.

Once you have finished the setup, if this answer addressed your question, please take a moment to log in and click the "Accept" link on the answer.

Keep me posted


50 REPLIES

Explorer

Regarding the processor: it is a DS1 v2 (1 vCPU, 3.5 GB). Dv2-series instances are based on the latest-generation 2.4 GHz Intel Xeon E5-2673 v3 (Haswell) processor, and with Intel Turbo Boost Technology 2.0 can reach 3.2 GHz. Dv2-series and D-series are ideal for applications that demand faster CPUs, better local disk performance, or more memory, and offer a powerful combination for many enterprise-grade applications.

Explorer

and finally:

cat /etc/redhat-release

Red Hat Enterprise Linux Server release 7.4 (Maipo)

Master Mentor

@Matthias Tewordt

I don't see the hdp.gpl.repo in the output. It's used for the LZO compression libraries; see this HCC article.

In my previous example I downloaded to /tmp/hdp.repo; you should replace that with /etc/yum.repos.d/:

$ wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0/hdp.repo -O /etc/yum.repos.d/hdp.repo

I also saw the entries below in the logs; any ideas? Otherwise, mv them to some directory like /tmp:

public-repo-1.hortonworks.com_HDP_centos7_2.x_updates_2.6.4.0_HDP-2.6.4.0-91.xml.repo 
public-repo-1.hortonworks.com_HDP-UTILS-1.1.0.22_repos_centos7.repo 

Then run in sequence

# yum clean all 
# yum repolist 

You should see the repos, and from here you can proceed.

Have you resolved SSH from the Ambari server to the other 2 nodes? Please correct your DNS before attempting the registration again.

Example for mtvm6.eastus.cloudapp.azure.com

# nslookup 40.117.251.23

I assume you have all 3 nodes listed in /etc/hosts on every host.
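To make that check concrete, here is a small sketch that parses /etc/hosts-style text and shows what each name maps to (the IPs below are placeholders, not this cluster's real addresses):

```python
def parse_hosts(text):
    """Map each hostname and alias in /etc/hosts-style text to its IP."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping

hosts = """\
# placeholder private IPs -- substitute your cluster's real addresses
10.0.0.4  myhdf.eastus.cloudapp.azure.com  myhdf
10.0.0.5  mtvm6.eastus.cloudapp.azure.com  mtvm6
"""
print(parse_hosts(hosts)["mtvm6.eastus.cloudapp.azure.com"])  # → 10.0.0.5
```

Running this against each node's /etc/hosts quickly shows whether all 3 nodes resolve to the same addresses everywhere.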

Explorer

ok, repos look good

yum repolist
Loaded plugins: langpacks, product-id, search-disabled-repos
HDP-2.6.5.0                                | 2.9 kB 00:00:00
HDP-UTILS-1.1.0.22                         | 2.9 kB 00:00:00
ambari-2.5.1.0                             | 2.9 kB 00:00:00
rhui-microsoft-azure-rhel7                 | 2.9 kB 00:00:00
rhui-rhel-7-server-dotnet-rhui-debug-rpms  | 3.8 kB 00:00:00
rhui-rhel-7-server-dotnet-rhui-rpms        | 4.0 kB 00:00:00
rhui-rhel-7-server-do ......

Explorer

Have you resolved the ssh from the ambari server to the other 2 nodes? Please correct your DNS before attempting

Using Azure VMs, SSH from the Ambari server to the other 2 nodes works fine and has always worked.

Explorer

ok, here we have the problem: not sure what this means ...

[root@myhdf ~]# nslookup 40.117.251.23
Server:  168.63.129.16
Address: 168.63.129.16#53
** server can't find 23.251.117.40.in-addr.arpa.: NXDOMAIN
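Some context on that error: nslookup with an IP performs a reverse (PTR) lookup by reversing the octets and appending in-addr.arpa, which is exactly the 23.251.117.40.in-addr.arpa name in the output; NXDOMAIN simply means no PTR record exists for it. A tiny sketch of how the queried name is built:

```python
def ptr_name(ipv4):
    """Build the in-addr.arpa name that a reverse (PTR) lookup queries."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

print(ptr_name("40.117.251.23"))  # → 23.251.117.40.in-addr.arpa
```

On Azure, a public IP gets a PTR record only if reverse DNS is explicitly configured on the Public IP resource, so this NXDOMAIN is expected when that has not been set up.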

Master Mentor

@Matthias Tewordt

I still don't see the GPL repo. Run this on the Ambari server and repeat yum repolist to validate:

wget -nv http://public-repo-1.hortonworks.com/HDP-GPL/centos6/2.x/updates/2.6.4.0/hdp.gpl.repo -O /etc/yum.repos.d/hdp.gpl.repo


Forward lookup TEST

# nslookup mtvm6.eastus.cloudapp.azure.com 

Reverse lookup TEST error

# nslookup 40.117.251.23 

Can you check your /etc/named.conf file and disable empty zones by setting:

empty-zones-enable no;
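For reference, that directive goes inside the options block of /etc/named.conf (a sketch; this only applies if the host runs its own BIND named service rather than relying solely on Azure's resolver at 168.63.129.16):

```
options {
    ...
    empty-zones-enable no;
};
```

After editing, restart named for the change to take effect.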

Please report back.

Explorer

HDP-GPL now included

HDP-2.6.5.0          HDP Version - HDP-2.6.5.0               238
HDP-GPL-2.6.4.0      HDP-GPL Version - HDP-GPL-2.6.4.0         4
HDP-UTILS-1.1.0.22   HDP-UTILS Version - HDP-UTILS-1.1.0.22   16
ambari-2.5.1.0       ambari Version - ambari-2.5.1.0

Explorer

Forward lookup works

[root@myhdf ~]# nslookup mtvm6.eastus.cloudapp.azure.com
Server:  168.63.129.16
Address: 168.63.129.16#53
Non-authoritative answer:
Name:    mtvm6.eastus.cloudapp.azure.com
Address: 40.117.251.23

Master Mentor

@Matthias Tewordt

Are you using private or public IPs in /etc/hosts? For inter-cluster communication you need to use the private addresses.

What's the value of empty-zones-enable in /etc/named.conf?
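On the private-vs-public point above, Python's standard ipaddress module gives a quick way to classify the addresses in /etc/hosts (a sketch; the public address comes from this thread, the private one is a placeholder):

```python
import ipaddress

def classify(ip):
    """Label an address; private addresses are what intra-cluster traffic should use."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"

print(classify("10.0.0.4"), classify("40.117.251.23"))  # → private public
```

Azure VMs typically get an RFC 1918 private address on the NIC and a separate public IP for inbound access; the hostnames the cluster uses should resolve to the private addresses.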