Ambari agent registration of HDF cluster fails in spite of exitcode 0. Setup of RHEL on MS Azure.
Labels: Apache Ambari, Cloudera DataFlow (CDF)
Created 05-16-2018 12:59 PM
==========================
Creating target directory...
==========================
Command start time 2018-05-16 06:08:52
chmod: cannot access ‘/var/lib/ambari-agent/data’: No such file or directory
Warning: Permanently added 'mtvm6.eastus.cloudapp.azure.com,40.117.251.23' (ECDSA) to the list of known hosts.
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:52

==========================
Copying ambari sudo script...
==========================
Command start time 2018-05-16 06:08:52
scp /var/lib/ambari-server/ambari-sudo.sh
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:53

==========================
Copying common functions script...
==========================
Command start time 2018-05-16 06:08:53
scp /usr/lib/python2.6/site-packages/ambari_commons
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:53

==========================
Copying create-python-wrap script...
==========================
Command start time 2018-05-16 06:08:53
scp /var/lib/ambari-server/create-python-wrap.sh
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:54

==========================
Copying OS type check script...
==========================
Command start time 2018-05-16 06:08:54
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:54

==========================
Running create-python-wrap script...
==========================
Command start time 2018-05-16 06:08:54
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:55

==========================
Running OS type check...
==========================
Command start time 2018-05-16 06:08:55
Cluster primary/cluster OS family is redhat7 and local/current OS family is redhat7
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:55

==========================
Checking 'sudo' package on remote host...
==========================
Command start time 2018-05-16 06:08:55
sudo-1.8.19p2-11.el7_4.x86_64
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:56

==========================
Copying repo file to 'tmp' folder...
==========================
Command start time 2018-05-16 06:08:56
scp /etc/yum.repos.d/ambari.repo
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Moving file to repo dir...
==========================
Command start time 2018-05-16 06:08:57
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Changing permissions for ambari.repo...
==========================
Command start time 2018-05-16 06:08:57
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:57

==========================
Copying setup script file...
==========================
Command start time 2018-05-16 06:08:57
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:08:58

==========================
Running setup agent script...
==========================
Command start time 2018-05-16 06:08:58
("INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,025 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-16 06:09:18,028 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'mtvm6.eastus.cloudapp.azure.com' using socket.getfqdn().
INFO 2018-05-16 06:09:18,035 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-16 06:09:18,038 main.py:437 - Connecting to Ambari server at https://myhdf.eastus.cloudapp.azure.com:8440 (104.211.60.99)
INFO 2018-05-16 06:09:18,038 NetUtil.py:70 - Connecting to https://myhdf.eastus.cloudapp.azure.com:8440/ca
", None)
("INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,024 main.py:145 - loglevel=logging.INFO
INFO 2018-05-16 06:09:18,025 DataCleaner.py:39 - Data cleanup thread started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:120 - Data cleanup started
INFO 2018-05-16 06:09:18,027 DataCleaner.py:122 - Data cleanup finished
INFO 2018-05-16 06:09:18,028 hostname.py:67 - agent:hostname_script configuration not defined thus read hostname 'mtvm6.eastus.cloudapp.azure.com' using socket.getfqdn().
INFO 2018-05-16 06:09:18,035 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2018-05-16 06:09:18,038 main.py:437 - Connecting to Ambari server at https://myhdf.eastus.cloudapp.azure.com:8440 (104.211.60.99)
INFO 2018-05-16 06:09:18,038 NetUtil.py:70 - Connecting to https://myhdf.eastus.cloudapp.azure.com:8440/ca
", None)
Connection to mtvm6.eastus.cloudapp.azure.com closed.
SSH command execution finished
host=mtvm6.eastus.cloudapp.azure.com, exitcode=0
Command end time 2018-05-16 06:09:20

Registering with the server...
Registration with the server failed.
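The last agent log line before the failure is the connection attempt to port 8440 on the Ambari server, so one thing worth ruling out is whether that port is reachable from the agent host at all (on Azure, a network security group can block it even when SSH works). Below is a minimal sketch of such a check. The function name `can_connect` is made up for illustration; the hostname and port 8440 come from the log above, and 8441 is assumed to be the default registration/heartbeat port.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example use on the agent host (hostname taken from the log above):
#   for port in (8440, 8441):
#       print(port, can_connect("myhdf.eastus.cloudapp.azure.com", port))
```

A `False` here would point at firewall or security group rules rather than at the agent setup itself; a `True` only shows the port is open, not that the SSL handshake succeeds.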
Created 05-19-2018 08:13 AM
I am happy you have succeeded. Next time you can help someone else with the setup of HDF on Azure 🙂
Yes, the database could be set up on any node, but as you already have Postgres installed for Ambari, it's easier to keep the other databases on the same host for simpler management.
CAUTION:
When in production, think about setting up database replication.
Once you have finished the setup, if you found that this answer addressed your question, please take a moment to log in and click the "Accept" link on the answer.
Keep me posted.
Created 05-17-2018 12:45 PM
/etc/hosts looks alright on all nodes.
host -v -t A mtvm6.eastus.cloudapp.azure.com yields:
mtvm6.eastus.cloudapp.azure.com. 10 IN A 40.117.251.23
etc., so to me this looks OK ...
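The `host` lookup above only covers the forward direction. Since the agent log shows the hostname being read with `socket.getfqdn()`, it may also be worth checking that the IP resolves back to the same name. A rough sketch follows; `forward_reverse_check` is a hypothetical helper, and the resolver arguments exist only so it can be tried without real DNS.

```python
import socket


def forward_reverse_check(hostname,
                          resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name.

    Returns (ip, reverse_name, matches). matches is True when the reverse
    name equals the original hostname, ignoring case and a trailing dot.
    """
    ip = resolve(hostname)
    # gethostbyaddr returns (primary_name, aliases, addresses).
    reverse_name = reverse(ip)[0]

    def normalise(name):
        return name.rstrip(".").lower()

    return ip, reverse_name, normalise(reverse_name) == normalise(hostname)


# Example use on each node:
#   print(forward_reverse_check("mtvm6.eastus.cloudapp.azure.com"))
```

If `matches` comes back `False` on a node, the name the agent registers with may differ from the name entered in the Ambari install wizard.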
Created 05-17-2018 01:22 PM
Back to basics: did you validate the points below?
- Set Up Password-less SSH
- Enable NTP on the Cluster and on the Browser Host
- Check DNS and NSCD
- Configuring iptables
- Disable SELinux and PackageKit and check the umask Value
Also check the DNS resolution. The manual registration was successful on the Ambari host but failed on the 2 other nodes because the server just can't locate them.
Can you upload the ambari-server.log?
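Two of the checklist items (SELinux and the umask value) can be read off quickly on each node. The sketch below assumes the usual RHEL location `/etc/selinux/config`; the helper names are made up for illustration. As far as I recall, the Ambari documentation recommends a umask of 022 or 027, so treat those values as an assumption to confirm against the docs for your version.

```python
import os


def selinux_mode(config_text):
    """Return the SELINUX= value from /etc/selinux/config contents, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a key=value line.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "SELINUX":
            return value.strip().lower()
    return None


def current_umask():
    """Read the process umask and restore it immediately."""
    mask = os.umask(0)
    os.umask(mask)
    return mask


# Example use on a node:
#   with open("/etc/selinux/config") as handle:
#       print("SELinux:", selinux_mode(handle.read()))
#   print("umask: %03o" % current_umask())
```

Note that the config file shows the mode applied at boot; a node that was switched with `setenforce` since then may report something different at runtime.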
Created 05-17-2018 02:23 PM
Geoffrey, all done. How can I best upload the ambari-server.log from the Azure service? It is a huge file.
Created 05-17-2018 03:50 PM
Can you copy it to your local machine and trim it to the last 200 or so lines? Alternatively, upload it to an external website and provide the download link, because the file formats and sizes accepted here are restricted.
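Trimming the log to its last 200 lines can be done without loading the whole file into memory. Here is a small sketch; `tail_lines` is an illustrative name, and the path in the usage comment assumes the default Ambari server log location.

```python
from collections import deque


def tail_lines(path, count=200):
    """Return the last `count` lines of a text file as a list of strings."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in the log.
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines.
        return list(deque(handle, maxlen=count))


# Example use (default log location assumed):
#   lines = tail_lines("/var/log/ambari-server/ambari-server.log", 200)
#   with open("ambari-server-tail.log", "w") as out:
#       out.writelines(lines)
```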
Created 05-18-2018 02:14 PM
Hi Geoffrey, thanks so much. Can you please share your email address? So I will share the link to Google Drive with Ambari log files. Thanks, Matthias
Created 05-18-2018 02:49 PM
Created 05-18-2018 05:24 PM
That's quite bizarre. In the logs you attached I see that Ambari is trying to load 2 different repos (see below). Can you explain why?
CentOS 6
Could not load version definition for HDP-2.6 identified by http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0/HDP-2.6.4.0-91.xml. null
CentOS 7
Could not load version definition for HDP-2.6 identified by http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.5.0/HDP-2.6.5.0-292.xml. null
Check the current repos; you should see the ambari, HDP and HDP-UTILS repos.
$ ll /etc/yum.repos.d/
Validate the repo files and share their contents:
- cat /etc/yum.repos.d/HDP.repo
- cat /etc/yum.repos.d/ambari.repo
- cat /etc/yum.repos.d/HDP-UTILS.repo
Check the OS version so you can grab the correct repo. In my case it's CentOS 6:
$ cat /etc/redhat-release
CentOS release 6.9 (Final)
Delete the HDP and HDP-UTILS repos in /etc/yum.repos.d
# rm -f /etc/yum.repos.d/HDP*
Clean the repos
# yum clean all
Validate: you shouldn't see the HDP and HDP-UTILS repos anymore, only ambari.repo and the CentOS ones.
# yum repolist
For Ambari version 2.5 and above, see the supported choices (attached ambari-hdp-matrix.jpg).
Download the correct HDP repos
Use the OS version found above (in my case CentOS 6). The hdp.repo file for HDP 2.6 delivers both the HDP and HDP-UTILS repos.
$ wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.6.4.0/hdp.repo -O /tmp/hdp.repo
GPL repo
$ wget -nv http://public-repo-1.hortonworks.com/HDP-GPL/centos6/2.x/updates/2.6.4.0/hdp.gpl.repo -O /tmp/hdp.gpl.repo
After the above, place the downloaded repo files in /etc/yum.repos.d and revalidate; you should see the new HDP* repos.
# yum repolist
Now restart the cluster deployment
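To avoid grabbing a repo for the wrong OS generation, the major version can be read from `/etc/redhat-release` and used to build the download URL. This is only a sketch: the helper names are hypothetical, and the URL simply follows the pattern of the CentOS 6 link above with the `centosN` segment swapped, which is an assumption for other versions.

```python
import re


def os_major_version(release_text):
    """Extract the major version from /etc/redhat-release style text."""
    match = re.search(r"release\s+(\d+)", release_text)
    if match is None:
        raise ValueError("no release number found in %r" % release_text)
    return int(match.group(1))


def hdp_repo_url(release_text, hdp_version="2.6.4.0"):
    """Build the hdp.repo URL for the detected OS major version.

    Assumes the public-repo-1 layout seen in this thread; RHEL hosts use
    the centosN path as well.
    """
    major = os_major_version(release_text)
    return ("http://public-repo-1.hortonworks.com/HDP/centos%d/2.x/updates/%s/hdp.repo"
            % (major, hdp_version))


# Example use:
#   with open("/etc/redhat-release") as handle:
#       print(hdp_repo_url(handle.read()))
```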
Created 05-18-2018 06:12 PM
Geoffrey, I am running Red Hat RHEL 7, so don't you think I should opt for the CentOS 7 repos?
Created 05-18-2018 06:27 PM
Yes, definitely! That's why I wanted you to match my output to your OS version.
There was also this error. What is your processor?
"cannot resolve OS centos7-ppc to the supported ones: suse12,suse11,redhat7,debian7,redhat6,ubuntu14,ubuntu12. Family: null"
Please get back to me.
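The processor type can be reported with the standard library. The `-ppc` suffix in the error hints at a Power architecture entry, but whether it comes from the host itself or from the version definition file is not established in this thread, so the sketch below only answers the "what is your processor" part. `is_power_architecture` is an illustrative helper.

```python
import platform


def is_power_architecture(machine):
    """Return True for machine strings such as 'ppc64' or 'ppc64le'."""
    return machine.lower().startswith("ppc")


# platform.machine() returns e.g. 'x86_64' on a typical Azure VM.
machine = platform.machine()
print(machine, "Power" if is_power_architecture(machine) else "not Power")
```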
Created 05-18-2018 06:41 PM
Here is a list of the files in yum.repos.d. Why remove all HDP repos? I suggest eliminating the 2 public-repo-1... repos and leaving the hdp.repo (see RHEL 7 content below).
-rw-r--r--. 1 root root 306 May 31 2017 ambari.repo
-rw-r--r--. 1 root root 574 Jan 8 06:49 hdp.repo
-rw-r--r--. 1 root root 296 May 17 06:36 public-repo-1.hortonworks.com_HDP_centos7_2.x_updates_2.6.4.0_HDP-2.6.4.0-91.xml.repo
-rw-r--r--. 1 root root 242 May 17 06:09 public-repo-1.hortonworks.com_HDP-UTILS-1.1.0.22_repos_centos7.repo
-rw-r--r--. 1 root root 358 Jan 5 00:08 redhat.repo
-rw-r--r--. 1 root root 13955 Dec 3 2016 rh-cloud.repo
Here is the content of hdp.repo. It looks alright to me:
#VERSION_NUMBER=2.6.4.0-91
[HDP-2.6.4.0]
name=HDP Version - HDP-2.6.4.0
baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
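A repo file like the one above can be checked mechanically for mixed OS generations (for example a centos6 baseurl next to a centos7 one). The following is a minimal sketch using Python's standard `configparser`; the function names are made up for illustration, and the check only looks at `centosN` path segments in `baseurl`, which is an assumption about how these URLs are laid out.

```python
import configparser
import re


def repo_base_urls(repo_text):
    """Map each repo section name to its baseurl."""
    # interpolation=None keeps any '%' or '$' in URLs literal;
    # strict=False tolerates repeated section names.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(repo_text)
    return {
        section: parser.get(section, "baseurl")
        for section in parser.sections()
        if parser.has_option(section, "baseurl")
    }


def os_tags(urls):
    """Collect the distinct centosN path segments found in the URLs."""
    tags = set()
    for url in urls:
        tags.update(re.findall(r"/(centos\d+)(?:/|$)", url))
    return tags


# Example use:
#   with open("/etc/yum.repos.d/hdp.repo") as handle:
#       urls = repo_base_urls(handle.read())
#   print(urls, os_tags(urls.values()))
```

More than one tag in the result would mean the file mixes OS generations. Running the same check over every file in /etc/yum.repos.d would also show whether the two public-repo-1... files duplicate what hdp.repo already defines.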