Created 04-03-2018 11:52 AM
Ambari Metrics Monitor failed to start on one of my 2 hosts. Here is the error:
stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 58, in <module>
    AmsMonitor().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 28, in install
    self.install_packages(env, exclude_packages = ['ambari-metrics-collector', 'ambari-metrics-grafana'])
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 410, in install_packages
    retry_count=agent_stack_retry_count)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
    self.checked_call_with_retries(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 83, in checked_call_with_retries
    return self._call_with_retries(cmd, is_checked=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
    code, out = func(cmd, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor' returned 1. Error: Nothing to do

stdout:
2018-04-03 05:48:24,805 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.6.0-3796
2018-04-03 05:48:24,805 - Checking if need to create versioned conf dir /etc/hadoop/2.3.6.0-3796/0
2018-04-03 05:48:24,805 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.6.0-3796 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2018-04-03 05:48:24,837 - call returned (1, '/etc/hadoop/2.3.6.0-3796/0 exist already', '')
2018-04-03 05:48:24,837 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.6.0-3796 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
2018-04-03 05:48:24,868 - checked_call returned (0, '')
2018-04-03 05:48:24,868 - Ensuring that hadoop has the correct symlink structure
2018-04-03 05:48:24,869 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2018-04-03 05:48:24,871 - Group['hadoop'] {}
2018-04-03 05:48:24,873 - Group['users'] {}
2018-04-03 05:48:24,873 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,875 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,876 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,877 - User['oozie'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,878 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,879 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,880 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,881 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,882 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,883 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,884 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,885 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-04-03 05:48:24,888 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-04-03 05:48:24,893 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2018-04-03 05:48:24,894 - Group['hdfs'] {}
2018-04-03 05:48:24,894 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'hdfs']}
2018-04-03 05:48:24,895 - FS Type:
2018-04-03 05:48:24,895 - Directory['/etc/hadoop'] {'mode': 0755}
2018-04-03 05:48:24,919 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-04-03 05:48:24,920 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777}
2018-04-03 05:48:24,937 - Repository['HDP-2.3'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.6.0', 'action': ['create'], 'components': ['HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP', 'mirror_list': None}
2018-04-03 05:48:24,949 - File['/etc/yum.repos.d/HDP.repo'] {'content': '[HDP-2.3]\nname=HDP-2.3\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.6.0\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-04-03 05:48:24,950 - Repository['HDP-UTILS-1.1.0.20'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6', 'action': ['create'], 'components': ['HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None}
2018-04-03 05:48:24,955 - File['/etc/yum.repos.d/HDP-UTILS.repo'] {'content': '[HDP-UTILS-1.1.0.20]\nname=HDP-UTILS-1.1.0.20\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-04-03 05:48:24,956 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,089 - Skipping installation of existing package unzip
2018-04-03 05:48:25,089 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,109 - Skipping installation of existing package curl
2018-04-03 05:48:25,110 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,130 - Skipping installation of existing package hdp-select
2018-04-03 05:48:25,355 - Package['ambari-metrics-monitor'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,488 - Installing package ambari-metrics-monitor ('/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor')
Created 04-05-2018 07:48 AM
The problem is a version mismatch: the Ambari version and the Ambari Metrics component versions should be the same.
# rpm -qa | grep ambari
ambari-metrics-collector-2.1.0-1470.x86_64
ambari-metrics-monitor-2.1.0-1470.x86_64
ambari-server-2.2.2.0-460.x86_64
ambari-metrics-hadoop-sink-2.1.0-1470.x86_64
ambari-agent-2.2.2.0-460.x86_64
It looks like you have not performed the Ambari post-upgrade steps, so your AMS binaries are still old (2.1.0), whereas the Ambari binaries are 2.2.2.
1. Stop the AMS Collector service from the Ambari UI, then perform the AMS post-upgrade steps: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_upgrading_Ambari/content/_upgrade_ambari...
Before upgrading, please verify that you have the correct ambari.repo (the repo should be for version 2.2.2, NOT 2.1.0):
# cat /etc/yum.repos.d/ambari.repo | grep 2.2.2
2. On every host where the AMS binary version is 2.1.0, run:
# yum clean all
# yum upgrade ambari-metrics-monitor ambari-metrics-hadoop-sink
3. On the host where the Ambari Metrics Collector is installed, run:
# yum upgrade ambari-metrics-collector
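The version check above can be scripted. This is only a rough sketch: the function name `check_ams_versions` and the `MISMATCH` output format are made up for illustration, and it assumes package names in the usual `name-version-release.arch` form. It reads `rpm -qa | grep ambari` output on stdin and flags every ambari-metrics-* package whose version differs from ambari-agent.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ambari): compare ambari-metrics-* package
# versions against the ambari-agent version found in the same listing.
# Exit status: 0 = all match, 1 = at least one mismatch, 2 = no ambari-agent line.
check_ams_versions() {
  awk '
    match($0, /-[0-9]/) {
      # Split "name-version-release.arch" at the first "-<digit>".
      name = substr($0, 1, RSTART - 1)
      rest = substr($0, RSTART + 1)
      split(rest, parts, "-")          # parts[1] is the version string
      ver[name] = parts[1]
    }
    END {
      ref = ver["ambari-agent"]
      if (ref == "") { print "ambari-agent not found"; exit 2 }
      bad = 0
      for (n in ver) {
        # Appending "" forces a string comparison rather than a numeric one.
        if (n ~ /^ambari-metrics-/ && (ver[n] "") != (ref "")) {
          printf "MISMATCH %s %s (ambari-agent is %s)\n", n, ver[n], ref
          bad = 1
        }
      }
      exit bad
    }'
}
```

On a host it could be run as `rpm -qa | grep ambari | check_ams_versions`. With the package list shown in this thread it would flag the three 2.1.0 AMS packages.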
Created 04-03-2018 11:59 AM
The following error indicates that the yum installation of the package is failing:
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor' returned 1. Error: Nothing to do
Please try installing the monitor binaries manually on that host as follows, then continue from the Ambari Server UI:
# yum clean all
# yum remove ambari-metrics-monitor
# yum install ambari-metrics-monitor
An incomplete yum transaction can sometimes cause this kind of issue, so it is better to remove the partially installed package first and then install it again.
Also, please check that the "/etc/yum.repos.d/ambari.repo" file on that host points to the correct repo URL and that the URL is accessible from the host:
# cat /etc/yum.repos.d/ambari.repo
# cat /etc/yum.repos.d/ambari.repo | grep baseurl
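If it helps, the repo check can be done in one pass. This is a minimal sketch, not an Ambari tool: `repo_baseurls` and `check_repo_urls` are hypothetical helper names, and it assumes curl is installed and that the repo serves the standard yum metadata file at `repodata/repomd.xml` under each baseurl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative only).

# Print every baseurl value found in a yum .repo file.
repo_baseurls() {
  sed -n 's/^[[:space:]]*baseurl[[:space:]]*=[[:space:]]*//p' "$1"
}

# For each baseurl, try to fetch the headers of repodata/repomd.xml.
# Needs curl and network access from the host being checked.
check_repo_urls() {
  repo_baseurls "$1" | while IFS= read -r url; do
    if curl -sfI --max-time 10 "${url%/}/repodata/repomd.xml" >/dev/null; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

For example, `check_repo_urls /etc/yum.repos.d/ambari.repo` prints one OK or FAIL line per baseurl. A FAIL points to a wrong URL or a network problem on that host.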
Created 04-03-2018 12:03 PM
Hi Jay, thanks for responding. I tried the commands and it showed:
# yum remove ambari-metrics-monitor
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Remove Process
No Match for argument: ambari-metrics-monitor
Loading mirror speeds from cached hostfile
 * base: mirror.den1.denvercolo.net
 * extras: mirror.raystedman.net
 * updates: mirror.compevo.com
No Packages marked for removal
It seems the packages are unavailable.
Created 04-03-2018 12:12 PM
The AMS monitor should come from the Ambari repo (not from the HDP repo), so please check whether the ambari.repo file is present on that host:
# cat /etc/yum.repos.d/ambari.repo
If it is not, please copy that repo file from the ambari-server host to this host.
Also, please make sure that the host where the AMS monitor installation is failing does not have any network issues, so that the yum installation of the AMS monitor can complete.
Created 04-03-2018 01:16 PM
Hi Jay, the repo file was missing, so I copied it to the host and the metrics monitor started. But it still doesn't show any data in the metrics GUI. The square boxes show "no data available".
Created 04-03-2018 01:33 PM
Good to know that at least the AMS monitor is now coming up.
Regarding the "No data available" issue, please share the "/var/log/ambari-metrics-monitor/ambari-metrics-monitor.out" file.
Also, please check whether the "/etc/ambari-metrics-monitor/conf/ambari-metrics-monitor.ini" file points to the correct hostname of the AMS Collector.
Created 04-03-2018 01:37 PM
Hi, the ambari-metrics-monitor.ini file seems to be pointing at the right host.
Here is the log from ambari-metrics-monitor.out:
2018-04-03 07:10:43,905 [INFO] host_info.py:294 - hostname_script: None
2018-04-03 07:10:43,970 [INFO] host_info.py:306 - Cached hostname: itxcqchdp01.catmdev.com
2018-04-03 07:10:43,970 [INFO] controller.py:102 - Adding event to cache, all : {u'metrics': [{u'value_threshold': u'128', u'name': u'bytes_out'}], u'collect_every': u'10'}
2018-04-03 07:10:43,970 [INFO] controller.py:110 - Adding event to cache, : {u'metrics': [], u'collect_every': u'15'}
2018-04-03 07:10:43,970 [INFO] main.py:65 - Starting Server RPC Thread: /usr/lib/python2.6/site-packages/resource_monitoring/main.py start
2018-04-03 07:10:43,971 [INFO] controller.py:57 - Running Controller thread: Thread-1
2018-04-03 07:10:43,971 [INFO] emitter.py:45 - Running Emitter thread: Thread-2
2018-04-03 07:10:43,971 [INFO] emitter.py:65 - Nothing to emit, resume waiting.
2018-04-03 07:11:43,973 [WARNING] emitter.py:74 - Error sending metrics to server. 'NoneType' object has no attribute 'strip'
2018-04-03 07:11:43,973 [WARNING] emitter.py:80 - Retrying after 5 ...
2018-04-03 07:11:48,974 [WARNING] emitter.py:74 - Error sending metrics to server. 'NoneType' object has no attribute 'strip'
2018-04-03 07:11:48,974 [WARNING] emitter.py:80 - Retrying after 5 ...
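When the log is long, a quick summary of the emitter failures can be easier to share than the whole file. This is just a sketch: `emitter_error_summary` is a hypothetical helper name, and it assumes the warning text keeps the "Error sending metrics to server." wording seen in the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count each distinct error message that follows
# "Error sending metrics to server." in a monitor log file.
emitter_error_summary() {
  grep -F 'Error sending metrics to server.' "$1" \
    | sed 's/.*Error sending metrics to server\. *//' \
    | sort | uniq -c | sort -rn
}
```

Running `emitter_error_summary /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out` prints one line per distinct message, prefixed by how many times it occurred.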
Created 04-04-2018 10:47 AM
Hi Jay, the delete option for the AMS monitor is disabled in the Ambari UI. How should I go about deleting it and reinstalling it?
Created 04-03-2018 01:47 PM
The following error can occur if the installed version of Ambari Metrics Monitor differs slightly from the Ambari version, which causes a library mismatch:
[WARNING] emitter.py:74 - Error sending metrics to server. 'NoneType' object has no attribute 'strip'
Please check the output of the following command and verify that all the Ambari component versions are the same:
# rpm -qa | grep ambari
It is better to install the AMS monitor afresh: from the Ambari UI, remove the Metrics Monitor component and then install it again.
Created 04-03-2018 01:55 PM
I tried the command you mentioned and it returned the following:
ambari-metrics-collector-2.1.0-1470.x86_64
ambari-metrics-monitor-2.1.0-1470.x86_64
ambari-server-2.2.2.0-460.x86_64
ambari-metrics-hadoop-sink-2.1.0-1470.x86_64
ambari-agent-2.2.2.0-460.x86_64
And if I had to remove the process and install it again, how would I go about doing it?