Ambari Metrics Monitor failed to start

Ambari Metrics Monitor failed to start on one of my 2 hosts. Here is the error:

stderr:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 58, in <module>
    AmsMonitor().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 28, in install
    self.install_packages(env, exclude_packages = ['ambari-metrics-collector', 'ambari-metrics-grafana'])
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 410, in install_packages
    retry_count=agent_stack_retry_count)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
    self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
    self.checked_call_with_retries(cmd, sudo=True, logoutput=self.get_logoutput())
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 83, in checked_call_with_retries
    return self._call_with_retries(cmd, is_checked=True, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
    code, out = func(cmd, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor' returned 1. Error: Nothing to do

stdout:
2018-04-03 05:48:24,805 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.6.0-3796
2018-04-03 05:48:24,805 - Checking if need to create versioned conf dir /etc/hadoop/2.3.6.0-3796/0
2018-04-03 05:48:24,805 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.6.0-3796 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2018-04-03 05:48:24,837 - call returned (1, '/etc/hadoop/2.3.6.0-3796/0 exist already', '')
2018-04-03 05:48:24,837 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.6.0-3796 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
2018-04-03 05:48:24,868 - checked_call returned (0, '')
2018-04-03 05:48:24,868 - Ensuring that hadoop has the correct symlink structure
2018-04-03 05:48:24,869 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2018-04-03 05:48:24,871 - Group['hadoop'] {}
2018-04-03 05:48:24,873 - Group['users'] {}
2018-04-03 05:48:24,873 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,875 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,876 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,877 - User['oozie'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,878 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,879 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2018-04-03 05:48:24,880 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,881 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,882 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,883 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,884 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2018-04-03 05:48:24,885 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-04-03 05:48:24,888 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-04-03 05:48:24,893 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2018-04-03 05:48:24,894 - Group['hdfs'] {}
2018-04-03 05:48:24,894 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'hdfs']}
2018-04-03 05:48:24,895 - FS Type:
2018-04-03 05:48:24,895 - Directory['/etc/hadoop'] {'mode': 0755}
2018-04-03 05:48:24,919 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-04-03 05:48:24,920 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777}
2018-04-03 05:48:24,937 - Repository['HDP-2.3'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.6.0', 'action': ['create'], 'components': ['HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP', 'mirror_list': None}
2018-04-03 05:48:24,949 - File['/etc/yum.repos.d/HDP.repo'] {'content': '[HDP-2.3]\nname=HDP-2.3\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.6.0\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-04-03 05:48:24,950 - Repository['HDP-UTILS-1.1.0.20'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6', 'action': ['create'], 'components': ['HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None}
2018-04-03 05:48:24,955 - File['/etc/yum.repos.d/HDP-UTILS.repo'] {'content': '[HDP-UTILS-1.1.0.20]\nname=HDP-UTILS-1.1.0.20\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-04-03 05:48:24,956 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,089 - Skipping installation of existing package unzip
2018-04-03 05:48:25,089 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,109 - Skipping installation of existing package curl
2018-04-03 05:48:25,110 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,130 - Skipping installation of existing package hdp-select
2018-04-03 05:48:25,355 - Package['ambari-metrics-monitor'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-03 05:48:25,488 - Installing package ambari-metrics-monitor ('/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor')



@Krupal Jagtap

The problem is a version mismatch: the Ambari version and the Ambari Metrics component version should be the same.

# rpm -qa | grep ambari
 ambari-metrics-collector-2.1.0-1470.x86_64
 ambari-metrics-monitor-2.1.0-1470.x86_64
 ambari-server-2.2.2.0-460.x86_64 
 ambari-metrics-hadoop-sink-2.1.0-1470.x86_64 
 ambari-agent-2.2.2.0-460.x86_64
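
To spot this kind of mismatch quickly on each host, something like the following could help. It is only a sketch: check_ams_versions is a hypothetical helper name, and it assumes that matching Ambari and AMS packages carry exactly the same version string as ambari-agent. It reads package names in the form printed by rpm -qa from stdin and reports any ambari-metrics-* package whose version differs from the agent's.

```shell
# Sketch: flag ambari-metrics-* packages whose version differs from ambari-agent.
# check_ams_versions is a hypothetical helper, not part of Ambari.
# Input: package names as printed by `rpm -qa`, one per line, on stdin.
# Exit status: 0 = all match, 1 = at least one mismatch, 2 = no ambari-agent seen.
check_ams_versions() {
  awk '
    # Turn "name-VERSION-RELEASE.ARCH" into "VERSION".
    function version(pkg,   v) {
      v = pkg
      sub(/-[^-]+$/, "", v)   # drop the trailing "-RELEASE.ARCH" part
      sub(/^.*-/, "", v)      # keep only the last dash-separated field
      return v
    }
    # $1 is the package name with surrounding whitespace already stripped.
    $1 ~ /^ambari-agent-/   { agent = version($1) }
    $1 ~ /^ambari-metrics-/ { ams[$1] = version($1) }
    END {
      if (agent == "") {
        print "ambari-agent not found in input" > "/dev/stderr"
        exit 2
      }
      bad = 0
      for (pkg in ams) {
        if (ams[pkg] != agent) {
          printf "MISMATCH %s (ambari-agent is %s)\n", pkg, agent
          bad = 1
        }
      }
      exit bad
    }
  '
}
```

On a host it would be fed with: rpm -qa | grep ambari | check_ams_versions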


It looks like you have not performed the Ambari post-upgrade steps, so your AMS binaries are still old (2.1.0), whereas the Ambari binaries are 2.2.2.

1. Stop the AMS Collector service from the Ambari UI, then perform the AMS post-upgrade steps: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_upgrading_Ambari/content/_upgrade_ambari...

Please verify that you have the correct ambari.repo (the repo should be from version 2.2.2, NOT from 2.1.0):

# grep -F '2.2.2' /etc/yum.repos.d/ambari.repo
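
If you want a stricter check than the grep above, a small helper can look only at the baseurl line of the repo file. This is a sketch under assumptions: repo_points_at is a hypothetical name, and it assumes the repo file uses a baseurl= entry (not a mirrorlist) whose URL contains the version number.

```shell
# Sketch: succeed only if a baseurl line in the given .repo file contains
# the expected version string (fixed-string match, so dots are literal).
# repo_points_at is a hypothetical helper, not an Ambari or yum command.
# $1: path to the .repo file   $2: version string expected in the baseurl
repo_points_at() {
  grep -E '^[[:space:]]*baseurl[[:space:]]*=' "$1" | grep -qF -- "$2"
}
```

For example: repo_points_at /etc/yum.repos.d/ambari.repo 2.2.2 && echo "repo OK"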

2. Do this on all hosts where the AMS binary version is 2.1.0:

# yum clean all
# yum upgrade ambari-metrics-monitor ambari-metrics-hadoop-sink


3. On the host where the Ambari Metrics Collector is installed, do this:

# yum upgrade ambari-metrics-collector
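
The per-host commands from steps 2 and 3 can be collected in one place. The sketch below only prints the commands, it does not run them, so you can review the list before executing anything as root. ams_upgrade_commands is a hypothetical helper name, and it assumes the AMS Collector has already been stopped from the Ambari UI as in step 1.

```shell
# Sketch: print (do not run) the yum commands from steps 2 and 3 for one host.
# ams_upgrade_commands is a hypothetical helper, not part of Ambari.
# $1: pass "collector" on the host that runs the Ambari Metrics Collector.
ams_upgrade_commands() {
  # Step 2: needed on every host that still has the old AMS binaries.
  echo "yum clean all"
  echo "yum upgrade ambari-metrics-monitor ambari-metrics-hadoop-sink"
  # Step 3: only on the Ambari Metrics Collector host.
  if [ "${1:-}" = "collector" ]; then
    echo "yum upgrade ambari-metrics-collector"
  fi
}
```

Run ams_upgrade_commands on a monitor-only host, or ams_upgrade_commands collector on the Collector host, and execute the printed lines once they look right.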



It worked. Thanks a ton, Jay Kumar SenSharma. I appreciate your patience.