Error while restarting ambari metrics monitor

Please find the logs 

HDP version - 3.0

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/AMBARI_METRICS/package/scripts/metrics_monitor.py", line 78, in <module>
    AmsMonitor().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 351, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/AMBARI_METRICS/package/scripts/metrics_monitor.py", line 43, in start
    action = 'start'
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/AMBARI_METRICS/package/scripts/ams_service.py", line 109, in ams_service
    user=params.ams_user
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-monitor --config /etc/ambari-metrics-monitor/conf start' returned 255. psutil build directory is not empty, continuing...
Verifying Python version compatibility...
Using python  /usr/bin/python2.7
Checking for previously running Metric Monitor...
/var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid found with no process. Removing 30543...
Starting ambari-metrics-monitor
Verifying ambari-metrics-monitor process status with PID : 8619
Output of PID check : 
ERROR: ambari-metrics-monitor start failed. For more details, see /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out:
====================
    rotateLog = logging.handlers.RotatingFileHandler(config.ams_monitor_log_file(), "a", 10000000, 25)
  File "/usr/lib64/python2.7/logging/handlers.py", line 117, in __init__
    BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/handlers.py", line 64, in __init__
    logging.FileHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/var/log/ambari-metrics-monitor/ambari-metrics-monitor.log'
====================
Monitor out at: /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out

stdout:   /var/lib/ambari-agent/data/output-5495.txt

2019-09-30 06:47:33,433 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.1.0-187 -> 3.0.1.0-187
2019-09-30 06:47:33,454 - Using hadoop conf dir: /usr/hdp/3.0.1.0-187/hadoop/conf
2019-09-30 06:47:33,692 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.1.0-187 -> 3.0.1.0-187
2019-09-30 06:47:33,699 - Using hadoop conf dir: /usr/hdp/3.0.1.0-187/hadoop/conf
2019-09-30 06:47:33,701 - Group['livy'] {}
2019-09-30 06:47:33,703 - Group['spark'] {}
2019-09-30 06:47:33,703 - Group['ranger'] {}
2019-09-30 06:47:33,703 - Group['nifiregistry'] {}
2019-09-30 06:47:33,703 - Group['hdfs'] {}
2019-09-30 06:47:33,704 - Group['hadoop'] {}
2019-09-30 06:47:33,704 - Group['nifi'] {}
2019-09-30 06:47:33,704 - Group['users'] {}
2019-09-30 06:47:33,705 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,707 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,708 - User['infra-solr'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,709 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,711 - User['superset'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,712 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,714 - User['ranger'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['ranger', 'hadoop'], 'uid': None}
2019-09-30 06:47:33,715 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2019-09-30 06:47:33,717 - User['nifiregistry'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['nifiregistry'], 'uid': None}
2019-09-30 06:47:33,718 - User['nifi'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['nifi'], 'uid': None}
2019-09-30 06:47:33,720 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['livy', 'hadoop'], 'uid': None}
2019-09-30 06:47:33,721 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['spark', 'hadoop'], 'uid': None}
2019-09-30 06:47:33,723 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2019-09-30 06:47:33,724 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop'], 'uid': None}
2019-09-30 06:47:33,726 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,727 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,729 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-09-30 06:47:33,730 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2019-09-30 06:47:33,732 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2019-09-30 06:47:33,741 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if
2019-09-30 06:47:33,742 - Group['hdfs'] {}
2019-09-30 06:47:33,742 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop', u'hdfs']}
2019-09-30 06:47:33,743 - FS Type: HDFS
2019-09-30 06:47:33,744 - Directory['/etc/hadoop'] {'mode': 0755}
2019-09-30 06:47:33,760 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2019-09-30 06:47:33,761 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2019-09-30 06:47:33,784 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2019-09-30 06:47:33,794 - Skipping Execute[('setenforce', '0')] due to not_if
2019-09-30 06:47:33,795 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2019-09-30 06:47:33,798 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2019-09-30 06:47:33,799 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'cd_access': 'a'}
2019-09-30 06:47:33,799 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2019-09-30 06:47:33,806 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2019-09-30 06:47:33,809 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2019-09-30 06:47:33,820 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2019-09-30 06:47:33,842 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2019-09-30 06:47:33,844 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2019-09-30 06:47:33,846 - File['/usr/hdp/3.0.1.0-187/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2019-09-30 06:47:33,854 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644}
2019-09-30 06:47:33,861 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2019-09-30 06:47:33,868 - Skipping unlimited key JCE policy check and setup since it is not required
2019-09-30 06:47:33,879 - Skipping stack-select on AMBARI_METRICS because it does not exist in the stack-select package structure.
2019-09-30 06:47:34,219 - Using hadoop conf dir: /usr/hdp/3.0.1.0-187/hadoop/conf
2019-09-30 06:47:34,223 - checked_call['hostid'] {}
2019-09-30 06:47:34,228 - checked_call returned (0, '310aafc2')
2019-09-30 06:47:34,231 - Directory['/etc/ambari-metrics-monitor/conf'] {'owner': 'ams', 'group': 'hadoop', 'create_parents': True}
2019-09-30 06:47:34,233 - Directory['/var/log/ambari-metrics-monitor'] {'owner': 'ams', 'group': 'hadoop', 'create_parents': True, 'mode': 0755}
2019-09-30 06:47:34,234 - Execute['ambari-sudo.sh chown -R ams:hadoop /var/log/ambari-metrics-monitor'] {}
2019-09-30 06:47:34,242 - Directory['/var/run/ambari-metrics-monitor'] {'owner': 'ams', 'group': 'hadoop', 'create_parents': True, 'mode': 0755, 'cd_access': 'a'}
2019-09-30 06:47:34,243 - Directory['/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build'] {'owner': 'ams', 'group': 'hadoop', 'create_parents': True, 'cd_access': 'a'}
2019-09-30 06:47:34,244 - Execute['ambari-sudo.sh chown -R ams:hadoop /usr/lib/python2.6/site-packages/resource_monitoring'] {}
2019-09-30 06:47:34,254 - TemplateConfig['/etc/ambari-metrics-monitor/conf/metric_monitor.ini'] {'owner': 'ams', 'template_tag': None, 'group': 'hadoop'}
2019-09-30 06:47:34,263 - File['/etc/ambari-metrics-monitor/conf/metric_monitor.ini'] {'content': Template('metric_monitor.ini.j2'), 'owner': 'ams', 'group': 'hadoop', 'mode': None}
2019-09-30 06:47:34,263 - TemplateConfig['/etc/ambari-metrics-monitor/conf/metric_groups.conf'] {'owner': 'ams', 'template_tag': None, 'group': 'hadoop'}
2019-09-30 06:47:34,265 - File['/etc/ambari-metrics-monitor/conf/metric_groups.conf'] {'content': Template('metric_groups.conf.j2'), 'owner': 'ams', 'group': 'hadoop', 'mode': None}
2019-09-30 06:47:34,271 - File['/etc/ambari-metrics-monitor/conf/ams-env.sh'] {'content': InlineTemplate(...), 'owner': 'ams'}
2019-09-30 06:47:34,277 - Directory['/usr/lib/ambari-logsearch-logfeeder/conf'] {'create_parents': True, 'mode': 0755, 'cd_access': 'a'}
2019-09-30 06:47:34,278 - Generate Log Feeder config file: /usr/lib/ambari-logsearch-logfeeder/conf/input.config-ambari-metrics.json
2019-09-30 06:47:34,278 - File['/usr/lib/ambari-logsearch-logfeeder/conf/input.config-ambari-metrics.json'] {'content': Template('input.config-ambari-metrics.json.j2'), 'mode': 0644}
2019-09-30 06:47:34,280 - Execute['/usr/sbin/ambari-metrics-monitor --config /etc/ambari-metrics-monitor/conf start'] {'user': 'ams'}
2019-09-30 06:47:36,459 - Execute['find /var/log/ambari-metrics-monitor -maxdepth 1 -type f -name '*' -exec echo '==> {} <==' \; -exec tail -n 40 {} \;'] {'logoutput': True, 'ignore_failures': True, 'user': 'ams'}
==> /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out <==
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/var/log/ambari-metrics-monitor/ambari-metrics-monitor.log'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 107, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 57, in main
    server_process_main(stop_handler)
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 63, in server_process_main
    _init_logging(main_config)
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 101, in _init_logging
    rotateLog = logging.handlers.RotatingFileHandler(config.ams_monitor_log_file(), "a", 10000000, 25)
  File "/usr/lib64/python2.7/logging/handlers.py", line 117, in __init__
    BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/handlers.py", line 64, in __init__
    logging.FileHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/var/log/ambari-metrics-monitor/ambari-metrics-monitor.log'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 107, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 57, in main
    server_process_main(stop_handler)
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 63, in server_process_main
    _init_logging(main_config)
  File "/usr/lib/python2.6/site-packages/resource_monitoring/main.py", line 101, in _init_logging
    rotateLog = logging.handlers.RotatingFileHandler(config.ams_monitor_log_file(), "a", 10000000, 25)
  File "/usr/lib64/python2.7/logging/handlers.py", line 117, in __init__
    BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/handlers.py", line 64, in __init__
    logging.FileHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/var/log/ambari-metrics-monitor/ambari-metrics-monitor.log'
==> /var/log/ambari-metrics-monitor/ambari-metrics-monitor.log.4 <==
2019-01-05 14:19:19,844 [INFO] emitter.py:210 - Calculated collector shard based on hostname : xxxxxxxxx

 

The host above is unable to write logs to /var/log/ambari-metrics-monitor, and I don't see a log file for today.

total 42M
-rw-r--r-- 1 ams hadoop 9.6M Jan 5 2019 ambari-metrics-monitor.log.4
-rw-r--r-- 1 ams hadoop 9.6M Mar 7 2019 ambari-metrics-monitor.log.3
-rw-r--r-- 1 ams hadoop 9.6M May 7 11:34 ambari-metrics-monitor.log.2
-rw-r--r-- 1 ams hadoop 9.6M Aug 2 21:06 ambari-metrics-monitor.log.1
-r--r--r-- 1 ams hadoop 3.1M Aug 22 13:43 ambari-metrics-monitor.log
-rw-r--r-- 1 ams hadoop 9.3K Sep 30 06:47 ambari-metrics-monitor.out

Master Mentor

@saivenkatg55 


As we can see, the error indicates a permission issue:

IOError: [Errno 13] Permission denied: '/var/log/ambari-metrics-monitor/ambari-metrics-monitor.log'

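The permissions on the file are not right: it has no write permission, not even for its owner (ams). The chown -R ams:hadoop that Ambari runs before the start (visible in your stdout log) only resets ownership, not the mode, so the file stays read-only: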

-r--r--r-- 1 ams hadoop 3.1M Aug 22 13:43 ambari-metrics-monitor.log
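
For anyone who wants to script this check, here is a minimal Python sketch (not part of Ambari) that reports whether a file's owner write bit is set. The helper names `owner_can_write` and `mode_string` are made up for illustration, and the demo uses a temporary file set to mode 0444 as a stand-in for the real log, which reproduces the read-only listing above.

```python
import os
import stat
import tempfile


def owner_can_write(path):
    """Return True if the owner write bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWUSR)


def mode_string(path):
    """Return the ls-style permission string for path."""
    return stat.filemode(os.stat(path).st_mode)


if __name__ == "__main__":
    # Stand-in for ambari-metrics-monitor.log: a temp file made read-only.
    fd, demo_path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        os.chmod(demo_path, 0o444)
        print(mode_string(demo_path), owner_can_write(demo_path))
    finally:
        os.remove(demo_path)
```

Checking the mode bits rather than attempting a write keeps the result the same even when the script is run as root, which bypasses file permission checks.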

Please run the following command to make the file writable by its owner, then restart the Metrics Monitor:

# chmod 644 /var/log/ambari-metrics-monitor/ambari-metrics-monitor.log
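
The same change can be expressed in Python with `os.chmod`. The sketch below is only an illustration under assumptions: `make_log_writable` and `open_rotating_log` are hypothetical helpers, and the demo runs against a temporary file rather than the real log. Once the mode is 0644, it opens the file with the same `RotatingFileHandler` arguments that appear in the traceback, which is the call that was failing.

```python
import logging
import logging.handlers
import os
import stat
import tempfile


def make_log_writable(path):
    """Set mode 0644 on path (the equivalent of `chmod 644`) and return the new mode string."""
    os.chmod(path, 0o644)
    return stat.filemode(os.stat(path).st_mode)


def open_rotating_log(path):
    """Open path as in the traceback: append mode, 10000000-byte limit, 25 backups."""
    return logging.handlers.RotatingFileHandler(path, "a", 10000000, 25)


if __name__ == "__main__":
    # Hypothetical stand-in for the monitor log, starting out read-only.
    fd, demo_path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        os.chmod(demo_path, 0o444)
        print(make_log_writable(demo_path))
        handler = open_rotating_log(demo_path)
        handler.close()
    finally:
        os.remove(demo_path)
```

On the cluster itself the chmod command above is all that is needed; it has to be run as root or as the ams user, since only the owner or root can change a file's mode.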


Thanks! It is working now.

Master Mentor

@saivenkatg55 

Good to know that your issue is resolved.
If your question has been answered, please mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking the thumbs-up button.