Created 02-08-2017 01:34 PM
/var/log/ambari-agent/ambari-agent.log
INFO 2017-02-08 19:02:12,700 logger.py:71 - call returned (0, '')
INFO 2017-02-08 19:02:12,700 logger.py:71 - call[['test', '-w', '/home']] {'sudo': True, 'timeout': 5}
INFO 2017-02-08 19:02:12,705 logger.py:71 - call returned (0, '')
INFO 2017-02-08 19:02:12,705 logger.py:71 - call[['test', '-w', '/run/user/1000']] {'sudo': True, 'timeout': 5}
INFO 2017-02-08 19:02:12,710 logger.py:71 - call returned (0, '')
ERROR 2017-02-08 19:02:43,707 script_alert.py:119 - [Alert][hive_webhcat_server_status] Failed with result CRITICAL: ['Connection failed to http://localhost.localdomain:50111/templeton/v1/status?user.name=ambari-qa + \nTraceback (most recent call last):\n File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py", line 190, in execute\n url_response = urllib2.urlopen(query_url, timeout=connection_timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen\n return opener.open(url, data, timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 431, in open\n response = self._open(req, data)\n File "/usr/lib64/python2.7/urllib2.py", line 449, in _open\n \'_open\', req)\n File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain\n result = func(*args)\n File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open\n return self.do_open(httplib.HTTPConnection, req)\n File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open\n raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n']
ERROR 2017-02-08 19:02:43,710 script_alert.py:119 - [Alert][yarn_nodemanager_health] Failed with result CRITICAL: ['Connection failed to http://localhost.localdomain:8042/ws/v1/node/info (Traceback (most recent call last):\n File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py", line 171, in execute\n url_response = urllib2.urlopen(query, timeout=connection_timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen\n return opener.open(url, data, timeout)\n File "/usr/lib64/python2.7/urllib2.py", line 431, in open\n response = self._open(req, data)\n File "/usr/lib64/python2.7/urllib2.py", line 449, in _open\n \'_open\', req)\n File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain\n result = func(*args)\n File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open\n return self.do_open(httplib.HTTPConnection, req)\n File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open\n raise URLError(err)\nURLError: <urlopen error [Errno 111] Connection refused>\n)']
WARNING 2017-02-08 19:02:43,714 base_alert.py:134 - [Alert][namenode_directory_status] Unable to execute alert. [Alert][namenode_directory_status] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,715 base_alert.py:134 - [Alert][datanode_health_summary] Unable to execute alert. [Alert][datanode_health_summary] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,718 base_alert.py:134 - [Alert][namenode_hdfs_blocks_health] Unable to execute alert. [Alert][namenode_hdfs_blocks_health] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,720 base_alert.py:134 - [Alert][namenode_hdfs_capacity_utilization] Unable to execute alert. [Alert][namenode_hdfs_capacity_utilization] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,725 base_alert.py:134 - [Alert][datanode_storage] Unable to execute alert. [Alert][datanode_storage] Unable to extract JSON from JMX response
INFO 2017-02-08 19:02:43,735 logger.py:71 - Mount point for directory /hadoop/hdfs/data is /
WARNING 2017-02-08 19:02:43,727 base_alert.py:134 - [Alert][namenode_rpc_latency] Unable to execute alert. [Alert][namenode_rpc_latency] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,733 base_alert.py:134 - [Alert][namenode_hdfs_pending_deletion_blocks] Unable to execute alert. [Alert][namenode_hdfs_pending_deletion_blocks] Unable to extract JSON from JMX response
WARNING 2017-02-08 19:02:43,743 base_alert.py:134 - [Alert][datanode_heap_usage] Unable to execute alert. [Alert][datanode_heap_usage] Unable to extract JSON from JMX response
INFO 2017-02-08 19:02:43,751 logger.py:71 - Pid file /var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid is empty or does not exist
ERROR 2017-02-08 19:02:43,752 script_alert.py:119 - [Alert][ams_metrics_monitor_process] Failed with result CRITICAL: ['Ambari Monitor is NOT running on localhost.localdomain']
INFO 2017-02-08 19:02:43,753 logger.py:71 - Execute['source /etc/oozie/conf/oozie-env.sh ; oozie admin -oozie http://localhost.localdomain:11000/oozie -status'] {'environment': None, 'user': 'oozie'}
ERROR 2017-02-08 19:02:58,979 script_alert.py:119 - [Alert][oozie_server_status] Failed with result CRITICAL: ["Execution of 'source /etc/oozie/conf/oozie-env.sh ; oozie admin -oozie http://localhost.localdomain:11000/oozie -status' returned 255. -bash: /etc/oozie/conf/oozie-env.sh: Too many levels of symbolic links\nConnection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 1 sec. Retry count = 1\nConnection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 2 sec. Retry count = 2\nConnection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 4 sec. Retry count = 3\nConnection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 8 sec. Retry count = 4\nError: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 4. Exception = Connection refused (Connection refused)"]
Created on 02-08-2017 01:37 PM - edited 08-18-2019 04:14 AM
Can't start any of the applications.
Created 02-08-2017 02:11 PM
Has the hostname changed by any chance, or was it "localhost.localdomain" earlier as well? Are you using a proper FQDN?
Failed with result CRITICAL: ['Connection failed to http://localhost.localdomain.
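A quick way to answer the FQDN question on the affected host is to look at what the system resolver reports. The following is only a minimal sketch: `looks_like_placeholder` is a hypothetical helper name, and the rule it applies (a short name or anything starting with "localhost" is not a real FQDN) is an assumption, not an Ambari check.

```python
import socket


def looks_like_placeholder(hostname):
    """Return True if the name is a short name or a localhost-style placeholder.

    Hypothetical helper: the rule used here is an assumption for illustration,
    not something Ambari itself enforces.
    """
    name = hostname.strip().lower().rstrip(".")
    if "." not in name:
        return True  # short name, not fully qualified
    return name.startswith("localhost.")


if __name__ == "__main__":
    # socket.getfqdn() returns the fully qualified name the resolver knows for this host
    fqdn = socket.getfqdn()
    verdict = "placeholder or short name" if looks_like_placeholder(fqdn) else "looks like an FQDN"
    print(fqdn, "->", verdict)
```

If this reports a placeholder, the hostname configuration (for example /etc/hosts) is worth reviewing before restarting the services.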
- Regarding the following error, please try reinstalling the Oozie clients on those hosts:
-bash: /etc/oozie/conf/oozie-env.sh: Too many levels of symbolic links
- Also, please check whether the output of the "hdp-select" command shows the correct version. You might want to refer to: https://community.hortonworks.com/questions/55149/hdp-250-too-many-levels-of-symbolic-links-when-ins...
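Before reinstalling, it may help to see which link actually loops. The sketch below walks each component of a path and follows symlinks one hop at a time. The function names are hypothetical, and relative link targets are resolved lexically, which is an approximation when a parent directory is itself a symlink.

```python
import os


def trace_symlinks(path, max_hops=40):
    """Follow a symlink chain hop by hop.

    Returns (chain, looped): the paths visited, and whether the chain
    revisits a path or exceeds max_hops.
    """
    current = os.path.abspath(path)
    chain = [current]
    seen = {current}
    for _ in range(max_hops):
        if not os.path.islink(current):
            return chain, False
        target = os.readlink(current)
        # Relative targets are interpreted against the directory holding the link
        current = os.path.normpath(os.path.join(os.path.dirname(current), target))
        chain.append(current)
        if current in seen:
            return chain, True
        seen.add(current)
    return chain, True


def find_symlink_loop(path):
    """Check every prefix of path and return the first looping chain, or None."""
    parts = os.path.abspath(path).split(os.sep)
    prefix = os.sep
    for part in parts[1:]:
        prefix = os.path.join(prefix, part)
        chain, looped = trace_symlinks(prefix)
        if looped:
            return chain
    return None


if __name__ == "__main__":
    # Path taken from the error message in the log
    loop = find_symlink_loop("/etc/oozie/conf/oozie-env.sh")
    print(" -> ".join(loop) if loop else "no symlink loop found")
```

Checking every prefix matters here because the loop may sit on a directory such as /etc/oozie/conf rather than on the oozie-env.sh file itself.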