Created on 04-25-2018 10:38 AM - edited 04-25-2018 10:58 AM
After a data center outage, my Cloudera cluster running CDH 5.8.5 was not functioning properly, so I took these steps:
Any help would be appreciated here. Thank you!
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO SCM Agent Version: 5.8.2
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Agent Protocol Version: 4
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Using Host ID: 842ca2b2-6e0a-422b-be27-1f8270defb60
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Using directory: /run/cloudera-scm-agent
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Using supervisor binary path: /usr/lib/cmf/agent/build/env/bin/supervisord
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Neither verify_cert_file nor verify_cert_dir are configured. Not performing validation of server certificates in HTTPS communication. These options can be configured in this agent's config.ini file to enable certificate validation.
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Agent Logging Level: INFO
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO No command line vars
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Found database jar: /usr/share/java/mysql-connector-java.jar
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type)
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
[25/Apr/2018 17:25:41 +0000] 3508 MainThread agent INFO Agent starting as pid 3508 user root(0) group root(0).
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/cgroups
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Found cgroups subsystem: cpu
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO cgroup pseudofile /tmp/tmpmddUFx/cpu.rt_runtime_us does not exist, skipping
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Found cgroups subsystem: cpuacct
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Found cgroups subsystem: memory
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Found cgroups subsystem: blkio
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/memory
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpu
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpuacct
[25/Apr/2018 17:25:43 +0000] 3508 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/blkio
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Found cgroups capabilities: {'has_memory': True, 'default_memory_limit_in_bytes': -1, 'default_memory_soft_limit_in_bytes': -1, 'writable_cgroup_dot_procs': True, 'default_cpu_rt_runtime_us': -1, 'has_cpu': True, 'default_blkio_weight': 1000, 'default_cpu_shares': 1024, 'has_cpuacct': True, 'has_blkio': True}
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Setting up supervisord event monitor.
[25/Apr/2018 17:25:43 +0000] 3508 MainThread filesystem_map INFO Monitored nodev filesystem types: ['nfs', 'nfs4', 'tmpfs']
[25/Apr/2018 17:25:43 +0000] 3508 MainThread filesystem_map INFO Using timeout of 2.000000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread filesystem_map INFO Using join timeout of 0.100000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread filesystem_map INFO Using tolerance of 60.000000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread filesystem_map INFO Local filesystem types whitelist: ['ext2', 'ext3', 'ext4']
[25/Apr/2018 17:25:43 +0000] 3508 MainThread kt_renewer INFO Agent wide credential cache set to /run/cloudera-scm-agent/krb5cc_cm_agent_0
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Using metrics_url_timeout_seconds of 30.000000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Using task_metrics_timeout_seconds of 5.000000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread agent INFO Using max_collection_wait_seconds of 10.000000
[25/Apr/2018 17:25:43 +0000] 3508 MainThread metrics INFO Importing tasktracker metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/monitor/tasktracker/schema.json
[25/Apr/2018 17:25:43 +0000] 3508 MainThread ntp_monitor INFO Using timeout of 2.000000
[25/Apr/2018 17:25:44 +0000] 3508 MainThread dns_names INFO Using timeout of 30.000000
[25/Apr/2018 17:25:44 +0000] 3508 MainThread __init__ INFO Created DNS monitor.
[25/Apr/2018 17:25:44 +0000] 3508 MainThread stacks_collection_manager INFO Using max_uncompressed_file_size_bytes: 5242880
[25/Apr/2018 17:25:44 +0000] 3508 MainThread __init__ INFO Importing metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/monitor/schema.json
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'CMF_PACKAGE_DIR': '/usr/lib/cmf/service', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/usr/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'KEYTRUSTEE_KP_HOME': '/usr/share/keytrustee-keyprovider', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'KEYTRUSTEE_SERVER_HOME': '/usr/lib/keytrustee-server', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar', 'CDH_KMS_HOME': '/usr/lib/hadoop-kms', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HUE_HOME': '/usr/lib/hue', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SENTRY_HOME': '/usr/lib/sentry', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_LLAMA_HOME': '/usr/lib/llama/', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'CDH_HIVE_HOME': '/usr/lib/hive', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_KAFKA_HOME': '/usr/lib/kafka', 'CDH_SPARK_HOME': '/usr/lib/spark', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_FLUME_HOME': '/usr/lib/flume-ng'}
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/flood
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
[25/Apr/2018 17:25:44 +0000] 3508 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2084, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:44 +0000] 3508 MainThread tmpfs INFO Reusing mounted tmpfs at /run/cloudera-scm-agent/process
[25/Apr/2018 17:25:45 +0000] 3508 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[25/Apr/2018 17:25:45 +0000] 3508 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:46 +0000] 3508 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 2)
[25/Apr/2018 17:25:46 +0000] 3508 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:47 +0000] 3508 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 3)
[25/Apr/2018 17:25:47 +0000] 3508 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:48 +0000] 3508 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 4)
[25/Apr/2018 17:25:48 +0000] 3508 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:49 +0000] 3508 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 5)
[25/Apr/2018 17:25:49 +0000] 3508 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:25:49 +0000] 3508 MainThread agent ERROR Failed to connect to newly launched supervisor. Agent will exit
[25/Apr/2018 17:25:49 +0000] 3508 MainThread agent INFO Stopping agent...
[25/Apr/2018 17:25:49 +0000] 3508 MainThread agent INFO No extant cgroups; unmounting any cgroup roots
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO SCM Agent Version: 5.8.2
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Agent Protocol Version: 4
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Using Host ID: 842ca2b2-6e0a-422b-be27-1f8270defb60
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Using directory: /run/cloudera-scm-agent
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Using supervisor binary path: /usr/lib/cmf/agent/build/env/bin/supervisord
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Neither verify_cert_file nor verify_cert_dir are configured. Not performing validation of server certificates in HTTPS communication. These options can be configured in this agent's config.ini file to enable certificate validation.
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Agent Logging Level: INFO
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO No command line vars
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Found database jar: /usr/share/java/mysql-connector-java.jar
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type)
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
[25/Apr/2018 17:26:17 +0000] 4015 MainThread agent INFO Agent starting as pid 4015 user root(0) group root(0).
[25/Apr/2018 17:26:19 +0000] 4015 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/cgroups
[25/Apr/2018 17:26:19 +0000] 4015 MainThread cgroups INFO Found cgroups subsystem: cpu
[25/Apr/2018 17:26:19 +0000] 4015 MainThread cgroups INFO cgroup pseudofile /tmp/tmpodxfKH/cpu.rt_runtime_us does not exist, skipping
[25/Apr/2018 17:26:19 +0000] 4015 MainThread cgroups INFO Found cgroups subsystem: cpuacct
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Found cgroups subsystem: memory
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Found cgroups subsystem: blkio
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/memory
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpu
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpuacct
[25/Apr/2018 17:26:20 +0000] 4015 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/blkio
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Found cgroups capabilities: {'has_memory': True, 'default_memory_limit_in_bytes': -1, 'default_memory_soft_limit_in_bytes': -1, 'writable_cgroup_dot_procs': True, 'default_cpu_rt_runtime_us': -1, 'has_cpu': True, 'default_blkio_weight': 1000, 'default_cpu_shares': 1024, 'has_cpuacct': True, 'has_blkio': True}
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Setting up supervisord event monitor.
[25/Apr/2018 17:26:20 +0000] 4015 MainThread filesystem_map INFO Monitored nodev filesystem types: ['nfs', 'nfs4', 'tmpfs']
[25/Apr/2018 17:26:20 +0000] 4015 MainThread filesystem_map INFO Using timeout of 2.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread filesystem_map INFO Using join timeout of 0.100000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread filesystem_map INFO Using tolerance of 60.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread filesystem_map INFO Local filesystem types whitelist: ['ext2', 'ext3', 'ext4']
[25/Apr/2018 17:26:20 +0000] 4015 MainThread kt_renewer INFO Agent wide credential cache set to /run/cloudera-scm-agent/krb5cc_cm_agent_0
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Using metrics_url_timeout_seconds of 30.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Using task_metrics_timeout_seconds of 5.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Using max_collection_wait_seconds of 10.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread metrics INFO Importing tasktracker metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/monitor/tasktracker/schema.json
[25/Apr/2018 17:26:20 +0000] 4015 MainThread ntp_monitor INFO Using timeout of 2.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread dns_names INFO Using timeout of 30.000000
[25/Apr/2018 17:26:20 +0000] 4015 MainThread __init__ INFO Created DNS monitor.
[25/Apr/2018 17:26:20 +0000] 4015 MainThread stacks_collection_manager INFO Using max_uncompressed_file_size_bytes: 5242880
[25/Apr/2018 17:26:20 +0000] 4015 MainThread __init__ INFO Importing metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/monitor/schema.json
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'CMF_PACKAGE_DIR': '/usr/lib/cmf/service', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/usr/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'KEYTRUSTEE_KP_HOME': '/usr/share/keytrustee-keyprovider', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'KEYTRUSTEE_SERVER_HOME': '/usr/lib/keytrustee-server', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar', 'CDH_KMS_HOME': '/usr/lib/hadoop-kms', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HUE_HOME': '/usr/lib/hue', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SENTRY_HOME': '/usr/lib/sentry', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_LLAMA_HOME': '/usr/lib/llama/', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'CDH_HIVE_HOME': '/usr/lib/hive', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_KAFKA_HOME': '/usr/lib/kafka', 'CDH_SPARK_HOME': '/usr/lib/spark', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_FLUME_HOME': '/usr/lib/flume-ng'}
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/flood
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
[25/Apr/2018 17:26:20 +0000] 4015 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2084, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:20 +0000] 4015 MainThread tmpfs INFO Reusing mounted tmpfs at /run/cloudera-scm-agent/process
[25/Apr/2018 17:26:21 +0000] 4015 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[25/Apr/2018 17:26:21 +0000] 4015 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:22 +0000] 4015 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 2)
[25/Apr/2018 17:26:22 +0000] 4015 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:23 +0000] 4015 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 3)
[25/Apr/2018 17:26:23 +0000] 4015 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:24 +0000] 4015 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 4)
[25/Apr/2018 17:26:24 +0000] 4015 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:25 +0000] 4015 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 5)
[25/Apr/2018 17:26:25 +0000] 4015 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2208, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 2230, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 807, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[25/Apr/2018 17:26:25 +0000] 4015 MainThread agent ERROR Failed to connect to newly launched supervisor. Agent will exit
[25/Apr/2018 17:26:25 +0000] 4015 MainThread agent INFO Stopping agent...
[25/Apr/2018 17:26:25 +0000] 4015 MainThread agent INFO No extant cgroups; unmounting any cgroup roots
I also found this output in /var/log/cloudera-scm-agent/supervisord.out. Apparently supervisord is failing at startup with an ImportError because the module diewithparent cannot be found.
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/bin/supervisord", line 12, in <module>
    load_entry_point('supervisor==3.0', 'console_scripts', 'supervisord')()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 558, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2682, in load_entry_point
    return ep.load()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2355, in load
    return self.resolve()
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2361, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/supervisord.py", line 41, in <module>
    from supervisor.options import ServerOptions
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/options.py", line 24, in <module>
    import diewithparent
ImportError: No module named diewithparent
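Since the traceback ends at `import diewithparent`, one way to narrow this down might be to check which modules the agent's bundled Python can actually import. The following is only a minimal sketch: `missing_modules` is a hypothetical helper name, and it assumes you run the script with the interpreter from the agent's environment (I would guess something like `/usr/lib/cmf/agent/build/env/bin/python`, next to the `supervisord` script in the traceback, but that path is an assumption and not confirmed by the logs).

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that this interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is absent or not on this interpreter's sys.path.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "supervisor" and "diewithparent" are the modules named in the traceback.
    # Run this with the agent's interpreter (path is an assumption, see above).
    print(missing_modules(["supervisor", "diewithparent"]))
```

If `diewithparent` shows up in the printed list while `supervisor` does not, that would point to a damaged or incomplete agent environment rather than a problem in supervisord itself.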
Created 04-26-2018 02:01 AM
Hi @brandonvin
Try running:
service cloudera-scm-agent hard_stop_confirmed
service cloudera-scm-agent start
Or
service cloudera-scm-agent hard_restart
Good luck.
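After restarting the agent, it can help to confirm that supervisord is accepting connections again, since the agent log above fails with `[Errno 111] Connection refused`. Below is a minimal sketch of such a probe. The helper name `is_listening` is hypothetical, and the address `127.0.0.1:19001` is an assumption about where the agent's supervisord listens; it does not appear in the logs, so check your own agent configuration before relying on it.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timeout: nothing usable is listening.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # ASSUMPTION: supervisord for the agent listens on 127.0.0.1:19001.
    print(is_listening("127.0.0.1", 19001))
```

A result of `False` right after a restart would suggest supervisord still is not starting, in which case /var/log/cloudera-scm-agent/supervisord.out is the next place to look.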
Created 07-31-2019 06:45 PM