Created on 01-18-2019 07:51 AM - edited 01-18-2019 07:51 AM
Hi,
After upgrading CM/CDH from 5.16.1 to 6.1, my hosts randomly and periodically show "Unknown Health" for a few seconds and then go back to green. I have not found any WARNING or ERROR in the logs of the hosts or of any service.
The entire cluster works without any issue, and the host inspection and network inspection both ran without any problem. I also synced the time/date a few times just in case, but I can still watch my hosts (and, because of the hosts, the services) go grey with "Unknown Health" and back to green at random for a few seconds.
Cloudera Management Service runs on one server with 14 cores and 28 GB of memory. I have checked the activity on this server and it is pretty idle, so this is not a busy cluster. Either way, this is the heap size for the monitoring roles:
Java Heap Size of Activity Monitor in Bytes: 2GB
Do you guys have any advice on how to diagnose the possible issue?
Many thanks.
Created 02-28-2019 03:22 AM
Created 02-21-2019 07:38 AM
How did you do the upgrade? I am trying the same as you, upgrading from 5.16.1 to 6.1, but I cannot start any service. This is my error:
0', u'expected_exitcodes': [], u'run_generation': 2, u'start_timeout_seconds': None, u'optional_tags': [u'hdfs-client-plugin', u'sentry-plugin'], u'parcels': {u'SPARK2': u'2.2.0.cloudera2-1.cdh5.12.0.p0.232957', u'CDH': u'5.16.1-1.cdh5.16.1.p0.3', u'KAFKA': u'3.1.0-1.3.1.0.p0.35'}}, {u'refresh_files': [], u'config_generation': 0, u'auto_restart': False, u'one_off': True, u'special_file_info': [], u'id': 8272, u'status_links': {}, u'extra_groups': [], u'environment': {u'HOST_STATISTICS_DIR': u'clouderapre-mgr.fintonic.com-caa7e6d9-d5ae-4ece-bb89-bc13333c9aa9-10.0.199.102-host-statistics', u'CM_HOST_NAME': u'clouderapre-mgr.fintonic.com', u'SET_PYTHON_PATH': u'true', u'TIMEOUT': u'60', u'REDACTION_RULES_FILE': u'redaction-rules.json', u'JAVA_HOME': u'/usr/java/jdk1.8.0_151'}, u'program': u'support/collect_host_stats.sh', u'arguments': [], u'resources': [], u'running': False, u'required_tags': [], u'user': u'root', u'group': u'root', u'name': u'collect-host-statistics', u'configuration_data': 'PK\x03\x04\x14\x00\x08\x08\x08\x00\x80"UN\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x14\x00\x00\x00redaction-rules.json\xad\x94AO\xc3 \x18\x86\xef\xfb\x15\x84\xa3\xab\x07\xafMv\x98v7\xa3f\xd3xh\xbb\x84\xb4\xdf&\xae\xb6\x0b\xe0\x8cY\xf6\xdf\x85\xe2`Y\xa8a`\x0f\xe4\xfbxy\xe9\xfb\x84\xc0~\x84\x10\xde\x01\xe3\xb4kq\x8an\x12\xd5\xb3\xcf\x06\xb8\xecr\xd9 
\xb4\xefG9]\x03\xaf\x18\xdd\n\xbd\x14\xcf\xa1&\x95@[\xc2\xf9W\xc7j\x8eV\xac\xfb@\xef\xbck\xd1\x8a\xaa\r\x92\xa3\xb1"\x1c\x16\xd0r*\xe8\x0e\xa4uE\x1a\x0eF\x15\x8c\xae\xd7\xc0\xd4\x96\xc7\xbd\xac\x95\x03a\xd5\x9b\xd2\n\xa3\x168G\xe5U\xaa\x06Y.\x0b\\\x8e\x0bl-\x0c\xb6\r\xa9\xe0\xdc\x93\xa2\x02\xdf\xbe<d\xf7\xb3\xeb\xf9,\x9b\xde=\xcf2\xe9\xeaM\x87\xe4\x12\xd0\t"mm\xba\xf4\x9f1\x8fZ\x9eN\xca|)3\x17\xf2+\xc7N<\x93\xe8\x8c\xebb\xaaS\xa6p\xa2a\x1e\x7f\x9ah\x16K\x12\xca1D\xe1\xcb\x10C\xf04],^\x1f\xe7Y\x12\x90\xdd\xe1\xb5\x00FT\xf7\xe5\x0f\x00\xb3\x0eEPl\xe0[\x1f\x83,BNA\xda\\\x0cr\xda\xe7\x0c\xd4\xdf#\xc2s\xa8\x18\x08\x9d_\xd7!\x08\xda\xe9\xa2\xd0\x8a\x0f\xc8o\x92\x08\x16\xe9\xaf\xa1\x15\x944\x9a\xc7\xf6!L\xd6\xed\xe2\xb2\xaa\x0f\xdbI\xb2\x08>\xd1m\xa0\xd5h}\x19B\xd5\x1b]@\xbd\xe0\xc3\xa2S\xc4\xdd\x17Z\x9b\x1bC\x83^\xe0\xde8pk\xa8\xd7\xfb\xabS\xb81\xe4X\x8e\x0e\xa3\x1fPK\x07\x08\xb9\x86\x81Pp\x01\x00\x00.\x08\x00\x00PK\x01\x02\x14\x00\x14\x00\x08\x08\x08\x00\x80"UN\xb9\x86\x81Pp\x01\x00\x00.\x08\x00\x00\x14\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00redaction-rules.jsonPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00B\x00\x00\x00\xb2\x01\x00\x00\x00\x00', u'expected_exitcodes': [], u'run_generation': 2, u'start_timeout_seconds': None, u'optional_tags': [], u'parcels': {}}], u'server_manages_parcels': True, u'heartbeat_interval': 15, u'parcels_directory': u'/opt/cloudera/parcels', u'host_id': u'caa7e6d9-d5ae-4ece-bb89-bc13333c9aa9', u'cm_guid': u'1e8e8289-ab7a-4a18-b08e-06b9afac0d73', u'eventserver_host': u'clouderapre-node1.fintonic.com', u'enabled_metric_reporters': [u'SPARK_ON_YARN-SPARK_YARN_HISTORY_SERVER', u'SPARK_YARN_HISTORY_SERVER', u'HBASE-HBASERESTSERVER', u'HBASERESTSERVER', u'SPARK', u'SPARK', u'ACCUMULO_C6-ACCUMULO_GC', u'ACCUMULO_GC', u'HBASE', u'HBASE', u'MGMT-EVENTSERVER', u'EVENTSERVER', u'KAFKA-KAFKA_BROKER', u'KAFKA_BROKER', u'HBASE-REGIONSERVER', u'REGIONSERVER', u'LUNA_KMS-HSMKP_LUNA', u'HSMKP_LUNA', u'THALES_KMS', u'THALES_KMS', 
u'IMPALA-IMPALAD', u'IMPALAD', u'AUTH-AUTHSERVER', u'AUTHSERVER', u'ISILON', u'ISILON', u'YARN-NODEMANAGER', u'NODEMANAGER', u'MAPREDUCE', u'MAPREDUCE', u'ACCUMULO16-ACCUMULO16_TRACER', u'ACCUMULO16_TRACER', u'KMS', u'KMS', u'ACCUMULO16-ACCUMULO16_MONITOR', u'ACCUMULO16_MONITOR', u'YARN-JOBHISTORY', u'JOBHISTORY', u'KEYTRUSTEE', u'KEYTRUSTEE', u'HDFS-JOURNALNODE', u'JOURNALNODE', u'KAFKA', u'KAFKA', u'SPARK-SPARK_HISTORY_SERVER', u'SPARK_HISTORY_SERVER', u'HDFS-NAMENODE', u'NAMENODE', u'MAPREDUCE-TASKTRACKER', u'TASKTRACKER', u'IMPALA-CATALOGSERVER', u'CATALOGSERVER', u'SPARK2_ON_YARN', u'SPARK2_ON_YARN', u'KUDU-KUDU_MASTER', u'KUDU_MASTER', u'LUNA_KMS', u'LUNA_KMS', u'HDFS-DSSDDATANODE', u'DSSDDATANODE', u'SENTRY', u'SENTRY', u'ACCUMULO16-ACCUMULO16_GC', u'ACCUMULO16_GC', u'MGMT-NAVIGATOR', u'NAVIGATOR', u'MGMT-TELEMETRYPUBLISHER', u'TELEMETRYPUBLISHER', u'HIVE', u'HIVE', u'ACCUMULO_C6-ACCUMULO_MASTER', u'ACCUMULO_MASTER', u'SQOOP-SQOOP_SERVER', u'SQOOP_SERVER', u'KAFKA-KAFKA_MIRROR_MAKER', u'KAFKA_MIRROR_MAKER', u'KUDU', u'KUDU', u'ACCUMULO16-ACCUMULO16_MASTER', u'ACCUMULO16_MASTER', u'OOZIE', u'OOZIE', u'SQOOP_CLIENT', u'SQOOP_CLIENT', u'OOZIE-OOZIE_SERVER', u'OOZIE_SERVER', u'HDFS-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'YARN', u'YARN', u'HDFS-NFSGATEWAY', u'NFSGATEWAY', u'HDFS-HTTPFS', u'HTTPFS', u'HUE-KT_RENEWER', u'KT_RENEWER', u'KEYTRUSTEE_SERVER-DB_ACTIVE', u'DB_ACTIVE', u'KS_INDEXER-HBASE_INDEXER', u'HBASE_INDEXER', u'ACCUMULO_C6-ACCUMULO_TSERVER', u'ACCUMULO_TSERVER', u'ACCUMULO16', u'ACCUMULO16', u'KEYTRUSTEE-KMS_KEYTRUSTEE', u'KMS_KEYTRUSTEE', u'SOLR-SOLR_SERVER', u'SOLR_SERVER', u'HOST', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_PASSIVE_SERVER', u'KEYTRUSTEE_PASSIVE_SERVER', u'IMPALA-STATESTORE', u'STATESTORE', u'HDFS-DATANODE', u'DATANODE', u'YARN-RESOURCEMANAGER', u'RESOURCEMANAGER', u'HUE-HUE_SERVER', u'HUE_SERVER', u'MGMT-NAVIGATORMETASERVER', u'NAVIGATORMETASERVER', u'HBASE-MASTER', u'MASTER', u'KEYTRUSTEE_SERVER-DB_PASSIVE', u'DB_PASSIVE', 
u'SPARK_ON_YARN', u'SPARK_ON_YARN', u'SPARK2_ON_YARN-SPARK2_YARN_HISTORY_SERVER', u'SPARK2_YARN_HISTORY_SERVER', u'MGMT-REPORTSMANAGER', u'REPORTSMANAGER', u'MGMT-SERVICEMONITOR', u'SERVICEMONITOR', u'MGMT-ALERTPUBLISHER', u'ALERTPUBLISHER', u'HIVE-HIVESERVER2', u'HIVESERVER2', u'MGMT-ACTIVITYMONITOR', u'ACTIVITYMONITOR', u'AUTH-AUTH_LOAD_BALANCER', u'AUTH_LOAD_BALANCER', u'MAPREDUCE-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'ZOOKEEPER', u'ZOOKEEPER', u'MGMT-HOSTMONITOR', u'HOSTMONITOR', u'AUTH', u'AUTH', u'IMPALA', u'IMPALA', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_ACTIVE_SERVER', u'KEYTRUSTEE_ACTIVE_SERVER', u'SOLR', u'SOLR', u'ACCUMULO_C6', u'ACCUMULO_C6', u'ACCUMULO_C6-ACCUMULO_TRACER', u'ACCUMULO_TRACER', u'ACCUMULO16-ACCUMULO16_TSERVER', u'ACCUMULO16_TSERVER', u'LUNA_KMS-HSMKP_METASTORE_LUNA', u'HSMKP_METASTORE_LUNA', u'HBASE-HBASETHRIFTSERVER', u'HBASETHRIFTSERVER', u'ACCUMULO_C6-ACCUMULO_MONITOR', u'ACCUMULO_MONITOR', u'FLUME', u'FLUME', u'HUE', u'HUE', u'HDFS-SECONDARYNAMENODE', u'SECONDARYNAMENODE', u'SENTRY-SENTRY_SERVER', u'SENTRY_SERVER', u'THALES_KMS-HSMKP_METASTORE_THALES', u'HSMKP_METASTORE_THALES', u'HIVE-HIVEMETASTORE', u'HIVEMETASTORE', u'IMPALA-LLAMA', u'LLAMA', u'SPARK-SPARK_WORKER', u'SPARK_WORKER', u'MGMT', u'MGMT', u'HIVE-WEBHCAT', u'WEBHCAT', u'SQOOP', u'SQOOP', u'HUE-HUE_LOAD_BALANCER', u'HUE_LOAD_BALANCER', u'FLUME-AGENT', u'AGENT', u'HDFS', u'HDFS', u'KUDU-KUDU_TSERVER', u'KUDU_TSERVER', u'KMS-KMS', u'KMS', u'KS_INDEXER', u'KS_INDEXER', u'SPARK-SPARK_MASTER', u'SPARK_MASTER', u'ZOOKEEPER-SERVER', u'SERVER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER', u'MAPREDUCE-JOBTRACKER', u'JOBTRACKER', u'THALES_KMS-HSMKP_THALES', u'HSMKP_THALES'], u'flood_seed_timeout': 100, u'eventserver_port': 7185}
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1528, in handle_heartbeat_response
    self._handle_heartbeat_response(response)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1661, in _handle_heartbeat_response
    self._update_parcel_activation_state(response)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1600, in _update_parcel_activation_state
    manage_new_parcels)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/parcel.py", line 640, in configure_all_symlinks
    self.ensure_active_symlink(prod[version], False)
KeyError: '5.15.1-1.cdh5.15.1.p0.4'
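The KeyError suggests the agent looked up a parcel version (5.15.1-1.cdh5.15.1.p0.4) that it does not know about locally. One way to check is to list which parcel versions are actually unpacked on the host. The script below is only a rough sketch and not part of the Cloudera agent: the helper names installed_parcels and is_installed are made up for illustration, and it assumes that each parcel is unpacked into a directory named PRODUCT-VERSION under the parcels_directory shown in the log above (/opt/cloudera/parcels).

```python
import os

# Taken from "parcels_directory" in the heartbeat dump above.
PARCELS_DIR = "/opt/cloudera/parcels"


def installed_parcels(parcels_dir):
    """Return {product: [versions]} for the parcel directories found.

    Assumption: each unpacked parcel is a directory named PRODUCT-VERSION,
    for example CDH-5.16.1-1.cdh5.16.1.p0.3.
    """
    found = {}
    for name in os.listdir(parcels_dir):
        path = os.path.join(parcels_dir, name)
        # Skip symlinks (such as an "active" link) and plain files.
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        # Split on the first hyphen only; versions contain hyphens too.
        product, sep, version = name.partition("-")
        if not sep:
            continue  # no hyphen, so not a PRODUCT-VERSION directory
        found.setdefault(product, []).append(version)
    return {product: sorted(versions) for product, versions in found.items()}


def is_installed(parcels_dir, product, version):
    """True if the given product/version directory exists locally."""
    return version in installed_parcels(parcels_dir).get(product, [])


if __name__ == "__main__" and os.path.isdir(PARCELS_DIR):
    for product, versions in sorted(installed_parcels(PARCELS_DIR).items()):
        print(product, versions)
    # The version named in the KeyError above.
    print("5.15.1 present:",
          is_installed(PARCELS_DIR, "CDH", "5.15.1-1.cdh5.15.1.p0.4"))
```

If the version from the KeyError is missing from the output while Cloudera Manager still lists it as active, that mismatch would be the first thing to look at on the Parcels page.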