Created on 08-16-2017 01:31 AM - edited 09-16-2022 05:05 AM
Hi all,
I was upgrading my dev cluster of 5 hosts from 5.11.0 to 5.12.0.
I have 4 nodes running HDFS/YARN roles, and one host that is in the cluster but runs only the Cloudera agent.
The 4 HDFS/YARN hosts upgraded to 5.12 flawlessly.
However, I had to upgrade the agent-only host manually.
What I did was essentially the following:
1. Stopped the 5.11 agent service
2. Downloaded the 5.12 agent and daemon RPMs
3. Removed the old agent
4. Installed new RPMs
5. Corrected the config.ini to point onto the proper server_host
6. Started the agent
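Step 5 above (re-pointing config.ini at the correct Cloudera Manager host) can be scripted. Below is a minimal, hypothetical sketch using Python's ConfigParser; the helper name `point_agent_at` is my own invention, not part of any Cloudera tooling, and you would run it against `/etc/cloudera-scm-agent/config.ini` (test it on a copy first):

```python
from configparser import ConfigParser

def point_agent_at(config_path, cm_host):
    """Set server_host in a Cloudera agent config.ini (hypothetical helper,
    mirroring step 5 of the manual upgrade). Returns the value written."""
    cfg = ConfigParser()
    cfg.read(config_path)
    # The stock agent config keeps server_host under the [General] section.
    cfg.set("General", "server_host", cm_host)
    with open(config_path, "w") as f:
        cfg.write(f)
    return cfg.get("General", "server_host")
```

Note that ConfigParser rewrites the whole file, so any comments in config.ini are lost; a sed one-liner preserves them if that matters to you.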
And it friggin' failed with the following error message:
[15/Aug/2017 18:57:37 +0000] 51524 MainThread kt_renewer INFO Agent wide credential cache set to /var/run/cloudera-scm-agent/krb5cc_cm_agent_0
[15/Aug/2017 18:57:37 +0000] 51524 MainThread agent INFO Using metrics_url_timeout_seconds of 30.000000
[15/Aug/2017 18:57:37 +0000] 51524 MainThread agent INFO Using task_metrics_timeout_seconds of 5.000000
[15/Aug/2017 18:57:37 +0000] 51524 MainThread agent INFO Using max_collection_wait_seconds of 10.000000
[15/Aug/2017 18:57:37 +0000] 51524 MainThread metrics INFO Importing tasktracker metric schema from file /usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.12.0-py2.6.egg/cmf/monitor/tasktracker/schema.json
[15/Aug/2017 18:57:38 +0000] 51524 MainThread tcp_metrics WARNING File '/proc/net/tcp6' couldn't be opened for tcp statistic collection, error=2
[15/Aug/2017 18:57:38 +0000] 51524 MainThread ntp_monitor INFO Using timeout of 2.000000
[15/Aug/2017 18:57:38 +0000] 51524 MainThread dns_names INFO Using timeout of 30.000000
[15/Aug/2017 18:57:38 +0000] 51524 MainThread __init__ INFO Created DNS monitor.
[15/Aug/2017 18:57:38 +0000] 51524 MainThread stacks_collection_manager INFO Using max_uncompressed_file_size_bytes: 5242880
[15/Aug/2017 18:57:38 +0000] 51524 MainThread __init__ INFO Importing metric schema from file /usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.12.0-py2.6.egg/cmf/monitor/schema.json
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'CMF_PACKAGE_DIR': '/usr/lib64/cmf/service', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/usr/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'PATH': '/sbin:/usr/sbin:/bin:/usr/bin', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'KEYTRUSTEE_KP_HOME': '/usr/share/keytrustee-keyprovider', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'KEYTRUSTEE_SERVER_HOME': '/usr/lib/keytrustee-server', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar', 'CDH_KMS_HOME': '/usr/lib/hadoop-kms', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HUE_HOME': '/usr/lib/hue', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SENTRY_HOME': '/usr/lib/sentry', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_LLAMA_HOME': '/usr/lib/llama/', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'CDH_HIVE_HOME': '/usr/lib/hive', 'ORACLE_HOME': '/usr/share/oracle/instantclient', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_KAFKA_HOME': '/usr/lib/kafka', 'CDH_SPARK_HOME': '/usr/lib/spark', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_FLUME_HOME': '/usr/lib/flume-ng'}
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/process
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/supervisor
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/flood
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Re-using pre-existing directory: /var/run/cloudera-scm-agent/supervisor/include
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent INFO Conf directory: /var/run/cloudera-scm-agent/supervisor
[15/Aug/2017 18:57:39 +0000] 51524 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.12.0-py2.6.egg/cmf/agent.py", line 2109, in find_or_start_supervisor
    self.configure_supervisor_clients()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.12.0-py2.6.egg/cmf/agent.py", line 2291, in configure_supervisor_clients
    supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 1599, in realize
    Options.realize(self, *arg, **kw)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 333, in realize
    self.process_config()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 341, in process_config
    self.process_config_file(do_usage)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 376, in process_config_file
    self.usage(str(msg))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 164, in usage
    self.exit(2)
SystemExit: 2
[15/Aug/2017 18:57:39 +0000] 51524 Dummy-1 daemonize WARNING Stopping daemon.
[15/Aug/2017 18:57:39 +0000] 51524 Dummy-1 agent INFO Stopping agent...
[15/Aug/2017 18:57:39 +0000] 51524 Dummy-1 agent INFO No extant cgroups; unmounting any cgroup roots
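The `SystemExit: 2` in the traceback comes from supervisor's option parser giving up on /var/run/cloudera-scm-agent/supervisor/supervisord.conf, typically because the file is missing or empty. A rough diagnostic sketch follows; `check_supervisord_conf` is a hypothetical helper of my own (not part of the agent), and it only approximates the parse that `supervisor.options` performs:

```python
import os
from configparser import ConfigParser, Error as ConfigParserError

def check_supervisord_conf(path):
    """Roughly approximate supervisor's parse of supervisord.conf;
    returns (ok, message). Hypothetical diagnostic only."""
    if not os.path.isfile(path):
        return False, "missing: %s" % path
    if os.path.getsize(path) == 0:
        return False, "empty: %s" % path
    # supervisord.conf uses %(here)s-style expansion; disable interpolation
    # so a plain ConfigParser can read it without raising.
    cfg = ConfigParser(interpolation=None)
    try:
        cfg.read(path)
    except ConfigParserError as e:
        return False, "parse error: %s" % e
    if not cfg.has_section("supervisord"):
        return False, "no [supervisord] section"
    return True, "parseable"
```

A False result for the path above would be consistent with the "Failed to connect to previous supervisor" error in the log.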
The error "Failed to connect to previous supervisor" appeared, similar to the one in https://community.cloudera.com/t5/Cloudera-Manager-Installation/Failed-to-connect-to-previous-superv...
The question is -- how do I upgrade the node to the proper 5.12 version agent?
P.S. I was finally able to work around the issue temporarily by downgrading the agent to 5.11.1. Please note: 5.11.1 specifically. When I tried to downgrade to 5.11.0, the original version, the node disappeared from the cluster.
But I still need a permanent solution here :)))
Created 08-16-2017 05:47 AM
Hi dekan,
It looks like you're running into the same issue as the following fellow community members [0].
Can you try the workaround in [1]?
Created on 08-16-2017 06:01 AM - edited 08-16-2017 06:02 AM
Thank you, Michalis
This looks exactly like my case!
Unfortunately, I can't use the workaround provided in [1], as there is other stuff running on the host that uses tmpfs.
Do I understand correctly that this issue is going to be fixed in a later release of 5.12?
Created 08-16-2017 06:52 AM
Quote: "...this issue is going to be fixed in later releases of 5.12 ?"
Yes, the issue is already fixed and committed, we aim to have it available in the next 5.12.x maintenance release.
Created 08-22-2017 05:13 PM
Michalis,
Do you know when the new 5.12.x maintenance release will be available? I have this problem and would like to see it fixed. However, I can't wait too long, as I have nodes not connected to my CM. How far away are we from this release?
Kevin
Created 09-15-2017 01:07 AM
Quote: "Do you know when the new 5.12.x maintenance release will be available?"
It's available now: [ANNOUNCE] Cloudera Enterprise 5.12.1 Released [0]