Created 12-23-2016 07:20 AM
Hi All,
I have a new install of CDH 5.9.0 with 6 nodes (including the manager server) on EC2 (Red Hat 7) with HA. It ran fine for a few days, but today it failed on startup; only the manager server has a green light in the manager console. In the configuration I see the message "Mismatched CDH versions: host has NONE but role expect 5" for all the services. The CDH Version shows "None" for all the hosts except the manager server. The cloudera-scm-agent.log keeps looping on the following. Please help, and thank you very much.
Garry
From cloudera-scm-agent.log
[23/Dec/2016 10:06:03 +0000] 13378 MainThread agent INFO CM server guid: c26d94d1-eea0-4bce-ac52-989e95181029
[23/Dec/2016 10:06:03 +0000] 13378 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
.
.
[23/Dec/2016 10:06:04 +0000] 13378 MainThread parcel INFO Executing command ['chmod', '0750', u'/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/etc/oozie/tomcat-conf.http.mr1/conf/ssl/ssl-server.xml']
[23/Dec/2016 10:06:04 +0000] 13378 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[23/Dec/2016 10:06:04 +0000] 13378 MainThread agent ERROR Caught unexpected exception in main loop.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 758, in start
    self._init_after_first_heartbeat_response(resp_data)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 938, in _init_after_first_heartbeat_response
    self.client_configs.load()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 682, in load
    new_deployed.update(self._lookup_alternatives(fname))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 432, in _lookup_alternatives
    return self._parse_alternatives(alt_name, out)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 444, in _parse_alternatives
    path, _, _, priority_str = line.rstrip().split(" ")
ValueError: too many values to unpack
Created 12-30-2016 07:51 AM
Hi Garry,
I ran into the same problem.
It relates to https://community.cloudera.com/t5/Cloudera-Manager-Installation/Problem-with-cloudera-agent/td-p/476....
The solution is to remove other Java versions.
In my case I had to remove 'java-1.8.0-openjdk-headless', which was installed by the R package.
Hope this helps
Created 12-30-2016 08:06 AM
The problem occurred after the R package got installed. Would R still work after removing 'java-1.8.0-openjdk-headless'?
Garry
Created 01-03-2017 07:38 AM
Created 01-03-2017 08:26 AM
Hi MrBee,
I got rJava installed by exporting JAVA_HOME pointed at the jdk1.7.0_67-cloudera directory and running R CMD javareconf to configure the Java environment.
After that I could install rJava and Rhipe 0.75.2, but I get an error when I run rhinit() inside R.
> rhinit()
Rhipe: Using Rhipe.jar file
Initializing Rhipe v0.75.2
Error in .jnew("org/godhuli/rhipe/PersonalServer") :
java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
Created 01-03-2017 09:01 AM
Hello,
I just tested a proposed fix that I think is pretty safe if you need to get past this problem in the agent without uninstalling OpenJDK.
DISCLAIMER:
This code change worked properly in my personal testbed and I think it is sound, but Cloudera engineering will need to confirm or provide an alternate fix. Use at your own risk.
Possible code fix for the agent issue. You can perform these steps on the agent host where the problem occurs:
(1)
Find your "client_configs.py" file. For 5.9.0 it will be:
/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py
The version number after the "cmf-" in cmf-5.9.0-py2.7.egg will reflect your version number. If you are on 5.8.3 it will look like this:
/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.3-py2.7.egg/cmf/client_configs.py
(2)
Back up your "client_configs.py" file (cp client_configs.py client_configs.py.original, for example)
(3)
Edit client_configs.py with the following diff as your guide:
diff -u client_configs.py.original_cdh583 client_configs.py
--- client_configs.py.original_cdh583 2017-01-03 08:28:23.922822423 -0800
+++ client_configs.py 2017-01-03 08:26:56.073261511 -0800
@@ -441,7 +441,12 @@
     ret = {}
     for line in output.splitlines():
       if line.startswith("/"):
-        path, _, _, priority_str = line.rstrip().split(" ")
+        #path, _, _, priority_str = line.rstrip().split(" ")
+        #proposed fix for Cloudera Internal Jira OPSAPS-38086
+        thisLine = line.rstrip().split(" ")
+        path = thisLine[0]
+        priority_str = thisLine[-1]
Basically, all you are doing is commenting out
path, _, _, priority_str = line.rstrip().split(" ")
and then adding:
thisLine = line.rstrip().split(" ")
path = thisLine[0]
priority_str = thisLine[-1]
WARNING: Make sure the indentation is exact, as Python requires it; the new lines must line up with the statement they replace. The end result should look like this:
    for line in output.splitlines():
      if line.startswith("/"):
        #path, _, _, priority_str = line.rstrip().split(" ")
        #proposed fix for OPSAPS-38086
        thisLine = line.rstrip().split(" ")
        path = thisLine[0]
        priority_str = thisLine[-1]
        # Ignore the alternative if it's not managed by CM.
        if CM_MAGIC_PREFIX not in os.path.basename(path):
(4)
Save and restart the agent on the host where you made the code change:
service cloudera-scm-agent restart
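Since a stray space in the edited file would stop the agent from starting at all, it may be worth checking that the file still parses before the restart. The following is only a sketch: check_syntax is a made-up helper name, not part of Cloudera Manager, and it only catches syntax and indentation errors, not logic mistakes.

```python
def check_syntax(path):
    """Return None if the file parses, otherwise a short error description."""
    with open(path) as handle:
        source = handle.read()
    try:
        # compile() parses the source without executing it.
        # IndentationError is a subclass of SyntaxError, so both are caught.
        compile(source, path, "exec")
    except SyntaxError as err:
        return "line %s: %s" % (err.lineno, err.msg)
    return None
```

Call check_syntax() with the full path to your edited client_configs.py and restart the agent only if it returns None. Run it with the same Python 2.7 interpreter the agent uses, since a Python 3 interpreter may reject valid Python 2 syntax elsewhere in the file.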
Note: The code change merely creates a list of the "split" elements rather than hard-coding the columns. The code only needs the first and last columns, so if we take the first and last elements in the list, the code is happy.
This means the following "update-alternatives --display" output will no longer cause the exception in the agent:
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64/jre - family java-1.8.0-openjdk.x86_64 priority 1800111
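To make the difference concrete, here is a small standalone sketch, not the agent's actual code path. The names parse_strict and parse_flexible are made up for illustration, and the four-field sample line is an assumed example of the shape the original code expects; the six-field line is the OpenJDK line quoted above.

```python
def parse_strict(line):
    # Mirrors the original statement: needs exactly four space-separated fields.
    path, _, _, priority_str = line.rstrip().split(" ")
    return path, priority_str


def parse_flexible(line):
    # Mirrors the proposed fix: keep only the first and last fields.
    fields = line.rstrip().split(" ")
    return fields[0], fields[-1]


# Assumed four-field example (hypothetical path and priority).
old_style = "/etc/hadoop/conf.example - priority 90"
# The six-field line from the update-alternatives output quoted above.
new_style = ("/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64/jre"
             " - family java-1.8.0-openjdk.x86_64 priority 1800111")

print(parse_strict(old_style))
print(parse_flexible(old_style))
print(parse_flexible(new_style))

try:
    parse_strict(new_style)
except ValueError as err:
    # The extra "family <name>" fields are what break the four-way unpack.
    print("strict parse failed: %s" % err)
```

Both functions agree on the four-field line; only the strict version raises ValueError on the six-field one, which matches the traceback in the agent log.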
Please test if you have the skills to modify files and let me know your results.
Thanks,
Ben
Created 01-03-2017 09:03 AM
A note on my post: In Cloudera Manager 5.8.0 and higher, the problematic line is number 444:
444 path, _, _, priority_str = line.rstrip().split(" ")
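If you are unsure whether the statement sits on the same line in your release, a short script can locate it. This is a sketch only: find_target_lines is a hypothetical helper, and it assumes the statement appears in your file exactly as shown above, apart from indentation.

```python
TARGET = 'path, _, _, priority_str = line.rstrip().split(" ")'


def find_target_lines(path, target=TARGET):
    """Return the 1-based line numbers where the uncommented statement appears."""
    matches = []
    with open(path) as handle:
        for number, text in enumerate(handle, start=1):
            # strip() ignores indentation; a commented-out copy starts
            # with '#' and therefore does not match.
            if text.strip() == target:
                matches.append(number)
    return matches
```

Pass it the full path to your client_configs.py. An empty list means the statement is not present in that form, for example because it has already been commented out.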
Created 01-04-2017 03:13 PM
bgooley's solution worked for me.
Created 01-04-2017 05:01 PM
@CaseyBurns, did you remove the OpenJDK package or try the client_configs.py test fix? I am interested in finding out if the code change helps others as it seems to have done in my environment.
I'm glad that the problem has been solved. Thanks for any feedback.
Ben
Created 01-04-2017 05:34 PM
@bgooley I modified client_configs.py with your changes. It is working happily.