This issue popped up within the last week or so.
For CentOS 7, the chkconfig package appears to have been updated in the Base Repo and part of the Cloudera Manager install process grabs it as a dependency to install.
From the GUI, I saw several different manifestations of the issue depending on the version and install mechanism used. A 5.7.3 install, for example, ended with "No CDH version detected". The Parcel install of the latest release just hung.
chkconfig-1.7.2 appears to break the client_configs.py script, which parses output from update-alternatives:
[15/Dec/2016 11:20:54 +0000] 23480 MainThread agent INFO CM server guid: a54a1337-b45d-43e6-ad10-ec452ff75557
[15/Dec/2016 11:20:54 +0000] 23480 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
[15/Dec/2016 11:20:54 +0000] 23480 MainThread parcel INFO Agent does create users/groups and apply file permissions
[15/Dec/2016 11:20:54 +0000] 23480 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[15/Dec/2016 11:20:54 +0000] 23480 MainThread agent ERROR Caught unexpected exception in main loop.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 758, in start
    self._init_after_first_heartbeat_response(resp_data)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 938, in _init_after_first_heartbeat_response
    self.client_configs.load()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 682, in load
    new_deployed.update(self._lookup_alternatives(fname))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 432, in _lookup_alternatives
    return self._parse_alternatives(alt_name, out)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 444, in _parse_alternatives
    path, _, _, priority_str = line.rstrip().split(" ")
ValueError: too many values to unpack
This ends up stopping the installer no matter which version of CDH you are attempting to install.
I was able to fix this by downgrading chkconfig from 1.7.2 to 1.3.61 (and removing some of the dependent packages, such as libreoffice).
I'm aware of the issue you are seeing based on the stack trace, and Cloudera is working on how to address it. There is no fix or targeted release at the time I'm writing this.
The issue is that the agent expects a fixed number of columns in each line of the update-alternatives output, but the OpenJDK entries usually have more.
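To make the failure mode concrete, here is a minimal sketch that mirrors the four-way unpacking from the traceback. The function name and both sample lines are made up for illustration (they are not copied from a real host), and the exact layout of the longer line is an assumption; the only point is that any line with more than four space-separated fields will trip the unpacking.

```python
def parse_strict(line):
    # Same unpacking as line 444 of client_configs.py in the traceback:
    # it only works when the line has exactly four space-separated fields.
    path, _, _, priority_str = line.rstrip().split(" ")
    return path, int(priority_str)


# Hypothetical sample lines; the paths, family name and priorities are invented.
FOUR_FIELDS = "/usr/lib/jvm/example-jre/bin/java - priority 100"
SIX_FIELDS = "/usr/lib/jvm/example-jre/bin/java - family example-family priority 100"

if __name__ == "__main__":
    print(parse_strict(FOUR_FIELDS))
    try:
        parse_strict(SIX_FIELDS)
    except ValueError as exc:
        # Extra fields cause the same class of error the agent logged.
        print("ValueError:", exc)
```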
Removing all alternatives
What I'm interested in is that you mention that chkconfig versioning matters. Are you saying that the problem *does not* occur with an older chkconfig version and only appears after you upgrade?
If so, can you share some of the information you gathered?
To get some background on the work we have done so far, please see:
If chkconfig version is somehow influencing the alternatives command, that would be interesting to know.
Please check out that community discussion and let us know if you believe chkconfig is still involved. We are very interested to know.
Rereading your post, I see that you did confirm that chkconfig was an influencing factor. Thanks for that info.
Either way, we (Cloudera) will need to account for the possible differing number of columns in the response to prevent such parsing problems.
Upon one more review of your post, I am stuck on "and removing some of the depending packages".
My guess is that one or more of the packages may have had more than 4 columns in the alternatives output. Are you sure it was chkconfig and not one or more of the packages you uninstalled? Do you have a list of all that you removed?
You may be right there. I saw that client_configs.py was failing when running the "update-alternatives" command, and noted that one of my older successful CDH installs was running an older version of update-alternatives.
In the process of reverting to the older version, I had to remove those openjdk packages, which may have resolved the issue anyway.
These were the packages removed that were dependent on the newer version of chkconfig, along with a slew of others that were dependent on those packages.
yum remove java-1.8.0-openjdk-headless-1:22.214.171.124-2.b15.el7_3.x86_64 ntsysv-1.7.2-1.el7.x86_64 java-1.7.0-openjdk-headless-1:126.96.36.199-188.8.131.52.el7_3.x86_64
Was there a recent upstream change that added openjdk to the list of dependencies?
Nope, you were right!
It seems that the OpenJDK for RedHat/CentOS 7.2 leverages the "--family" option in later chkconfig. So, when I install via yum and look at what packages will be updated, we see:
Updating for dependencies:
chkconfig x86_64 1.7.2-1.el7
After the chkconfig update, we see that it now supports the "--family" option:
usage: alternatives --install <link> <name> <path> <priority>
[--slave <link> <name> <path>]*
A big thanks to @cloudycloud for helping me take a second look at this. OpenJDK was the instigator here, but if "chkconfig x86_64 1.7.2-1.el7" is released, then we'll need to code the agent to be able to parse output that includes the "family" option.
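For anyone curious what a more tolerant parse could look like, here is a rough sketch. This is not the agent's actual code or Cloudera's fix; the function name, the returned dict, and the sample lines are all hypothetical. It assumes each alternative line starts with the path and ends with the word "priority" followed by an integer, and treats a "family <name>" pair in between as optional.

```python
def parse_alternative_line(line):
    """Return path/family/priority for one alternative line, or None.

    Assumed layout (not verified against every chkconfig version):
        <path> - [family <name>] priority <int>
    """
    tokens = line.split()
    # Anchor on the trailing "priority <int>" instead of a fixed field count.
    if len(tokens) < 4 or tokens[-2] != "priority":
        return None
    try:
        priority = int(tokens[-1])
    except ValueError:
        return None

    family = None
    middle = tokens[1:-2]  # everything between the path and "priority"
    if "family" in middle:
        i = middle.index("family")
        if i + 1 < len(middle):
            family = middle[i + 1]
    return {"path": tokens[0], "family": family, "priority": priority}


if __name__ == "__main__":
    # Hypothetical sample lines; paths, family name and priorities are invented.
    print(parse_alternative_line("/usr/lib/jvm/example-jre/bin/java - priority 100"))
    print(parse_alternative_line(
        "/usr/lib/jvm/example-jre/bin/java - family example-family priority 100"))
```

Anchoring on the trailing fields rather than unpacking a fixed number of values means extra columns in the middle no longer raise, and lines that do not match can be skipped instead of stopping the main loop.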