
CDH upgrade alternatives link not updated

Explorer

We upgraded our clusters from 5.5.2 to 5.5.5 a while ago. We've since identified a few nodes where the alternatives are still referencing the 5.5.2 parcel.

root@use542ytb9:~ ( use542ytb9 )
13:15:15 $ which hbase
/usr/bin/which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin)
root@use542ytb9:~ ( use542ytb9 )
root@use542ytb9:~ ( use542ytb9 )
13:15:18 $ ls /usr/bin/hbase
/usr/bin/hbase
root@use542ytb9:~ ( use542ytb9 )
13:15:24 $ ll /usr/bin/hbase
lrwxrwxrwx 1 root root 23 May 16  2016 /usr/bin/hbase -> /etc/alternatives/hbase
root@use542ytb9:~ ( use542ytb9 )
13:15:28 $ ll /etc/alternatives/hbase
lrwxrwxrwx 1 root root 63 May 16  2016 /etc/alternatives/hbase -> /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p1426.1277/bin/hbase
root@use542ytb9:~ ( use542ytb9 )
13:15:30 $ ls /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p1426.1277/bin/hbase
ls: cannot access /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p1426.1277/bin/hbase: No such file or directory
root@use542ytb9:~ ( use542ytb9 )

We've cycled the CM agent, done full decommissions and recommissions, rebooted the nodes, and deployed client configuration.

 

Since we've identified 3 nodes, we're assuming there are others as well. The Hadoop services still run on these nodes, but we're unable to run hdfs, hbase, or yarn commands, which has also caused several MapReduce jobs to fail.
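To spot other affected nodes, a check like the one below (just a sketch; it only assumes the stale targets live under /opt/cloudera/parcels, as in the output above) should list any alternatives that still point at a parcel directory that no longer exists:

# List alternatives whose parcel targets no longer exist (dangling links)
for link in /etc/alternatives/*; do
  target=$(readlink "$link" 2>/dev/null) || continue    # skip anything that is not a symlink
  case "$target" in
    /opt/cloudera/parcels/*)
      # -e follows the link, so it is false once the old parcel directory is gone
      [ -e "$link" ] || echo "dangling: $link -> $target"
      ;;
  esac
done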

 

Is there a good way to repoint these alternatives to the new parcel? 

1 ACCEPTED SOLUTION

Master Collaborator
This looks like a problem with the update-alternatives command. It is usually caused either by a broken alternatives link under /etc/alternatives/ or by a bad (zero-length, see [0]) alternatives configuration file under /var/lib/alternatives; based on your description it appears to be the former. The root cause is that the Cloudera Manager Agent relies on the OS-provided update-alternatives binary, which does not report bad entries or other problems, so issues like these have to be rectified manually. We have an internal improvement JIRA, OPSAPS-39415, to explore options for making alternatives updates during upgrades more resilient.

To recover from the issue, remove the CDH-related entries from the alternatives configuration files:

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1016725

= = = = = = =
# Stop the CM agent service on the node
service cloudera-scm-agent stop

# Delete Hadoop-related /etc/alternatives entries - the command below displays the rm commands you'll need to issue.
ls -l /etc/alternatives/ | grep "\/opt\/cloudera" | awk '{print $9}' | while read m; do if [[ -e /var/lib/alternatives/${m} ]]; then echo "rm -fv /var/lib/alternatives/${m}"; fi; echo "rm -fv /etc/alternatives/${m}"; done

# Remove 0-byte files under /var/lib/alternatives
cd /var/lib/alternatives
find . -size 0 | awk '{print $1 " "}' | tr -d '\n'
# The command above prints every 0-byte file in /var/lib/alternatives on one line. Copy those file names onto the end of the rm -f below, then run it.
rm -f

# Start the CM agent
service cloudera-scm-agent start
= = = = = = =
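If you prefer not to paste file names by hand, the zero-byte cleanup can also be done in one pass (a sketch of an equivalent step, not part of the original procedure), and once the agent is back up you can check that the links resolve to the new parcel:

# Remove the 0-byte state files in a single step (same effect as the manual rm -f above)
find /var/lib/alternatives -maxdepth 1 -type f -size 0 -print -delete

# After "service cloudera-scm-agent start", confirm a link now points at the new parcel
ls -l /etc/alternatives/hbase
hbase version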


3 REPLIES

Champion

 

@donigrubbs

Which link did you follow for the upgrade?

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_rn_new_changed_features.ht...

As per the above link, there is no intermediate version between 5.5.4 and 5.6.0, but you are referring to 5.5.5?
