Incorrect symlinks being created by cloudera-scm-agent
Created on ‎09-02-2014 03:50 PM - edited ‎09-16-2022 02:06 AM
I am attempting to reinstall my cluster and no matter what I do the installation is setting the /etc/alternatives binary symlinks to an old/invalid parcels version.
During the reinstallation process the parcels are distributed.
[parcels]$ ls -lsa /opt/cloudera/parcels
total 12
4 drwxr-xr-x 3 root root 4096 Sep 2 22:40 .
4 drwxr-xr-x 4 root root 4096 Sep 2 22:33 ..
0 lrwxrwxrwx 1 root root 25 Sep 2 22:40 CDH -> CDH-5.1.2-1.cdh5.1.2.p0.3
4 drwxrwxr-x 10 root root 4096 Aug 26 04:03 CDH-5.1.2-1.cdh5.1.2.p0.3
But the cloudera-scm-agent is creating symlinks in /etc/alternatives to a different parcels version (CDH-5.1.0-1.cdh5.1.0.p0.53).
[parcels]$ ls -lsa /etc/alternatives | grep cloudera
4 lrwxrwxrwx 1 root root 63 Sep 2 22:40 avro-tools -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/avro-tools
4 lrwxrwxrwx 1 root root 60 Sep 2 22:40 beeline -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/beeline
4 lrwxrwxrwx 1 root root 61 Sep 2 22:40 catalogd -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/catalogd
0 lrwxrwxrwx 1 root root 59 Sep 2 22:40 cli_mt -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/cli_mt
0 lrwxrwxrwx 1 root root 59 Sep 2 22:40 cli_st -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/cli_st
4 lrwxrwxrwx 1 root root 61 Sep 2 22:40 flume-ng -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/flume-ng
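The mismatch shown in the two listings above can be checked mechanically. The following is a minimal sketch, not a Cloudera tool: `check_parcel_links` is a hypothetical helper name, and it assumes the active parcel is whatever the `CDH` symlink in the parcels directory points to. It only reads symlinks and prints the ones that point into some other parcel.

```shell
#!/bin/sh
# Sketch: report alternatives symlinks that point into a parcel other than
# the active one. check_parcel_links is a hypothetical helper name; both
# directories default to the paths shown in this thread.

check_parcel_links() {
    alt_dir="${1:-/etc/alternatives}"
    parcels_dir="${2:-/opt/cloudera/parcels}"

    # Assumption: the CDH symlink names the active parcel directory,
    # e.g. CDH -> CDH-5.1.2-1.cdh5.1.2.p0.3
    active=$(readlink "$parcels_dir/CDH") || return 1
    active=${active%/}      # drop a trailing slash, if any
    active=${active##*/}    # keep only the directory name

    for link in "$alt_dir"/*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        case "$target" in
            "$parcels_dir/$active"/*) ;;   # points at the active parcel
            "$parcels_dir/CDH"/*) ;;       # goes through the CDH symlink
            "$parcels_dir"/*)              # some other parcel: report it
                printf '%s -> %s\n' "${link##*/}" "$target" ;;
        esac
    done
}
```

Calling `check_parcel_links` with no arguments inspects /etc/alternatives against /opt/cloudera/parcels; empty output means no symlink points at a non-active parcel.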
Where in the HECK is this information stored either on the CM server or on the agents?
It's pretty important for a user to be able to remove software completely and reinstall it without having to reinstall the entire operating system, which is the point I'm at right now.
Created ‎09-03-2014 12:27 PM
I have found the solution to the problem.
cloudera-scm-agent runs a tool called /usr/lib64/cmf/service/common/alternatives.sh to generate the /etc/alternatives entries and the symlinks in /usr/bin/. This bash script executes update-alternatives based on the PARCELS_DIR and PARCEL_DIRNAME variables. The files in /var/lib/alternatives/ appear to be where update-alternatives keeps its record of each alternative, and they effectively act as overrides: regardless of what you pass to update-alternatives, if a file for that alternative name already exists in /var/lib/alternatives, the information in that file is used.
For some reason the /var/lib/alternatives files for Cloudera have two entries in them, one for the old parcel and one for the new parcel. This may have happened because I reinstalled without deactivating the old cluster/parcel first.
# cat /var/lib/alternatives/sqoop
auto
/usr/bin/sqoop
/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/sqoop
10
/opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/bin/sqoop
10
When I remove the /var/lib/alternatives/sqoop-import file and restart cloudera-scm-agent, the proper symlink is created.
# ls -lsa /etc/alternatives/sqoop-import
4 lrwxrwxrwx 1 root root 64 Sep 3 19:00 /etc/alternatives/sqoop-import -> /opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/bin/sqoop-import
So, in closing, it may be necessary to remove all Cloudera-related /var/lib/alternatives/ files after a botched install if you did not deactivate the parcel prior to reinstalling.
# grep -l cloudera /var/lib/alternatives/*
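The grep above lists every Cloudera-related file, including healthy ones. A narrower variant, as a rough sketch: `list_stale_state_files` is a hypothetical helper name, and it assumes the active parcel is the target of the `CDH` symlink and that the state files are plain text like the sqoop file shown above. It only prints candidate files and removes nothing.

```shell
#!/bin/sh
# Sketch: list alternatives state files that still mention a parcel other
# than the active one. list_stale_state_files is a hypothetical helper name;
# the default paths are the ones shown in this thread.

list_stale_state_files() {
    state_dir="${1:-/var/lib/alternatives}"
    parcels_dir="${2:-/opt/cloudera/parcels}"

    # Assumption: the CDH symlink names the active parcel directory.
    active=$(readlink "$parcels_dir/CDH") || return 1
    active=${active%/}
    active=${active##*/}

    for f in "$state_dir"/*; do
        [ -f "$f" ] || continue
        # Take the lines under the parcels directory, then test whether any
        # of them does NOT belong to the active parcel.
        if grep -F "$parcels_dir/" "$f" | grep -qvF "$parcels_dir/$active/"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Review the printed files (and back them up) before removing anything, then restart cloudera-scm-agent as described above so the symlinks are regenerated.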
Created ‎09-02-2014 04:40 PM
No, don't reinstall your OS, this should be a fixable problem.
Parcel symlinks are usually created when you activate or deactivate the
parcel. The default priority, however, is intentionally kept low. It's
possible that at some point an alternative got created (manually by someone,
or by a bug) at a higher priority, or that deactivation of a previous parcel
was not done properly, which may have left a lingering alternative.
If I were you, I'd first deactivate all CDH parcels. Then I'd look at
the alternatives and see whether a symlink from
/opt/cloudera/parcels/CDH -> /opt/cloudera/parcels/
exists. If so, it should NOT exist, and you should go ahead and deactivate
that alternative. Ensure no such alternatives (and hence symlinks) exist.
Once you have done that, activate the parcel that you want via the CM UI.
If you run into a similar issue again, please do provide the steps you
undertook to reproduce the situation.
Sorry about the inconvenience.
Mark
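One way to prepare the cleanup described above is a dry run that prints the `update-alternatives --remove` commands for a lingering parcel without executing them. This is only a sketch under assumptions: `print_remove_commands` is a hypothetical helper name, the state files are assumed to live in /var/lib/alternatives, and every line under the stale parcel is assumed to be a master alternative path (as in a simple two-entry state file), not a slave link.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands that would remove alternatives
# belonging to a stale parcel. print_remove_commands is a hypothetical
# helper name. Assumption: each matching line in a state file is a master
# alternative path for the alternative named after the file.

print_remove_commands() {
    stale_parcel="$1"    # e.g. /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53
    state_dir="${2:-/var/lib/alternatives}"

    [ -n "$stale_parcel" ] || return 1

    for f in "$state_dir"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        grep -F "$stale_parcel/" "$f" | while IFS= read -r path; do
            printf 'update-alternatives --remove %s %s\n' "$name" "$path"
        done
    done
}
```

Printing instead of executing keeps the step reviewable: check each line against `update-alternatives --display <name>` before running it as root.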
Created ‎03-16-2015 04:01 AM
I've written a little script to resolve the issue,
although the script is a workaround and I still need to find the proper solution.
Details are here: https://linuxlabrat.wordpress.com/2015/03/16/removing-alternatives-of-old-cdh-parcels/
Created ‎03-20-2015 12:21 AM
We are facing a similar problem of missing symlinks. We have tried all the solutions mentioned in this thread, but to no avail.
We are on CDH 5.3.1, Cloudera Manager 5.3.2.
"ls -lsa /etc/alternatives | grep hadoop" shows only a single entry, hadoop-conf, which points to /etc/hadoop/conf.cloudera.yarn
There are no other entries for hive or hbase or sqoop or even hadoop!
There are no entries in /var/lib/alternatives.
As suggested, we stopped the cluster, did a "sudo service cloudera-scm-agent restart" but the symlinks have not been created.
Any suggestions what is wrong and what needs to be done?
Thanks in advance.
Regards,
Yogesh
Created ‎03-24-2015 07:44 AM
With nothing working and no help from anywhere, we finally formatted the Namenode and re-installed Cloudera Manager and CDH 5.3.
The symlinks have been created and the command "hadoop fs -ls" seems to work from the command prompt from any folder now.
I don't think this is the right way to address the problem, so I'd still welcome input from anyone.
Regards,
Yogesh
Created on ‎03-24-2015 12:54 PM - edited ‎03-24-2015 12:55 PM
You are correct, the steps you took shouldn't have been necessary, but it sounds like my original issue was slightly different from yours.
Were you able to see anything in the cloudera-scm-agent logs after the restart that might have pointed to an issue creating the symlinks?
You should have seen things like:
[03/Sep/2014 00:06:23 +0000] 21109 Thread-13 parcel_cache INFO Checking checksum of parcel CDH-5.1.2-1.cdh5.1.2.p0.3-el6.parcel...
[03/Sep/2014 00:06:28 +0000] 21109 Thread-13 parcel_cache INFO Unpacking /opt/cloudera/parcel-cache/CDH-5.1.2-1.cdh5.1.2.p0.3-el6.parcel into /opt/cloudera/parcels
[03/Sep/2014 00:06:56 +0000] 21109 MainThread parcel INFO Loading parcel manifest for: CDH-5.1.2-1.cdh5.1.2.p0.3
[03/Sep/2014 00:06:56 +0000] 21109 MainThread parcel INFO Ensuring users/groups exist for new parcel CDH-5.1.2-1.cdh5.1.2.p0.3.
...snip...
[03/Sep/2014 00:06:57 +0000] 21109 MainThread parcel INFO Ensuring correct file permissions for new parcel CDH-5.1.2-1.cdh5.1.2.p0.3.
...snip...
[03/Sep/2014 00:07:15 +0000] 21109 MainThread parcel INFO Activating system symlinks for parcel CDH-5.1.2-1.cdh5.1.2.p0.3
[03/Sep/2014 00:07:15 +0000] 21109 MainThread parcel INFO Ensuring alternatives entries are activated for parcel CDH-5.1.2-1.cdh5.1.2.p0.3.
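To check quickly whether the agent reached the symlink step, the two final messages above can be searched for in the agent log. A minimal sketch: `parcel_activation_lines` is a hypothetical helper name, and the default log path below is an assumption about a standard install, so pass the real path if yours differs.

```shell
#!/bin/sh
# Sketch: show the agent log lines about symlink/alternatives activation.
# parcel_activation_lines is a hypothetical helper name. The default log
# path is an assumption; pass your own path as the first argument.

parcel_activation_lines() {
    log_file="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
    # Match the two message texts quoted in the log excerpt above.
    grep -E 'Activating system symlinks|Ensuring alternatives entries' "$log_file"
}
```

If this prints nothing after an agent restart, the agent may never have attempted to create the symlinks, which would point to a problem earlier in parcel activation.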
What was the status of your /opt/cloudera/parcels directory? Had you just recently changed to a new version? Did you deactivate the old parcel before activating the new parcel?
Sorry I didn't respond earlier to prevent you from having to reinstall.
Created ‎03-24-2015 11:43 PM
Not an issue, Mshirley; many thanks for your comments.
To be honest, I didn't check the cloudera-scm-agent logs after the restart, so I cannot say whether any errors were reported. I'll keep that in mind from now on.
Also, the /opt/cloudera/parcels directory contained two entries: one named CDH (a link to the CDH5.3 directory, I suppose) and the actual CDH5.3xxxx directory (sorry, I cannot recollect the exact name), which contained /lib, /bin and several other parcel directories. And yes, we did try deactivating the old parcel before downloading the new one. Unfortunately, nothing worked.
Around the same time, we also did a reboot and noticed that one of the nodes did not come up. As a result, we were forced to do the re-installation again.
