Incorrect symlinks being created by cloudera-scm-agent

avatar
Explorer

I am attempting to reinstall my cluster, and no matter what I do, the installation points the /etc/alternatives binary symlinks at an old, invalid parcel version.

 

I followed http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installat... 

 

During the reinstallation process the parcels are distributed.

 

[parcels]$ ls -lsa /opt/cloudera/parcels
total 12
4 drwxr-xr-x 3 root root 4096 Sep 2 22:40 .
4 drwxr-xr-x 4 root root 4096 Sep 2 22:33 ..
0 lrwxrwxrwx 1 root root 25 Sep 2 22:40 CDH -> CDH-5.1.2-1.cdh5.1.2.p0.3
4 drwxrwxr-x 10 root root 4096 Aug 26 04:03 CDH-5.1.2-1.cdh5.1.2.p0.3

 

But the cloudera-scm-agent is creating symlinks in /etc/alternatives to a different parcels version (CDH-5.1.0-1.cdh5.1.0.p0.53).

 

[parcels]$ ls -lsa /etc/alternatives | grep cloudera
4 lrwxrwxrwx 1 root root 63 Sep 2 22:40 avro-tools -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/avro-tools
4 lrwxrwxrwx 1 root root 60 Sep 2 22:40 beeline -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/beeline
4 lrwxrwxrwx 1 root root 61 Sep 2 22:40 catalogd -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/catalogd
0 lrwxrwxrwx 1 root root 59 Sep 2 22:40 cli_mt -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/cli_mt
0 lrwxrwxrwx 1 root root 59 Sep 2 22:40 cli_st -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/cli_st
4 lrwxrwxrwx 1 root root 61 Sep 2 22:40 flume-ng -> /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/flume-ng
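
A rough way to spot this mismatch on a host is sketched below. It is only an illustration, not part of any Cloudera tooling: the function name `stale_alternative_links` is made up, and it assumes the CDH link is a relative symlink to the active parcel directory, as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list alternatives links that point into a parcel
# directory other than the one the CDH symlink currently resolves to.
# Both directories are parameters so the function can be tried on a copy.
stale_alternative_links() {
    local alt_dir="${1:-/etc/alternatives}"
    local parcels_dir="${2:-/opt/cloudera/parcels}"
    local active link target
    # Assumption: CDH is a relative symlink such as CDH -> CDH-5.1.2-...
    active="$(readlink "$parcels_dir/CDH")" || return 1
    for link in "$alt_dir"/*; do
        [ -L "$link" ] || continue
        target="$(readlink "$link")"
        case "$target" in
            # Targets inside the active parcel (or via the CDH link) are fine.
            "$parcels_dir/$active"/*|"$parcels_dir/CDH"/*) ;;
            # Anything else under the parcels directory is reported.
            "$parcels_dir"/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}
```

Calling `stale_alternative_links` with no arguments checks the default locations and prints one line per link that still targets a different parcel.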

 

Where in the HECK is this information stored either on the CM server or on the agents?

 

It's pretty important for a user to be able to remove software completely and reinstall it without having to reinstall the operating system, which is the point I'm at right now.

1 ACCEPTED SOLUTION

avatar
Explorer

I have found the solution to the problem.

 

The cloudera-scm-agent runs a script called /usr/lib64/cmf/service/common/alternatives.sh to generate the /etc/alternatives entries and the corresponding /usr/bin/* symlinks. This bash script executes update-alternatives based on the PARCELS_DIR and PARCEL_DIRNAME variables. The files in /var/lib/alternatives/ seem to act as overrides for the update-alternatives tool: regardless of what you give update-alternatives, if there is a file in /var/lib/alternatives for that same alternative name, it will use the information from that file.

For some reason, the Cloudera files in /var/lib/alternatives each have two entries, one for the old parcel and one for the new parcel. This may have been caused by reinstalling without first deactivating the old cluster/parcel.

 

# cat /var/lib/alternatives/sqoop
auto
/usr/bin/sqoop

/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/bin/sqoop
10
/opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/bin/sqoop
10
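
To make the tie between the two entries easier to see, here is a small sketch that prints each recorded target with its priority. The function name `list_alternative_priorities` is hypothetical, and the parsing assumes the simple layout shown above (mode, link, a blank line, then path/priority pairs, with no slave links); files with slave links would need more handling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "priority path" for each alternative recorded
# in a state file laid out like the sqoop example above.
list_alternative_priorities() {
    local state_file="$1" line path="" in_entries=0
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$in_entries" -eq 0 ]; then
            # The first blank line ends the header (mode and link).
            [ -z "$line" ] && in_entries=1
            continue
        fi
        if [ -z "$path" ]; then
            path="$line"                      # first line of a pair: target path
        else
            printf '%s %s\n' "$line" "$path"  # second line of a pair: priority
            path=""
        fi
    done < "$state_file"
}
```

Run against the file above, `list_alternative_priorities /var/lib/alternatives/sqoop` would list both parcel paths with priority 10, so neither entry outranks the other.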

 

When I remove the /var/lib/alternatives/sqoop-import file and restart cloudera-scm-agent, the proper symlink is created.

 

# ls -lsa /etc/alternatives/sqoop-import
4 lrwxrwxrwx 1 root root 64 Sep 3 19:00 /etc/alternatives/sqoop-import -> /opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/bin/sqoop-import

 

So, in closing, it may be necessary to remove all Cloudera-related /var/lib/alternatives/ files after a botched install if you did not deactivate the parcel prior to the reinstall.

 

# grep -l cloudera /var/lib/alternatives/*
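
A more conservative alternative to deleting the files outright might be to remove only the entries whose parcel path no longer exists. The sketch below only prints the commands so they can be reviewed first. The function name `print_stale_removals` is hypothetical, it assumes paths without spaces, and it relies on the standard `update-alternatives --remove <name> <path>` form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) one "update-alternatives --remove"
# command for every recorded parcel path that no longer exists on disk.
print_stale_removals() {
    local state_dir="${1:-/var/lib/alternatives}"
    local parcels_dir="${2:-/opt/cloudera/parcels}"
    local file name line
    for file in "$state_dir"/*; do
        [ -f "$file" ] || continue
        name="$(basename "$file")"    # the state file name is the alternative name
        while IFS= read -r line || [ -n "$line" ]; do
            case "$line" in
                "$parcels_dir"/*)
                    # Only paths under the parcels directory are considered.
                    [ -e "$line" ] ||
                        printf 'update-alternatives --remove %s %s\n' "$name" "$line"
                    ;;
            esac
        done < "$file"
    done
}
```

After checking the printed commands, they could be run as root and followed by a restart of cloudera-scm-agent, as described above. Backing up /var/lib/alternatives beforehand is a sensible precaution.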

 

 


7 REPLIES

avatar
Expert Contributor
Hi mshirley,
No, don't reinstall your OS, this should be a fixable problem.

Parcel symlinks are usually created when you activate or deactivate the
parcel. The default priority, however, is intentionally kept low. It's
possible that at some point an alternative got created (manually by someone,
or by a bug) at a higher priority, or that deactivation of a previous parcel
was not done properly, which may have left a lingering alternative.

If I were you, I'd deactivate all CDH parcels. Then I'd look at the
alternatives and see whether a symlink from
/opt/cloudera/parcels/CDH -> /opt/cloudera/parcels/
exists. If so, it should NOT exist, and you should go ahead and deactivate
that alternative. Ensure no such alternatives (and hence symlinks) exist.
Once you have done that, activate the parcel that you want via the CM UI.

If you run into a similar issue again, please do provide the steps you
undertook to reproduce the situation.

Sorry about the inconvenience.

Mark

avatar
Contributor

I've written a little script to resolve the issue, although the script is a workaround and I still need to find the proper solution.

Details here: https://linuxlabrat.wordpress.com/2015/03/16/removing-alternatives-of-old-cdh-parcels/

avatar

We are facing a similar problem of missing symlinks. We have tried all the solutions mentioned in this thread, but to no avail.

We are on CDH 5.3.1 and Cloudera Manager 5.3.2.

"ls -lsa /etc/alternatives | grep hadoop" shows only a single entry, hadoop-conf, which points to /etc/hadoop/conf.cloudera.yarn.

There are no other entries for hive, hbase, sqoop, or even hadoop!

There are no entries in /var/lib/alternatives.

As suggested, we stopped the cluster and ran "sudo service cloudera-scm-agent restart", but the symlinks have not been created.

Any suggestions as to what is wrong and what needs to be done?

 

Thanks in advance.

 

Regards,

Yogesh

avatar

With nothing working and no help from anywhere, we finally formatted the NameNode and reinstalled Cloudera Manager and CDH 5.3.

The symlinks have been created, and "hadoop fs -ls" now seems to work from the command prompt in any folder.

I don't think this is the right way to address the problem, so I'd still welcome input from anyone.

 

Regards,

Yogesh

avatar
Explorer

You are correct, the steps you took shouldn't have been necessary, but it sounds like my original issue was slightly different from yours.

 

Were you able to see anything in the cloudera-scm-agent logs after the restart that might have pointed to an issue creating the symlinks?

 

You should have seen things like:

 

[03/Sep/2014 00:06:23 +0000] 21109 Thread-13 parcel_cache INFO Checking checksum of parcel CDH-5.1.2-1.cdh5.1.2.p0.3-el6.parcel...
[03/Sep/2014 00:06:28 +0000] 21109 Thread-13 parcel_cache INFO Unpacking /opt/cloudera/parcel-cache/CDH-5.1.2-1.cdh5.1.2.p0.3-el6.parcel into /opt/cloudera/parcels
[03/Sep/2014 00:06:56 +0000] 21109 MainThread parcel INFO Loading parcel manifest for: CDH-5.1.2-1.cdh5.1.2.p0.3
[03/Sep/2014 00:06:56 +0000] 21109 MainThread parcel INFO Ensuring users/groups exist for new parcel CDH-5.1.2-1.cdh5.1.2.p0.3.

...snip...

[03/Sep/2014 00:06:57 +0000] 21109 MainThread parcel INFO Ensuring correct file permissions for new parcel CDH-5.1.2-1.cdh5.1.2.p0.3.

...snip...

[03/Sep/2014 00:07:15 +0000] 21109 MainThread parcel INFO Activating system symlinks for parcel CDH-5.1.2-1.cdh5.1.2.p0.3
[03/Sep/2014 00:07:15 +0000] 21109 MainThread parcel INFO Ensuring alternatives entries are activated for parcel CDH-5.1.2-1.cdh5.1.2.p0.3.
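
A quick way to pull just the symlink-related messages out of the agent log is sketched here. The function name `symlink_log_lines` is hypothetical, and the default log path is an assumption about a typical install, so adjust it if your agent logs elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show only the agent log lines about symlink and
# alternatives activation, matching the messages quoted above.
symlink_log_lines() {
    # Assumed default location of the agent log; pass another path if needed.
    local log_file="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
    grep -E 'Activating system symlinks|Ensuring alternatives entries' "$log_file"
}
```

If `symlink_log_lines` prints nothing after an agent restart, that would suggest the agent never reached the activation step for the parcel.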

 

What was the status of your /opt/cloudera/parcels directory?  Had you just recently changed to a new version?  Did you deactivate the old parcel before activating the new parcel?

 

Sorry I didn't respond earlier to prevent you from having to reinstall.

 

avatar

Not an issue Mshirley, many thanks for your comments.

 

To be honest, I didn't check the cloudera-scm-agent logs after the restart, so I cannot say whether any errors were reported. I'll keep that in mind from now on.

Also, the /opt/cloudera/parcels directory contained 2 entries: one named CDH (a link to the CDH5.3 directory, I suppose) and the actual CDH5.3xxxx directory (sorry, I cannot recollect the exact name), which contained /lib, /bin, and several other parcel directories. And yes, we did try deactivating the old parcel before downloading the new one. Unfortunately, nothing worked.

 

Around the same time, we also did a reboot and noticed that one of the nodes did not come up. As a result, we were forced to do the re-installation again.