Recommended Upgrade process for CDH

Explorer

We are currently running Cloudera Express 5.10.0 on Ubuntu 14.04.5. The cluster was installed using Cloudera Manager.

 

We need to upgrade to Ubuntu 16.04 because 14.04 is approaching EOL. What is the recommended process?

 

Should we upgrade Cloudera first and then Ubuntu, or Ubuntu first and then Cloudera?

 

I tried upgrading Ubuntu first, but the release upgrade (do-release-upgrade) upgrades PostgreSQL from 9.3 to 9.5. Because the on-disk database formats of 9.3 and 9.5 are not compatible, cloudera-scm-server-db no longer starts. I have tried to migrate the database with pg_dump/pg_restore, but without success so far.

 

If there is a document available somewhere, please let me know.

 

1 ACCEPTED SOLUTION

Explorer

I was able to resolve this problem using the following process:

Step 1: Take a dump of the running PostgreSQL database on Ubuntu 14.04

# sudo su

# su - postgres

# pg_dump -h localhost -p 7432 -U scm scm > scm.sql

 

Step 2: Upgrade Ubuntu to 16.04

# sudo do-release-upgrade

 

Step 3: Rename the old data directory

# mv /var/lib/cloudera-scm-server-db/data/ /var/lib/cloudera-scm-server-db/data9-3

 

Step 4: Restart the cloudera-scm-server-db service. This creates an empty database, which we will populate from the dump taken in step 1

# sudo service cloudera-scm-server-db restart

 

Step 5: Now restore the database

# sudo su

# su - postgres

# psql -h localhost -p 7432 -U scm

(password can be obtained like this: grep password /etc/cloudera-scm-server/db.properties)

scm=> \i scm.sql

 

Step 6: Now restart cloudera-scm-server service:

# sudo service cloudera-scm-server restart
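
For anyone who wants to script this, the six steps above could be collected along these lines. It is only a sketch under assumptions: the port (7432), the scm user and database, and the paths are the ones quoted in this post, while the DRY_RUN switch, the function names, and the use of pg_dump -f and psql -f in place of the shell redirect and \i are illustrative choices of mine. The work is split into a "before" and an "after" phase because do-release-upgrade (step 2) and its reboot sit in between. With the default DRY_RUN=1 it only prints the commands.

```shell
#!/bin/sh
# Sketch of the accepted answer's steps, split around the release upgrade.
# Nothing is executed unless DRY_RUN=0 is set (hypothetical safety switch).
set -eu

DRY_RUN="${DRY_RUN:-1}"
DATA_DIR=/var/lib/cloudera-scm-server-db/data
DUMP_FILE="${DUMP_FILE:-scm.sql}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: dump the scm database while the 9.3 server is still running.
before_upgrade() {
    run pg_dump -h localhost -p 7432 -U scm -f "$DUMP_FILE" scm
}

# Steps 3 to 6: run after do-release-upgrade (step 2) has finished.
after_upgrade() {
    run mv "$DATA_DIR" "${DATA_DIR}9-3"               # keep the 9.3 files
    run service cloudera-scm-server-db restart        # creates an empty DB
    run psql -h localhost -p 7432 -U scm -d scm -f "$DUMP_FILE"
    run service cloudera-scm-server restart
}

case "${1:-}" in
    before) before_upgrade ;;
    after)  after_upgrade ;;
    *)      echo "usage: $0 before|after" >&2 ;;
esac
```

Run it as root with "before" on 14.04 and with "after" once 16.04 is up. psql and pg_dump will still prompt for the scm password, as in the manual steps.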

 

However, I'm now running into another problem: the cloudera-scm-agent service does not start. There is an error in supervisord.out; I will open another thread for this:

Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/bin/supervisord", line 8, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 36, in <module>
    import plistlib
  File "/usr/lib/python2.7/plistlib.py", line 62, in <module>
    import datetime
ImportError: No module named datetime

 


7 REPLIES

Champion

@ps40

 

 

The link below is for the Enterprise edition; I believe it should be the same for the other editions too.

 

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html

 

1. First, according to the link above, Ubuntu Xenial 16.04 is supported by CDH 5.12.2 or above. So if you have decided to upgrade Ubuntu, you have to upgrade CDH/CM as well.

 

2. Second, according to the link below: "If you are upgrading CDH or Cloudera Manager as well as the OS, upgrade the OS first"

 

https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_ag_upgrading_os.html

 

Hope this gives you some insight!

Explorer

Hello,

 

Thanks for replying.

 

Yes, I will upgrade CDH as well. However, I am unable to get the cloudera-scm-server service running after the OS upgrade: cloudera-scm-server-db doesn't start, because the OS upgrade step moves PostgreSQL from 9.3 to 9.5.

 

I have tried to migrate the database but during the restore process I get this error:

ERROR: role "cloudera-scm" does not exist

 

Is there any document on how to migrate a cloudera postgres database?
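
This error generally means the dump refers to a role that the target server does not have. Two possible ways around it are sketched below, under assumptions: the connection settings are the ones used elsewhere in this thread, and the helper names and the DRY_RUN switch are made up for illustration. Option A checks for the role and creates it before restoring; option B takes the dump without ownership and privilege statements (pg_dump's --no-owner and --no-privileges), so the restore no longer mentions the role at all. Whether option B is acceptable for the Cloudera Manager schema is untested here.

```shell
#!/bin/sh
# Two hedged ways around 'role "cloudera-scm" does not exist' on restore.
set -eu

DRY_RUN="${DRY_RUN:-1}"          # hypothetical switch: 1 = only print
CONN="-h localhost -p 7432"      # connection options quoted in the thread

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Option A: SQL to test for a role, and to create it if it is missing.
role_check_sql() {
    printf "SELECT 1 FROM pg_roles WHERE rolname = '%s'" "$1"
}
create_role_sql() {
    # Double quotes are needed because the role name contains a hyphen.
    printf 'CREATE ROLE "%s" LOGIN' "$1"
}

# Option B: dump without OWNER TO / GRANT statements.
dump_no_owner() {
    # $CONN is intentionally unquoted so it splits into separate options.
    run pg_dump $CONN -U scm --no-owner --no-privileges -f scm.sql scm
}
```

For option A, the two SQL strings would be passed to psql on the new server by a superuser, for example with psql -tAc "$(role_check_sql cloudera-scm)" followed by psql -c "$(create_role_sql cloudera-scm)" if the first command prints nothing.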

 

Thanks

Explorer

I also tried to keep postgres at 9.3 while upgrading ubuntu from 14.04 to 16.04 using:

 

However, this doesn't work either. After the upgrade I get this error:

sudo service cloudera-scm-server-db status
● cloudera-scm-server-db.service - LSB: Cloudera SCM Server's Embedded DB
Loaded: loaded (/etc/init.d/cloudera-scm-server-db; bad; vendor preset: enabled)
Active: active (exited) since Tue 2018-04-24 21:08:39 UTC; 16min ago
Docs: man:systemd-sysv-generator(8)
Process: 2028 ExecStop=/etc/init.d/cloudera-scm-server-db stop (code=exited, status=0/SUCCESS)
Process: 2078 ExecStart=/etc/init.d/cloudera-scm-server-db start (code=exited, status=0/SUCCESS)

Apr 24 21:08:39 ip-172-30-1-250 runuser[2092]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 runuser[2092]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 runuser[2109]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: bash: /usr/lib/postgresql/9.5/bin/pg_ctl: No such file or directory
Apr 24 21:08:39 ip-172-30-1-250 runuser[2109]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 runuser[2111]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: bash: /usr/lib/postgresql/9.5/bin/pg_ctl: No such file or directory
Apr 24 21:08:39 ip-172-30-1-250 runuser[2111]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: * Failed to start Cloudera manager database
Apr 24 21:08:39 ip-172-30-1-250 systemd[1]: Started LSB: Cloudera SCM Server's Embedded DB.

 

It looks like the Cloudera SCM server expects PostgreSQL 9.5 on Ubuntu 16.04. Is this hardcoded somewhere?

Contributor

I am stuck in this exact upgrade scenario. I have narrowed down the issue to the embedded postgres server. I used the do-release-upgrade process to move the server from 14.04 to 16.04 and declined to upgrade the postgres server in the process. At the end of it, I have both 9.3 and 9.5 installed and running; however I cannot find the scm or cloudera-scm roles anywhere at all and at this point I am forced to conclude this information has been lost. 

 

Unless Cloudera Manager is somehow keeping this information somewhere else? I'm aware of the contents of /etc/cloudera-scm-server/db.properties and of /var/lib/cloudera-scm-server-db/data/generated_password.txt (both of which survived the do-release-upgrade process).

 

I am planning to reinstall 14.04 and Cloudera and try this again (thankfully I am running this on a test installation as a proof of concept), but this time I will FIRST back up everything in the postgres server BEFORE running do-release-upgrade on it. And when I say back up, I mean everything in the 9.3/main config files as well as a pg_dump of the database and a pg_dumpall of the globals.
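
A backup along the lines described above might look like the following sketch. The assumptions: the embedded database listens on port 7432 and cloudera-scm is its superuser (as shown elsewhere in this thread), /etc/postgresql/9.3/main is the standard Debian/Ubuntu location of the 9.3/main config files, and BACKUP_DIR, DRY_RUN and the function names are hypothetical. A full pg_dumpall is used in place of per-database pg_dump calls so that the sketch does not have to guess database names.

```shell
#!/bin/sh
# Back up config files, roles and all databases before do-release-upgrade.
set -eu

DRY_RUN="${DRY_RUN:-1}"                           # hypothetical: 1 = only print
BACKUP_DIR="${BACKUP_DIR:-/root/cm-db-backup}"    # hypothetical location
DATA_DIR=/var/lib/cloudera-scm-server-db/data
CONN="-h localhost -p 7432 -U cloudera-scm"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_all() {
    run mkdir -p "$BACKUP_DIR"
    # Files mentioned in this thread that describe the embedded database.
    run cp -a /etc/cloudera-scm-server/db.properties "$BACKUP_DIR/"
    run cp -a "$DATA_DIR/generated_password.txt" "$BACKUP_DIR/"
    # Config of the system-wide 9.3 cluster (Debian/Ubuntu layout).
    run cp -a /etc/postgresql/9.3/main "$BACKUP_DIR/etc-postgresql-9.3-main"
    # Roles and other globals first, then every database.
    # $CONN is intentionally unquoted so it splits into separate options.
    run pg_dumpall $CONN --globals-only -f "$BACKUP_DIR/globals.sql"
    run pg_dumpall $CONN -f "$BACKUP_DIR/all-databases.sql"
}
```

Calling backup_all with DRY_RUN=0 as root performs the copy and the dumps; pg_dumpall prompts for the cloudera-scm password unless PGPASSWORD is set.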

 

I do not know if this will work, but am noting this here for anyone else's possible benefit (I will return here with any further news... there will undoubtedly be more folks landing in this situation of trying to do a release upgrade on a running instance...)

Contributor

OK, further info. The data is still there (it was a long day yesterday, I should have remembered this). By suppressing the 9.5 startup (by modifying its start.conf settings), I was able to manually start the cloudera database as follows (as root)

 

sudo -u cloudera-scm /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/cloudera-scm-server-db/data -k /var/run/cloudera-scm-server-db &

 

The cloudera-scm password is in this file:

cat /var/lib/cloudera-scm-server-db/data/generated_password.txt

 

Then you can connect using the settings in /etc/cloudera-scm-server/db.properties, which lists the port and so on:

 

/usr/lib/postgresql/9.3/bin/psql -U cloudera-scm -p 7432 -h localhost -d postgres
Password for user cloudera-scm:
psql (9.3.22)
Type "help" for help.

postgres=# \du
                                  List of roles
      Role name      |                   Attributes                   | Member of
---------------------+------------------------------------------------+-----------
 amon                |                                                | {}
 cloudera-scm        | Superuser, Create role, Create DB, Replication | {}
 hive                |                                                | {}
 hive1               |                                                | {}
 nav                 |                                                | {}
 navms               |                                                | {}
 oozie_oozie_server  |                                                | {}
 oozie_oozie_server1 |                                                | {}
 oozie_oozie_server2 |                                                | {}
 oozie_oozie_server3 |                                                | {}
 rman                |                                                | {}
 scm                 |                                                | {}

postgres=#
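
The same check can be done without typing the password, roughly as below. This is a sketch that assumes the password is the first line of generated_password.txt (I have not verified the file layout on every version) and that the 9.3 server was started manually as shown above; read_password and list_roles are names invented for the example. PGPASSWORD is the standard libpq environment variable.

```shell
#!/bin/sh
# List the roles of the embedded database non-interactively.
set -eu

PASSWORD_FILE=/var/lib/cloudera-scm-server-db/data/generated_password.txt

# Assumption: the password is the first line of the given file.
read_password() {
    head -n 1 "$1"
}

list_roles() {
    PGPASSWORD="$(read_password "$PASSWORD_FILE")" \
        /usr/lib/postgresql/9.3/bin/psql -U cloudera-scm -p 7432 \
        -h localhost -d postgres -c '\du'
}
```

Calling list_roles as root (the password file is not world-readable on my system) should print the same role table as above.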

 

 

I am now going to try to dump the contents of this database and then load the dump into a 9.5 instance in order to convert it. I looked at pg_upgrade and pg_upgradecluster, but these seem to depend on the default "main" locations, e.g.

 

pg_lsclusters
Ver Cluster Port Status Owner    Data directory               Log file
9.3 main    5432 online postgres /var/lib/postgresql/9.3/main /var/log/postgresql/postgresql-9.3-main.log
9.5 main    5433 down   postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log


and so on. They never seem to pick up the -D locations (and they don't seem to have a -D type option).
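
For reference, pg_upgrade itself (unlike the Debian pg_upgradecluster wrapper, which works on the clusters listed by pg_lsclusters) does accept explicit directories: -b/-B for the old and new binary directories and -d/-D for the old and new data directories, plus --check for a dry run. The sketch below shows how that might be applied here. It is untested against the Cloudera embedded database, and it assumes both servers are stopped, that the new cluster is initialised with the same locale and encoding as the old one, and that the commands run from a directory the cloudera-scm user can write to. The data9-5 directory name, the DRY_RUN switch and the function names are hypothetical.

```shell
#!/bin/sh
# In-place upgrade of a non-default data directory with pg_upgrade.
set -eu

DRY_RUN="${DRY_RUN:-1}"                              # hypothetical: 1 = only print
OLD_BIN=/usr/lib/postgresql/9.3/bin
NEW_BIN=/usr/lib/postgresql/9.5/bin
OLD_DATA=/var/lib/cloudera-scm-server-db/data
NEW_DATA=/var/lib/cloudera-scm-server-db/data9-5     # hypothetical name

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# pg_upgrade and initdb must run as the owner of the data directory.
as_db_owner() {
    run sudo -u cloudera-scm "$@"
}

# Create the empty 9.5 cluster that pg_upgrade will fill.
init_new_cluster() {
    as_db_owner "$NEW_BIN/initdb" -D "$NEW_DATA"
}

# Extra arguments (for example --check) are passed through to pg_upgrade.
upgrade() {
    as_db_owner "$NEW_BIN/pg_upgrade" \
        -b "$OLD_BIN" -B "$NEW_BIN" \
        -d "$OLD_DATA" -D "$NEW_DATA" "$@"
}
```

A cautious order would be init_new_cluster, then upgrade --check, and only then upgrade without --check.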

 

Contributor

Those (the steps in the accepted solution) are the same steps I've taken, except that restarting the -db service did not create a new data directory. Maybe I should re-check the permissions. I've also been working on creating the data directory manually with initdb etc., but somewhere I'm missing a password. I am about to rework pg_hba.conf to let cloudera-scm in without a password. If I can reload the dump (I used pg_dumpall in order to get the roles and permissions as well), then I think I can get this thing going. UGH... this has been an amazingly frustrating process.
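
The pg_hba.conf change mentioned above could be made roughly as follows. This is a sketch with a hypothetical helper name. Because PostgreSQL uses the first matching pg_hba.conf entry, the trust lines are put at the top of the file rather than appended. They cover only loopback connections for the one role, and they should be removed again once the restore is done.

```shell
#!/bin/sh
# Temporarily allow password-less loopback logins for one role.
set -eu

# usage: allow_local_trust /path/to/pg_hba.conf role-name
allow_local_trust() {
    hba="$1"
    role="$2"
    tmp="$(mktemp)"
    {
        # First match wins, so these entries must precede the existing ones.
        echo "host all $role 127.0.0.1/32 trust"
        echo "host all $role ::1/128 trust"
        cat "$hba"
    } > "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$hba"
    rm -f "$tmp"
}
```

After something like allow_local_trust /var/lib/cloudera-scm-server-db/data/pg_hba.conf cloudera-scm (assuming that is where the embedded server keeps its pg_hba.conf), the server has to re-read its configuration, for example with pg_ctl reload -D on the same data directory.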