Created 04-23-2018 12:24 PM
We are currently running Cloudera Express 5.10.0 on Ubuntu 14.04.5. The cluster was installed using Cloudera Manager.
We need to upgrade to Ubuntu 16.04 because 14.04 is approaching EOL. What is the recommended process?
Do we upgrade Cloudera first and then Ubuntu, or Ubuntu first and then Cloudera?
I have tried upgrading Ubuntu first, but do-release-upgrade upgrades PostgreSQL from 9.3 to 9.5. Because the database formats of 9.3 and 9.5 are not compatible, cloudera-scm-server-db doesn't start. I have tried to migrate the database using pg_dump/pg_restore, but have failed so far.
If there is a document available somewhere, please let me know.
Created 04-27-2018 06:08 PM
I was able to resolve this problem using the following process:
Step 1: Take a dump of the running postgres database on Ubuntu 14.04
# sudo su
# su - postgres
# pg_dump -h localhost -p 7432 -U scm scm > scm.sql
Step 2: Upgrade Ubuntu to 16.04
# sudo do-release-upgrade
Step 3: Rename the old data directory
# mv /var/lib/cloudera-scm-server-db/data/ /var/lib/cloudera-scm-server-db/data9-3
Step 4: Restart the cloudera-scm-server-db service. This creates an empty database, which we will populate using the backup taken in step 1
# sudo service cloudera-scm-server-db restart
Step 5: Now restore the database
# sudo su
# su - postgres
# psql -h localhost -p 7432 -U scm
(password can be obtained like this: grep password /etc/cloudera-scm-server/db.properties)
scm> \i scm.sql
Step 6: Now restart cloudera-scm-server service:
# sudo service cloudera-scm-server restart
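For anyone scripting step 5, here is a minimal sketch of reading the password from db.properties instead of copying it by hand. The property key com.cloudera.cmf.db.password is an assumption about how the file is laid out, and the function name is made up for illustration; check your own file before relying on it.

```shell
#!/usr/bin/env bash
# Print the database password stored in a Cloudera Manager db.properties file.
# Assumption: the password is on a line starting with
# "com.cloudera.cmf.db.password=". The function name is hypothetical.
scm_db_password() {
  local props="${1:-/etc/cloudera-scm-server/db.properties}"
  # Strip the key and '=' from the matching line and print the rest unchanged.
  sed -n 's/^com\.cloudera\.cmf\.db\.password=//p' "$props" | head -n 1
}

# Possible use with the restore from step 5 (psql reads PGPASSWORD):
#   PGPASSWORD="$(scm_db_password)" psql -h localhost -p 7432 -U scm -f scm.sql
```

This only saves the copy and paste; the restore itself is the same \i scm.sql as in step 5.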
However, I'm now running into another problem: the cloudera-scm-agent services do not start. There is an error in supervisord.out. I will open another thread for this:
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/build/env/bin/supervisord", line 8, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 36, in <module>
    import plistlib
  File "/usr/lib/python2.7/plistlib.py", line 62, in <module>
    import datetime
ImportError: No module named datetime
Created 04-24-2018 02:25 AM
The link below is for the Enterprise edition; I believe it should be the same for the other editions too.
https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html
1. First, according to the above link, Ubuntu Xenial 16.04 is supported by CDH 5.12.2 or above. So if you have decided to upgrade Ubuntu, you have to upgrade CDH/CM as well.
2. Second, according to the link below: "If you are upgrading CDH or Cloudera Manager as well as the OS, upgrade the OS first"
https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_ag_upgrading_os.html
Hope this gives some insight!
Created 04-24-2018 07:41 AM
Hello,
Thanks for replying.
Yes, I will upgrade CDH as well. However, I am unable to get cloudera-scm-server service running after OS upgrade. This is because the cloudera-scm-server-db service doesn't start because the postgres version gets updated from 9.3 to 9.5 during the OS upgrade step.
I have tried to migrate the database but during the restore process I get this error:
ERROR: role "cloudera-scm" does not exist
Is there any document on how to migrate a cloudera postgres database?
Thanks
Created 04-24-2018 02:33 PM
I also tried to keep postgres at 9.3 while upgrading ubuntu from 14.04 to 16.04 using:
However this doesn't work either. After upgrade I get this error:
sudo service cloudera-scm-server-db status
● cloudera-scm-server-db.service - LSB: Cloudera SCM Server's Embedded DB
Loaded: loaded (/etc/init.d/cloudera-scm-server-db; bad; vendor preset: enabled)
Active: active (exited) since Tue 2018-04-24 21:08:39 UTC; 16min ago
Docs: man:systemd-sysv-generator(8)
Process: 2028 ExecStop=/etc/init.d/cloudera-scm-server-db stop (code=exited, status=0/SUCCESS)
Process: 2078 ExecStart=/etc/init.d/cloudera-scm-server-db start (code=exited, status=0/SUCCESS)
Apr 24 21:08:39 ip-172-30-1-250 runuser[2092]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 runuser[2092]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 runuser[2109]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: bash: /usr/lib/postgresql/9.5/bin/pg_ctl: No such file or directory
Apr 24 21:08:39 ip-172-30-1-250 runuser[2109]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 runuser[2111]: pam_unix(runuser:session): session opened for user cloudera-scm by (uid=0)
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: bash: /usr/lib/postgresql/9.5/bin/pg_ctl: No such file or directory
Apr 24 21:08:39 ip-172-30-1-250 runuser[2111]: pam_unix(runuser:session): session closed for user cloudera-scm
Apr 24 21:08:39 ip-172-30-1-250 cloudera-scm-server-db[2078]: * Failed to start Cloudera manager database
Apr 24 21:08:39 ip-172-30-1-250 systemd[1]: Started LSB: Cloudera SCM Server's Embedded DB.
It looks like the Cloudera SCM server is looking for the 9.5 version of postgres on Ubuntu 16.04. Is this hardcoded somewhere?
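One way to narrow this down is to compare the pg_ctl binaries that actually exist on the host with the path the init script asks for. A small sketch, assuming the usual Ubuntu layout under /usr/lib/postgresql/<version>/bin; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# List every executable pg_ctl found under a PostgreSQL install root.
# Assumption: binaries live in <root>/<version>/bin, as on Ubuntu packages.
list_pg_ctl() {
  local root="${1:-/usr/lib/postgresql}"
  local f
  for f in "$root"/*/bin/pg_ctl; do
    # An unmatched glob stays literal and fails this check, so nothing prints.
    [ -x "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

If the output shows only a 9.3 path, the 9.5 path in the log above cannot resolve. Running grep -n postgresql /etc/init.d/cloudera-scm-server-db may show where the script derives the version from.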
Created on 04-25-2018 04:11 PM - edited 04-25-2018 04:12 PM
I am stuck in this exact upgrade scenario. I have narrowed down the issue to the embedded postgres server. I used the do-release-upgrade process to move the server from 14.04 to 16.04 and declined to upgrade the postgres server in the process. At the end of it, I have both 9.3 and 9.5 installed and running; however I cannot find the scm or cloudera-scm roles anywhere at all and at this point I am forced to conclude this information has been lost.
Unless the cloudera manager is somehow keeping this stuff somewhere else? I'm aware of the contents of /etc/cloudera-scm-server/db.properties and of /var/lib/cloudera-scm-server-db/data/generated_password.txt (both of which survived the do-release-upgrade process).
I am planning to reinstall 14.04 and Cloudera and try this again (thankfully I am running this on a test installation as a proof of concept), but FIRST backing up everything in the postgres server BEFORE running do-release-upgrade on it. And when I say everything, I mean the 9.3/main config files as well as a pg_dump of the database and a pg_dumpall of the globals.
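A sketch of what that pre-upgrade backup could look like. pg_dump of a single database does not carry role definitions, which is consistent with the 'role "cloudera-scm" does not exist' error reported earlier in this thread; pg_dumpall --globals-only captures them. Port 7432 and the cloudera-scm superuser are taken from this thread; the function name, output directory and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Back up the embedded Cloudera Manager PostgreSQL instance before the OS upgrade.
# By default the commands are only printed; pass "run" as the second argument
# to execute them. Output file names are hypothetical.
backup_embedded_db() {
  local outdir="$1" mode="${2:-dry-run}" run="echo"
  if [ "$mode" = "run" ]; then
    run=""
  fi
  # Roles and other cluster-wide objects only.
  $run pg_dumpall -h localhost -p 7432 -U cloudera-scm --globals-only -f "$outdir/globals.sql"
  # Every database in the instance, globals included, as one plain SQL file.
  $run pg_dumpall -h localhost -p 7432 -U cloudera-scm -f "$outdir/all-databases.sql"
}
```

Calling backup_embedded_db /root/pg-backup prints the two commands for review; adding run as a second argument executes them and prompts for the cloudera-scm password.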
I do not know if this will work, but am noting this here for anyone else's possible benefit (I will return here with any further news... there will undoubtedly be more folks landing in this situation of trying to do a release upgrade on a running instance...)
Created 04-26-2018 12:49 PM
OK, further info. The data is still there (it was a long day yesterday, I should have remembered this). By suppressing the 9.5 startup (by modifying its start.conf settings), I was able to manually start the cloudera database as follows (as root)
sudo -u cloudera-scm /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/cloudera-scm-server-db/data -k /var/run/cloudera-scm-server-db &
Get the cloudera-scm password in this file:
cat /var/lib/cloudera-scm-server-db/data/generated_password.txt
And then you can connect, according to the info in /etc/cloudera-scm-server/db.properties which has the ports, etc
/usr/lib/postgresql/9.3/bin/psql -U cloudera-scm -p 7432 -h localhost -d postgres
Password for user cloudera-scm:
psql (9.3.22)
Type "help" for help.
postgres=# \du
                                  List of roles
      Role name      |                   Attributes                   | Member of
---------------------+------------------------------------------------+-----------
 amon                |                                                | {}
 cloudera-scm        | Superuser, Create role, Create DB, Replication | {}
 hive                |                                                | {}
 hive1               |                                                | {}
 nav                 |                                                | {}
 navms               |                                                | {}
 oozie_oozie_server  |                                                | {}
 oozie_oozie_server1 |                                                | {}
 oozie_oozie_server2 |                                                | {}
 oozie_oozie_server3 |                                                | {}
 rman                |                                                | {}
 scm                 |                                                | {}
postgres=#
I am now going to try to dump the contents of this database, and then upload it to a 9.5 instance in order to convert the stuff. I looked at pg_upgrade(cluster) but these seem to depend on default "main" locations, eg
pg_lsclusters
Ver Cluster Port Status Owner    Data directory               Log file
9.3 main    5432 online postgres /var/lib/postgresql/9.3/main /var/log/postgresql/postgresql-9.3-main.log
9.5 main    5433 down   postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log
etc, never seem to pick up the -D locations (and they don't seem to have a -D type option)
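For what it's worth, the upstream pg_upgrade binary (as opposed to the Debian pg_upgradecluster wrapper, which works on named clusters) does take explicit directories: -b/-B for the old and new bin directories and -d/-D for the old and new data directories, plus --check for a dry compatibility check. Below is a sketch that only prints such a command. It has not been tried against the Cloudera embedded database; the 9.5 data directory name is hypothetical, the binary paths assume the Ubuntu package layout, and the new directory would first have to be created with the 9.5 initdb, with both servers stopped.

```shell
#!/usr/bin/env bash
# Print (not run) a pg_upgrade compatibility check for a data directory that is
# not a Debian "main" cluster. The default new data directory is hypothetical.
print_pg_upgrade_check() {
  local old_data="${1:-/var/lib/cloudera-scm-server-db/data}"
  local new_data="${2:-/var/lib/cloudera-scm-server-db/data9-5}"
  local pg_lib="/usr/lib/postgresql"
  # pg_upgrade has to run as the OS user that owns both data directories.
  printf '%s\n' "sudo -u cloudera-scm $pg_lib/9.5/bin/pg_upgrade --check -b $pg_lib/9.3/bin -B $pg_lib/9.5/bin -d $old_data -D $new_data"
}
```

If the check passes, the same command without --check performs the upgrade. The dump and restore route described elsewhere in this thread avoids pg_upgrade entirely.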
Created 04-27-2018 06:59 PM
Those are the same steps I've taken, except that restarting the -db service did not create a new data directory. Maybe I should re-check the permissions. I've also been working on creating the data directory manually with initdb, etc., but somewhere I'm missing a password. I'm about to rework pg_hba.conf to let cloudera-scm in without a password. If I can reload the dump (I used pg_dumpall in order to get the roles and permissions as well), then I think I can get this thing going. UGH... this has been an amazingly frustrating process.