Created on 06-29-2016 11:23 AM - edited 09-16-2022 03:28 AM
Hi all,
I am trying to back up the Cloudera Manager database, but I can't find how to actually do it.
I found this:
"Cloudera Manager - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (<100 MB) is the most important to back up."
Here is the link: http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_ag_backup_dbs.html
Any help is much appreciated.
Thank you,
D.
Created on 06-29-2016 02:55 PM - edited 06-29-2016 06:52 PM
The backup command is RDBMS-vendor specific and depends on your database server (PostgreSQL, MySQL, or Oracle).
Do you know which database type you are using? If you're unsure, use the command below to confirm.
# grep -oE 'db.(type|host)=.*' /etc/cloudera-scm-server/db.properties
If it's embedded/Postgres, use the guide 'Backing Up PostgreSQL Databases' [1]
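As a rough sketch of the check above, the db.type value can be read from the properties file and mapped to the dump tool that usually applies. The function names db_type_from_file and suggest_dump_tool are hypothetical, and the type values (postgresql, mysql, oracle) are assumed to match what your db.properties contains; verify against your own file.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Cloudera Manager.

# Print the value of the "...db.type=" line from a properties file.
# $1 = path to the properties file (key=value lines assumed).
db_type_from_file() {
    sed -n 's/^[^#]*db\.type=//p' "$1" | head -n 1
}

# Map a database type to the dump tool family that usually applies.
# The type strings are assumptions; adjust them to your environment.
suggest_dump_tool() {
    case "$1" in
        postgresql) echo "pg_dump" ;;
        mysql)      echo "mysqldump" ;;
        oracle)     echo "Oracle export tooling" ;;
        *)          echo "unknown" ;;
    esac
}
```

For example, suggest_dump_tool "$(db_type_from_file /etc/cloudera-scm-server/db.properties)" prints the suggested tool for the configured database.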
Created on 06-30-2016 03:11 AM - edited 06-30-2016 03:35 AM
Hi Michalis,
It worked, thank you! 🙂
Created on 10-26-2016 09:16 AM - edited 10-26-2016 09:19 AM
The document says to run the command below as the root user on the host running the CM server:
# pg_dump -h hostname -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)
I have a question here. No database is specified for the dump (-d switch); only the user is given as scm (-U). How does it dump the Cloudera Manager database?
We use the embedded database in our cluster. All databases, such as the Hive metastore and Oozie, are on the same host that runs the Cloudera Manager server and the embedded Postgres database. If I have to back up the other databases, is it enough to provide only the user (-U) switch? Does it dump only the database owned by that user by default?
Also, would it cause any performance issues if I dump the Cloudera Manager, Hive, and Oozie databases in a production environment?
Created on 10-26-2016 09:39 AM - edited 10-26-2016 01:49 PM
Hi AnandMS,
From what I can tell, pg_dump expects the form pg_dump [connection-option...] [option...] [dbname]; see
https://www.postgresql.org/docs/9.1/static/app-pgdump.html
In my environment, grep -oE 'db.(user|name)=.*' /etc/cloudera-scm-server/db.properties gives the following output:
db.name=scm
db.user=cloudera-scm
Therefore I pass the db.user value as -U cloudera-scm, followed by the db.name value, scm.
Effectively, your command should look like:
# pg_dump -h hostname -p 7432 -U cloudera-scm scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)
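For context, the PostgreSQL documentation says that when no dbname is given, pg_dump falls back to the PGDATABASE environment variable and then to the connection user name, which is why -U scm alone can still dump a database named scm. To avoid relying on that, here is a minimal sketch that reads the user and database name from the properties file and prints the resulting command for review. The function names get_db_prop and build_pg_dump_cmd are hypothetical, and the file is assumed to hold plain key=value lines.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Cloudera Manager or PostgreSQL.

# Print the value of the "...db.<key>=" line from a properties file.
# $1 = short key (for example "name" or "user"), $2 = properties file.
get_db_prop() {
    sed -n "s/^[^#]*db\.$1=//p" "$2" | head -n 1
}

# Print (do not run) a pg_dump command with an explicit user and dbname.
# $1 = properties file, $2 = host, $3 = port.
build_pg_dump_cmd() {
    db_user=$(get_db_prop user "$1")
    db_name=$(get_db_prop name "$1")
    printf 'pg_dump -h %s -p %s -U %s %s\n' "$2" "$3" "$db_user" "$db_name"
}
```

Running build_pg_dump_cmd /etc/cloudera-scm-server/db.properties hostname 7432 prints the command so it can be checked before it is run with its output redirected to a backup file, as in the command above.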
Created 10-27-2016 01:58 AM
Thank you Michalis. It makes sense. I was confused because the db name was not being specified in the Cloudera documentation.
Also, do we have to take any precautions before dumping the databases (CM, Hive metastore, and Oozie)? Does it affect anything if I do it on a cluster while jobs are running?
Created 11-04-2016 01:47 AM
Hi Michalis,
It worked just fine for me. The dumping process took only a few seconds and there was no impact.
Thank you!
Created 02-02-2017 01:32 AM