Created on 09-30-2014 06:41 AM - edited 09-16-2022 02:08 AM
I've tried both restoring from scm postgresql database backup and from a file backup of config and postgres (database stopped for backup). Both fail complaining about missing column
Caused by: org.postgresql.util.PSQLException: ERROR: column dbrole0_.config_staleness_status does not exist
Has anyone successfully restored CM on a new server?
Thanks, John.
Created 09-30-2014 07:10 AM
It sounds most likely something's just out of sequence. I could see this working fine (and have performed it numerous times) as long as sequence is maintained.
So,
Given a consistent backup from this mythical node1, I don't see how a column can go missing unless some specific [pg_dump|mysqldump[ export options were applied, or some sort of import exclusion happened. What syntax did you use both on the export and the import side of things for the RDBMS? Also, which RDBMS are you using specifically?
Thanks,
--
Mark Schnegelberger
Created 09-30-2014 06:47 AM
Are you restoring to the exact same version of Cloudera Manager on the new server, as the version from which the backup was taken?
Created 09-30-2014 06:50 AM
Yes, I am testing the backup procedure before deploying in live. Repeated numerous times and get the same error. I upgraded from 5.0.0 to 5.1.3 first then backed up the database.
Created 09-30-2014 07:10 AM
It sounds most likely something's just out of sequence. I could see this working fine (and have performed it numerous times) as long as sequence is maintained.
So,
Given a consistent backup from this mythical node1, I don't see how a column can go missing unless some specific [pg_dump|mysqldump[ export options were applied, or some sort of import exclusion happened. What syntax did you use both on the export and the import side of things for the RDBMS? Also, which RDBMS are you using specifically?
Thanks,
--
Mark Schnegelberger
Created 09-30-2014 07:44 AM
I didn't stop all agents just CM. I intially tried just restoring scm, then amon,nav,rman and csm but failed with same problem. Local postgresql database. It is odd a file restore also fails so all config and database will be consistent.
On cm1, tried with --clean and without (just will drop tables if exist)
export PGPASSWORD=passwordfromfile
pg_dump -h localhost -p 7432 -U cloudera-scm --clean --file=csm_dump_amon.sql amon
pg_dump -h localhost -p 7432 -U cloudera-scm --clean --file=csm_dump_nav.sql nav
pg_dump -h localhost -p 7432 -U cloudera-scm --clean --file=csm_dump_rman.sql rman
pg_dump -h localhost -p 7432 -U cloudera-scm --clean --file=csm_dump_scm.sql scm
On cm2 built automatically in Amazon opsworks so has a blank CM config.
sudo service cloudera-scm-server stop
dropdb -h localhost -p 7432 -U cloudera-scm 'scm'
createdb -h localhost -p 7432 -U cloudera-scm 'scm'
export PGPASSWORD=newpasswordfromfile
psql -h localhost -p 7432 -U cloudera-scm -d amon -f csm_dump_amon.sql
psql -h localhost -p 7432 -U cloudera-scm -d nav -f csm_dump_nav.sql
psql -h localhost -p 7432 -U cloudera-scm -d rman -f csm_dump_rman.sql
psql -h localhost -p 7432 -U cloudera-scm -d scm -f csm_dump_scm.sql
sudo service cloudera-scm-server start
Java runs for a minute then dies. /var/log/cloudera-scm-server/db.log
ERROR: column dbrole0_.config_staleness_status does not exist at character 417
STATEMENT: select dbrole0_.ROLE_ID as ROLE1_28_0_, dbrole0_.OPTIMISTIC_LOCK_VERSION as OPTIMIST2_28_0_, dbrole0_.NAME as NAME3_28_0_, d
brole0_.HOST_ID as HOST4_28_0_, dbrole0_.ROLE_TYPE as ROLE5_28_0_, dbrole0_.CONFIGURED_STATUS as CONFIGUR6_28_0_, dbrole0_.MERGED_KEYTAB
as MERGED7_28_0_, dbrole0_.MAINTENANCE_COUNT as MAINTENA8_28_0_, dbrole0_.DECOMMISSION_COUNT as DECOMMIS9_28_0_, dbrole0_.HAS_EVER_STAR
TED as HAS10_28_0_, dbrole0_.CONFIG_STALENESS_STATUS as CONFIG11_28_0_, dbrole0_.SERVICE_ID as SERVICE12_28_0_, dbrole0_.ROLE_CONFIG_GRO
UP_ID as ROLE13_28_0_ from ROLES dbrole0_ where dbrole0_.ROLE_ID in ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
ERROR: current transaction is aborted, commands ignored until end of transaction block
STATEMENT: SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME, CASE n.nspname ~ '^pg_' OR n.nspname = 'inform
ation_schema' WHEN true THEN CASE WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind WHEN 'r' THE
N 'SYSTEM TABLE' WHEN 'v' THEN 'SYSTEM VIEW' WHEN 'i' THEN 'SYSTEM INDEX' ELSE NULL END WHEN n.nspname = 'pg_toast' THEN CASE c
.relkind WHEN 'r' THEN 'SYSTEM TOAST TABLE' WHEN 'i' THEN 'SYSTEM TOAST INDEX' ELSE NULL END ELSE CASE c.relkind WHEN 'r' THE
N 'TEMPORARY TABLE' WHEN 'i' THEN 'TEMPORARY INDEX' WHEN 'S' THEN 'TEMPORARY SEQUENCE' WHEN 'v' THEN 'TEMPORARY VIEW' ELSE NULL
END END WHEN false THEN CASE c.relkind WHEN 'r' THEN 'TABLE' WHEN 'i' THEN 'INDEX' WHEN 'S' THEN 'SEQUENCE' WHEN 'v' THEN 'VIEW'
WHEN 'c' THEN 'TYPE' ELSE NULL END ELSE NULL END AS TABLE_TYPE, d.description AS REMARKS FROM pg_catalog.pg_namespace n, pg_cata
log.pg_class c LEFT JOIN pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0) LEFT JOIN pg_catalog.pg_class dc ON (d.c
lassoid=dc.oid AND dc.relname='pg_class') LEFT JOIN pg_catalog.pg_namespace dn ON (dn.oid=dc.relnamespace AND dn.nspname='pg_catalog')
WHERE c.relnamespace = n.oid AND c.relname LIKE 'PROBABLYNOT' AND (false OR ( c.relkind = 'r' AND n.nspname !~ '^pg_' AND n.nspname
<> 'information_schema' ) ) ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME
Created 10-01-2014 01:26 AM
I've just checked the versions and yes they are different so no wonder it fails! Opsworks normally automatically installs the latest version but I'd overidden it for the upgrade testing. Thanks, John.