HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

New Contributor

Something inconceivable happened! Our Hadoop cluster was removed after we rebooted the cloudera-scm-server node!

Our company has used Cloudera Manager for two years to manage our Hadoop clusters, which hold more than 200 TB of data. Yesterday we had some trouble with the cloudera-scm-server node and had to reboot it. After the reboot, however, cloudera-scm-server could not read data from the embedded PostgreSQL database. Half an hour later, we found that our clusters had been REMOVED by Cloudera Manager! The NameNode and DataNode daemons are missing, and the data under /opt/cloudera/ is missing too!

/var/log/cloudera-scm-server/cloudera-scm-server.log is shown below:

2016-06-28 10:51:03,363 INFO main:com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource: Initializing c3p0 pool... com.mchange.v2.c3p0.PoolBackedDataSource@3f1ec5b1 [ connectionPoolDataSource -> com.mchange.v2.c3p0.WrapperConnectionPoolDataSource@1647df60 [ acquireIncrement -> 3, acquireRetryAttempts -> 5, acquireRetryDelay -> 1000, autoCommitOnClose -> false, automaticTestTable -> null, breakAfterAcquireFailure -> false, checkoutTimeout -> 20000, connectionCustomizerClassName -> null, connectionTesterClassName -> com.mchange.v2.c3p0.impl.DefaultConnectionTester, debugUnreturnedConnectionStackTraces -> false, factoryClassLocation -> null, forceIgnoreUnresolvedTransactions -> false, identityToken -> 2rxixj9h15smzr9oto31a|235fb18d, idleConnectionTestPeriod -> 300, initialPoolSize -> 3, maxAdministrativeTaskTime -> 0, maxConnectionAge -> 0, maxIdleTime -> 0, maxIdleTimeExcessConnections -> 0, maxPoolSize -> 50, maxStatements -> 2500, maxStatementsPerConnection -> 0, minPoolSize -> 5, nestedDataSource -> com.mchange.v2.c3p0.DriverManagerDataSource@8c74636f [ description -> null, driverClass -> null, factoryClassLocation -> null, identityToken -> 2rxixj9h15smzr9oto31a|5e552a98, jdbcUrl -> jdbc:postgresql://localhost:7432/scm, properties -> {user=******, password=******, autocommit=true, release_mode=auto} ], preferredTestQuery -> null, propertyCycle -> 0, testConnectionOnCheckin -> false, testConnectionOnCheckout -> false, unreturnedConnectionTimeout -> 0, usesTraditionalReflectiveProxies -> false; userOverrides: {} ], dataSourceName -> null, factoryClassLocation -> null, identityToken -> 2rxixj9h15smzr9oto31a|4863e4f2, numHelperThreads -> 3 ]
2016-06-28 10:51:07,411 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@27916d4b -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception:
org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:136)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at org.postgresql.core.PGStream.<init>(PGStream.java:62)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:76)
... 16 more
2016-06-28 10:51:07,411 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@40573eeb -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception:
org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:136)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at org.postgresql.core.PGStream.<init>(PGStream.java:62)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:76)
... 16 more

2016-06-28 10:51:07,419 WARN main:org.hibernate.engine.jdbc.internal.JdbcServicesImpl: HHH000342: Could not obtain connection to query metadata : Connections could not be acquired from the underlying database!
2016-06-28 10:51:07,440 INFO main:org.hibernate.dialect.Dialect: HHH000400: Using dialect: org.hibernate.dialect.PostgreSQLDialect
2016-06-28 10:51:07,471 INFO main:org.hibernate.engine.jdbc.internal.LobCreatorBuilder: HHH000422: Disabling contextual LOB creation as connection was null
2016-06-28 10:51:07,648 INFO main:org.hibernate.engine.transaction.internal.TransactionFactoryInitiator: HHH000268: Transaction strategy: org.hibernate.engine.transaction.internal.jdbc.JdbcTransactionFactory
2016-06-28 10:51:07,654 INFO main:org.hibernate.hql.internal.ast.ASTQueryTranslatorFactory: HHH000397: Using ASTQueryTranslatorFactory
2016-06-28 10:51:07,805 INFO main:org.hibernate.cache.spi.UpdateTimestampsCache: HHH000250: Starting update timestamps cache at region: org.hibernate.cache.spi.UpdateTimestampsCache
2016-06-28 10:51:07,808 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [org.hibernate.cache.spi.UpdateTimestampsCache]; using defaults.
2016-06-28 10:51:07,810 INFO main:org.hibernate.cache.internal.StandardQueryCache: HHH000248: Starting query cache at region: org.hibernate.cache.internal.StandardQueryCache
2016-06-28 10:51:07,811 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [org.hibernate.cache.internal.StandardQueryCache]; using defaults.
2016-06-28 10:51:07,844 INFO main:org.hibernate.validator.internal.util.Version: HV000001: Hibernate Validator 5.0.1.Final
2016-06-28 10:51:08,611 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbHost]; using defaults.

 

2016-06-28 10:51:09,141 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbRelease]; using defaults.
2016-06-28 10:51:09,146 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbConfig]; using defaults.
2016-06-28 10:51:09,197 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbService.configsForDb]; using defaults.
2016-06-28 10:51:09,200 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbCluster.activatedReleases]; using defaults.
2016-06-28 10:51:09,204 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbHostTemplate.roleConfigGroups]; using defaults.
2016-06-28 10:51:09,205 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbRoleConfigGroup.roles]; using defaults.
2016-06-28 10:51:09,206 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbHost.roles]; using defaults.
2016-06-28 10:51:09,207 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbService.roleConfigGroups]; using defaults.
2016-06-28 10:51:09,208 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbConfigContainer.configsForDb]; using defaults.
2016-06-28 10:51:09,209 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbProcess.parcels]; using defaults.
2016-06-28 10:51:09,210 WARN main:org.hibernate.cache.ehcache.AbstractEhcacheRegionFactory: HHH020003: Could not find a specific ehcache configuration for cache named [com.cloudera.cmf.model.DbCluster.hostTemplates]; using defaults.
2016-06-28 10:51:09,972 INFO main:com.cloudera.enterprise.CommonMain: Statistics not enabled, JMX will not be registered
2016-06-28 10:51:11,424 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@44ec607c -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception:
org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:136)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

Cloudera Employee

Hi yeahmobi,

Could you verify using psql that you can indeed log in to the database? If not, you will need to bring up your PostgreSQL database before expecting Cloudera Manager to function well.
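
Before reaching for psql, a plain TCP check can tell "nothing is listening" apart from "login fails". The sketch below is only an illustration. It assumes the embedded database listens on localhost:7432, as the jdbcUrl in the posted log suggests, and the function name port_accepts_connections is made up for this example.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: host and port come from the jdbcUrl shown in the log.
    if port_accepts_connections("localhost", 7432):
        print("Port 7432 accepts connections; try logging in with psql next.")
    else:
        print("Nothing accepts connections on port 7432; start the database first.")
```

A False result here matches the "Connection refused" errors in the log: the server process is simply not listening, so psql would fail in the same way.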

 

I hope you have been taking regular backups of the CM database and HDFS metadata, as suggested in our docs:

http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_ag_backup_dbs.html

http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_mc_hdfs_metadata_backup.html

 

Best Regards,

Marcell

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

Master Guru
This may happen if you've accidentally re-initialized your data, or if your reboot wiped the entire /var/ rather than just /var/run and /var/tmp. OS defaults usually do not do this.

You can check whether you still have your old, valid DB files under /var/lib/cloudera-scm-server-db/. If you do not, don't worry: add a new cluster to CM again, using the SAME DFS paths for the NameNode and DataNodes. Your data will stay intact as long as it was not under /var too (given your data size, you most likely had it on other disks).

Please also check whether your /etc/cloudera-scm-server/ directory has some older db.properties file backups, in case the problem is simply that CM cannot connect to the running DB instance (after confirming the DB is up with 'service cloudera-scm-server-db status').
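
To compare the current db.properties with its older backups, something along these lines could help. It is a rough sketch, not a Cloudera tool: it assumes the files are simple key=value property files, masks any key containing "password" (an assumption about how the keys are named), and needs to run as a user that can read the directory. The helper names parse_properties and summarize are made up here.

```python
from pathlib import Path


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def summarize(path):
    """Return the settings in a properties file with password values masked."""
    props = parse_properties(Path(path).read_text())
    return {k: ("******" if "password" in k.lower() else v)
            for k, v in props.items()}


if __name__ == "__main__":
    conf_dir = Path("/etc/cloudera-scm-server")
    # Matches db.properties plus backups such as db.properties.20140702-103716.
    for candidate in sorted(conf_dir.glob("db.properties*")):
        print(candidate.name, summarize(candidate))
```

If the current file points at a different host, port, or database type than the older copies, that difference alone could explain why CM cannot reach the database it used before.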

P.s. If this is production, please also work on not using the default embedded database in future, as recommended in our installation docs: http://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_embed_pstgrs.html ("This procedure should be used only when creating a demonstration or proof-of-concept deployment. It is not recommended for production."), so that your CM DB data may be regularly backed up by your DB Ops.

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

New Contributor

We have already re-initialized CM and the clusters to recover our business, which is based on Hadoop.

However, we need to know why the clusters were removed by CM.

Data in /var/run and /var/tmp may have been lost, since we use EC2 nodes on AWS, but /var/lib/cloudera-scm-server-db/ is still there.

In the new CM we now use MySQL instead of the embedded PostgreSQL database.

Searching Google, I found many other problems caused by the embedded PostgreSQL database.

Could you help us find out why this happened? Under what conditions would CM remove a cluster?

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

Master Guru
Would a reboot of EC2 wipe the entire /var, or just specific dirs under
/var?

Effectively your CM may have lost its DB directory, which was at the
/var/lib/cloudera-scm-server-db/ location. This is not done by CM, but
we've seen it happen to a few users where a reboot ends up wiping the entire
/var due to some cleanup scripts.

In essence CM cannot start or recall a cluster if the persisted DB data has
been blown away, and would therefore show an empty (no cluster) state, as on
a fresh startup.

Good to hear you've moved to MySQL and were able to get your cluster back
up and running. Having a CM DB backup around would help recover faster if
anything happens to the underlying DB again.
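
For the backup itself, a dated mysqldump run from cron is one common approach. The sketch below only builds and prints the command so it can be reviewed first. The database name "scm", the user "scm", the host name, and the /backups directory are all placeholders, not values from this thread.

```python
import datetime
import shlex


def build_backup_command(db_name, user, host, out_dir, today=None):
    """Build a mysqldump command line that writes a dated SQL dump."""
    today = today or datetime.date.today()
    out_file = f"{out_dir}/{db_name}-{today.isoformat()}.sql"
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",            # no value given, so mysqldump prompts for it
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",
        db_name,
    ]


if __name__ == "__main__":
    # All four arguments are hypothetical; substitute your own values.
    cmd = build_backup_command("scm", "scm", "db.example.internal", "/backups")
    print(shlex.join(cmd))
```

Once the printed command looks right, it could be run by hand or passed to subprocess.run(cmd, check=True).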

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

New Contributor

EC2 won't wipe /var, although /var/run and /var/tmp may have been lost. However, the data in /var/lib/cloudera-scm-server-db/ is still there, and I can tar it up to show you if needed.

All we found was that the embedded PostgreSQL could not start and that cloudera-scm-server logged errors. Could you tell us the real reason the cluster disappeared when the embedded PostgreSQL could not start, even though the cloudera-scm-server-db/ data still existed?

And how can we make sure the cluster will not break if MySQL cannot be reached because of a network error?

Could I get your email address? This case is very important to us.

Thank you.

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

New Contributor

In any case, CM should NOT remove the clusters automatically, even if the DB is empty! That is terrible!

And in our situation:

1. The embedded PostgreSQL data is still in /var/lib/cloudera-scm-server-db/.

2. The embedded PostgreSQL daemon is gone because of some error, but we found no errors about PostgreSQL startup in pg_log.

3. cloudera-scm-server logged errors about failing to connect to PostgreSQL, as shown in the log we posted.

Does this mean that cloudera-scm-server decided the DB was down after its reconnection attempts timed out?

Is it true that cloudera-scm-server removes clusters whenever it cannot connect to its DB, whether that is the embedded PostgreSQL or MySQL?

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

Master Guru
> In any case, CM should NOT remove the clusters automatically, even if the DB is empty!

It has not removed your cluster; rather, it does not see one (and so presents you with an empty CM). We have an internal request to show an error saying that no DB is connectable instead, which would point to the actual issue rather than causing this kind of alarm. Point taken.

Could you check the other point I mentioned: what lies under the /etc/cloudera-scm-server/ directory, and whether there are any older db.properties autosaves inside it?

If you are very certain the data still exists in the embedded DB's directories, then it's only a matter of recovering the DB properly to get your cluster back.

Please attach/pastebin your /var/log/cloudera-scm-server-db, /var/log/cloudera-scm-server/ and /etc/cloudera-scm-server/ directory listings and contents if you're able to, to help us investigate the root cause.

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

New Contributor

By "CM removed the clusters" I mean that we can no longer find the NameNode, DataNode, and HBase daemons on the other nodes of this cluster.

Maybe each scm-agent talked to scm-server, scm-server answered "you should not be in", and the scm-agent then went away along with its daemons.

The older db.properties files under /etc/cloudera-scm-server/ are from years ago, not from recent days.

ll /etc/cloudera-scm-server/
total 48
-rw------- 1 cloudera-scm cloudera-scm 430 Jul 8 2014 cmf.keytab
-rw------- 1 cloudera-scm cloudera-scm 32 Jul 8 2014 cmf.principal
-rw------- 1 cloudera-scm cloudera-scm 1511 Nov 26 2014 db.mgmt.properties
-rw------- 1 cloudera-scm cloudera-scm 1216 Nov 26 2014 db.mgmt.properties.20141126-040136
-rw------- 1 cloudera-scm cloudera-scm 384 Jun 28 11:05 db.properties
-rw------- 1 cloudera-scm cloudera-scm 507 Oct 13 2014 db.properties.~1~
-rw------- 1 cloudera-scm cloudera-scm 507 Jul 2 2014 db.properties.20140702-103716
-rw------- 1 cloudera-scm cloudera-scm 507 Jul 2 2014 db.properties.20140702-104813
-rw------- 1 cloudera-scm cloudera-scm 507 Jul 2 2014 db.properties.20140702-105714
-rw------- 1 cloudera-scm cloudera-scm 507 Jun 7 2014 db.properties_bak
-rw------- 1 cloudera-scm cloudera-scm 289 Nov 26 2014 db.properties.rpmsave
-rw-r--r-- 1 root root 1424 Oct 13 2014 log4j.properties

 

ll -t /var/lib/cloudera-scm-server-db/data/
total 92
drwx------ 2 cloudera-scm cloudera-scm 4096 Jun 29 03:26 pg_log
drwx------ 3 cloudera-scm cloudera-scm 4096 Jun 28 10:02 pg_xlog
drwx------ 11 cloudera-scm cloudera-scm 4096 Jun 28 07:51 base
drwx------ 2 cloudera-scm cloudera-scm 4096 Jun 28 03:40 global
-rw------- 1 cloudera-scm cloudera-scm 62 Jun 28 03:40 postmaster.opts
-rw------- 1 cloudera-scm cloudera-scm 3789 Mar 22 08:36 pg_hba.conf
drwx------ 2 cloudera-scm cloudera-scm 4096 Oct 13 2015 pg_subtrans
-rw-r--r-- 1 cloudera-scm cloudera-scm 24 Nov 26 2014 scm.db.list
-rw-r--r-- 1 cloudera-scm cloudera-scm 18 Nov 26 2014 scm.db.list.20141126-040136
drwx------ 2 cloudera-scm cloudera-scm 4096 Nov 26 2014 pg_stat_tmp
-rw-r--r-- 1 cloudera-scm cloudera-scm 4 Jul 2 2014 scm.db.list.20140702-105714
-rw------- 1 cloudera-scm cloudera-scm 17050 Jul 2 2014 postgresql.conf
-rw------- 1 cloudera-scm cloudera-scm 264 Jul 2 2014 generated_password.txt
drwx------ 2 cloudera-scm cloudera-scm 4096 Jul 2 2014 pg_clog
-rw------- 1 cloudera-scm cloudera-scm 1631 Jul 2 2014 pg_ident.conf
drwx------ 4 cloudera-scm cloudera-scm 4096 Jul 2 2014 pg_multixact
drwx------ 2 cloudera-scm cloudera-scm 4096 Jul 2 2014 pg_tblspc
drwx------ 2 cloudera-scm cloudera-scm 4096 Jul 2 2014 pg_twophase
-rw------- 1 cloudera-scm cloudera-scm 4 Jul 2 2014 PG_VERSION


ll /var/log/cloudera-scm-server/
total 4716
-rw-r----- 1 cloudera-scm cloudera-scm 4817645 Jun 30 02:40 cloudera-scm-server.log
-rw-r--r-- 1 root root 117 Jun 29 03:20 cloudera-scm-server.out

 

The scm-server log can be downloaded from the URL below for the next 24 hours:

http://ftpsin.ymtech.info:8888/ftptmp/ae4698f4-24ac-4bb6-9d65-071069ad40d2.cloudera-scm-server.log

ll -t /var/lib/cloudera-scm-server-db/data/pg_log/
total 1756
-rw------- 1 cloudera-scm cloudera-scm 196975 Jun 28 10:27 postgresql-Tue.log
-rw------- 1 cloudera-scm cloudera-scm 258912 Jun 27 23:57 postgresql-Mon.log
-rw------- 1 cloudera-scm cloudera-scm 258808 Jun 26 23:55 postgresql-Sun.log
-rw------- 1 cloudera-scm cloudera-scm 258768 Jun 25 23:54 postgresql-Sat.log
-rw------- 1 cloudera-scm cloudera-scm 258827 Jun 24 23:52 postgresql-Fri.log
-rw------- 1 cloudera-scm cloudera-scm 257114 Jun 23 23:50 postgresql-Thu.log
-rw------- 1 cloudera-scm cloudera-scm 258912 Jun 22 23:58 postgresql-Wed.log

 

The PostgreSQL log can be downloaded from http://ftpsin.ymtech.info:8888/ftptmp/81201a87-4636-45e1-a79e-0fe310238e3e.postgresql-Tue.log

Thanks.

Re: HADOOP is REMOVED by cloudera when Embedded PostgreSQL Database is error !!!

Master Guru
Thanks for sharing these!

The pg_log logs seem to show only the last successful shutdown, aside from general query errors preceding it.

Could you check one of those years-old db.properties files for the PostgreSQL DB credentials, then start the DB with 'service cloudera-scm-server-db start' and try to connect to it?

I imagine you will need a command line like the following, substituting the values found in the older db.properties files:

psql -h localhost -p 7432 -U scm

If your PostgreSQL does not come up, however, you can additionally look at the entries in /var/log/cloudera-scm-server/db.log. Do you see anything in there that explains why the PostgreSQL daemon didn't start for you earlier?

(P.s. embedded DB's initial root password can be found under /var/lib/cloudera-scm-server-db/data/generated_password.txt, if you need to be root)
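
To skim db.log quickly, a small filter like the one below may be enough. It is only a sketch and assumes the entries use PostgreSQL's usual severity labels (ERROR, FATAL, PANIC) followed by a colon. If db.log uses a different format, the pattern will need adjusting. The function name severe_lines is made up for this example.

```python
import re
from pathlib import Path

# Assumption: entries carry PostgreSQL-style severity labels such as "FATAL:".
SEVERE = re.compile(r"\b(ERROR|FATAL|PANIC):")


def severe_lines(lines):
    """Return the lines that carry an ERROR, FATAL, or PANIC severity label."""
    return [ln.rstrip("\n") for ln in lines if SEVERE.search(ln)]


if __name__ == "__main__":
    log_path = Path("/var/log/cloudera-scm-server/db.log")
    if log_path.exists():
        with log_path.open(errors="replace") as fh:
            # Show only the most recent matches.
            for ln in severe_lines(fh)[-20:]:
                print(ln)
    else:
        print(f"{log_path} not found")
```

FATAL entries close to the time of the reboot would be the first place to look for the reason the daemon did not start.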