Created 03-26-2018 03:41 AM
I've updated the JDK from 1.8_131 to 1.8_151 for CDH5, so I needed to restart the cluster for the change to take effect. At first I used the Cloudera Manager web page to restart it, but the restart failed while ZooKeeper was starting, which is the first step. Then I made a bad choice: I shut down Cloudera Manager from the terminal, including running kill -9 on the postgresql process. After that, I couldn't open the Cloudera Manager web page.
I used the following commands to start the cluster.
service cloudera-scm-server-db start
service cloudera-scm-server start
service cloudera-scm-agent start
All of them failed, because `/var/log/cloudera-scm-server` and `/var/log/cloudera-scm-agent` had disappeared.
So I created these two manually, and also created `dg.log` and `cloudera-scm-agent.log`.
After that, the `server` and `agent` could start, but `server-db` still could not. Some details follow.
Starting cloudera-scm-server-db (via systemctl): Job for cloudera-scm-server-db.service failed because the control process exited with error code.
See "systemctl status cloudera-scm-server-db.service" and "journalctl -xe" for details
journalctl -xe
The CM is using external DB. Failed to start embedded DB service, giving up
So, what should I do now? Thank you very much!
Created 03-26-2018 11:22 PM
Created 03-26-2018 07:05 PM
Created 03-26-2018 11:11 PM
Yes, I read the log file and changed
/etc/cloudera-scm-server/db.properties
Before the change, it had the line
com.cloudera.cmf.db.setupType=External
I deleted it, and now server-db runs again. cloudera-scm-server and the agent work well too.
But the web page on port 7180 still can't be reached.
This is the error in cloudera-scm-server.out:
JAVA_HOME=/usr/java/jdk1.8.0_151
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cloudera-scm-server.log (Permission denied)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
    at com.cloudera.server.cmf.Main.<clinit>(Main.java:158)
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cmf-server-perf.log (Permission denied)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
    at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:639)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:504)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
    at com.cloudera.server.cmf.Main.<clinit>(Main.java:158)
Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.cloudera.server.cmf.TrialState': Cannot resolve reference to bean 'entityManagerFactoryBean' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'entityManagerFactoryBean': FactoryBean threw exception on object creation; nested exception is org.hibernate.exception.GenericJDBCException: Could not open connection
    at org.springframework.beans.factory.support.BeanDefinitionValueResolver.resolveReference(BeanDefinitionValueResolver.java:328)
    at org.springframework.beans.factory.support.BeanDefinitionValueResolver.resolveValueIfNecessary(BeanDefinitionValueResolver.java:106)
    at org.springframework.beans.factory.support.ConstructorResolver.resolveConstructorArguments(ConstructorResolver.java:616)
    at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:148)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1003)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:907)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:485)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:456)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:293)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:290)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:192)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:585)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:895)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:425)
    at com.cloudera.server.cmf.Main.bootstrapSpringContext(Main.java:392)
    at com.cloudera.server.cmf.Main.<init>(Main.java:242)
    at com.cloudera.server.cmf.Main.main(Main.java:216)
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'entityManagerFactoryBean': FactoryBean threw exception on object creation; nested exception is org.hibernate.exception.GenericJDBCException: Could not open connection
    at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:149)
    at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.getObjectFromFactoryBean(FactoryBeanRegistrySupport.java:102)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getObjectForBeanInstance(AbstractBeanFactory.java:1440)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:247)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:192)
    at org.springframework.beans.factory.support.BeanDefinitionValueResolver.resolveReference(BeanDefinitionValueResolver.java:322)
    ... 17 more
Caused by: org.hibernate.exception.GenericJDBCException: Could not open connection
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:221)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:157)
    at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.coordinateWork(JdbcCoordinatorImpl.java:282)
    at org.hibernate.internal.SessionImpl.doWork(SessionImpl.java:2000)
    at org.hibernate.internal.SessionImpl.doWork(SessionImpl.java:1986)
    at com.cloudera.enterprise.dbutil.DbUtil.getSchemaVersion(DbUtil.java:192)
    at com.cloudera.server.cmf.bootstrap.EntityManagerFactoryBean.checkVersionDoFail(EntityManagerFactoryBean.java:269)
    at com.cloudera.server.cmf.bootstrap.EntityManagerFactoryBean.getObject(EntityManagerFactoryBean.java:127)
    at com.cloudera.server.cmf.bootstrap.EntityManagerFactoryBean.getObject(EntityManagerFactoryBean.java:65)
    at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:142)
    ... 22 more
Caused by: java.sql.SQLException: Connections could not be acquired from the underlying database!
    at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:106)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:529)
    at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
    at org.hibernate.service.jdbc.connections.internal.C3P0ConnectionProvider.getConnection(C3P0ConnectionProvider.java:84)
    at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:292)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:214)
    ... 31 more
Caused by: com.mchange.v2.resourcepool.CannotAcquireResourceException: A ResourcePool could not acquire a resource from its primary factory or source.
    at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1319)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:557)
    at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
    ... 35 more
I found that the file cloudera-scm-server-db did not exist, so I created one manually. cloudera-scm-server.log did not exist either, so I created it manually too.
So now all the services are running, except that the web page can't be accessed.
Port 7180 is not listening when I check with netstat.
SELinux is disabled now.
pg_ctl: server is running (PID: 16867)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data" "-k" "/var/run/cloudera-scm-server/"
Checking jexec status
netconsole module not loaded
Configured devices:
lo enp0s25 enp9s0
Currently active devices:
lo enp0s25 enp9s0 virbr0
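The port check mentioned above can be scripted. This is only a sketch: `has_listener` is a hypothetical helper name, and it assumes the local address is the fourth column, as it is in `netstat -ltn` and `ss -ltn` output.

```shell
# Hypothetical helper: read `netstat -ltn` or `ss -ltn` output on stdin and
# succeed if some local address (4th column) ends in :PORT.
has_listener() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Usage with the Cloudera Manager web port from this thread:
# netstat -ltn | has_listener 7180 && echo "7180 is listening"
```

If nothing is listening on 7180, the Cloudera Manager server most likely exited during startup, so cloudera-scm-server.log is the next place to look.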
Created 03-26-2018 11:22 PM
Created 03-27-2018 02:05 AM
Thank you for your help! I recreated those files as the user "cloudera-scm". Now the log can be written.
Here is the error information in cloudera-scm-server.log:
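For anyone hitting the same `Permission denied` errors, here is a minimal sketch of recreating a log directory so that the Cloudera Manager user can write to it. The helper name `recreate_log` is hypothetical, and the sketch assumes the `cloudera-scm` account has a group of the same name, which may not hold on every install.

```shell
# Hypothetical helper: recreate a log directory and an empty log file,
# then hand both to OWNER when running as root.
recreate_log() {
    dir=$1
    file=$2
    owner=$3
    mkdir -p "$dir" || return 1
    touch "$dir/$file" || return 1
    # chown needs root and an existing account; skip it otherwise.
    # Assumption: the account has a group with the same name.
    if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
        chown -R "$owner:$owner" "$dir"
    fi
}

# Usage (as root), with the paths from this thread:
# recreate_log /var/log/cloudera-scm-server cloudera-scm-server.log cloudera-scm
```

Files created by root and left owned by root produce exactly the `Permission denied` seen in cloudera-scm-server.out, which is why the ownership step matters.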
2018-03-27 16:21:49,681 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@31d74216 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception:
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "scm"
    at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:291)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
    at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
    at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
    at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
    at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
    at org.postgresql.Driver.makeConnection(Driver.java:393)
    at org.postgresql.Driver.connect(Driver.java:267)
    at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
    at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
    at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
    at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
    at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
    at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
    at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
So I checked db.properties:
[root@dbs01 cloudera-scm-server]# cat /etc/cloudera-scm-server/db.properties
# Auto-generated by scm_prepare_database.sh on Tue Mar 27 11:30:37 CST 2018
#
# For information describing how to configure the Cloudera Manager Server
# to connect to databases, see the "Cloudera Manager Installation Guide."
#
com.cloudera.cmf.db.type=postgresql
com.cloudera.cmf.db.host=localhost:7432
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=scm_password
com.cloudera.cmf.db.setupType=EMBEDDED
I changed db.password to the password I got from generated_password.txt,
but I still get the same error.
I checked that port 7432 is listening.
So what should I do next? Thank you very much.
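One way to narrow down a `password authentication failed` error is to try the same credentials outside Cloudera Manager. The sketch below reads a value from db.properties and passes it to `psql`. `get_prop` is a hypothetical helper, and the host, port, user, and database name are taken from the db.properties shown above; it also assumes the `psql` client is installed on the host.

```shell
# Hypothetical helper: print the value of KEY from a Java-style properties FILE.
# Matches the key literally, so dots in the key are not treated as wildcards.
get_prop() {
    awk -v key="$2" 'index($0, key "=") == 1 { print substr($0, length(key) + 2) }' "$1"
}

# Usage, assuming psql is installed and the embedded database listens on 7432:
# PGPASSWORD=$(get_prop /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.password) \
#     psql -h localhost -p 7432 -U scm -d scm -c 'SELECT 1'
```

If `psql` is rejected with the same password, the mismatch is between db.properties and the database itself rather than anything in Cloudera Manager.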
Created 03-27-2018 03:22 AM
I changed scm's password by following https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_ig_embed_pstgrs.html#cmig_topic_5_...
and updated it in db.properties.
Now the web page works.
But HDFS on my datanode does not work.
[root@dbs02 ~]# service --status-all
● cloudera-scm-agent.service - LSB: Cloudera SCM Agent
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-agent; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2018-03-27 17:45:30 CST; 26min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 20467 ExecStop=/etc/rc.d/init.d/cloudera-scm-agent stop (code=exited, status=0/SUCCESS)
  Process: 20527 ExecStart=/etc/rc.d/init.d/cloudera-scm-agent start (code=exited, status=0/SUCCESS)
Mar 27 17:45:29 dbs02 systemd[1]: Starting LSB: Cloudera SCM Agent...
Mar 27 17:45:29 dbs02 su[20542]: (to root) root on none
Mar 27 17:45:30 dbs02 cloudera-scm-agent[20527]: Starting cloudera-scm-agent: [ OK ]
Mar 27 17:45:30 dbs02 systemd[1]: Started LSB: Cloudera SCM Agent.
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: active (exited) since Thu 2018-01-18 13:23:43 CST; 2 months 7 days ago
     Docs: man:systemd-sysv-generator(8)
Mar 04 16:08:49 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:15] Failed to add dependency on +postgresql.service, ignoring: Invalid argument
Mar 04 16:08:49 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:13] Failed to add dependency on +cloudera-scm-server-db.service, ignoring: Invalid argument
Mar 04 16:08:49 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:14] Failed to add dependency on +mysql.service, ignoring: Invalid argument
Mar 04 16:08:49 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:15] Failed to add dependency on +postgresql.service, ignoring: Invalid argument
Mar 04 16:12:23 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:13] Failed to add dependency on +cloudera-scm-server-db.service, ignoring: Invalid argument
Mar 04 16:12:23 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:14] Failed to add dependency on +mysql.service, ignoring: Invalid argument
Mar 04 16:12:23 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:15] Failed to add dependency on +postgresql.service, ignoring: Invalid argument
Mar 04 16:12:24 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:13] Failed to add dependency on +cloudera-scm-server-db.service, ignoring: Invalid argument
Mar 04 16:12:24 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:14] Failed to add dependency on +mysql.service, ignoring: Invalid argument
Mar 04 16:12:24 dbs02 systemd[1]: [/run/systemd/generator.late/cloudera-scm-server.service:15] Failed to add dependency on +postgresql.service, ignoring: Invalid argument
pg_ctl: server is running (PID: 1319)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data" "-k" "/var/run/cloudera-scm-server/"
Checking jexec status
netconsole module not loaded
Configured devices:
lo enp0s25 enp9s0
Currently active devices:
lo enp0s25 enp9s
"Bad gateway" is the error that the web page shows.
And now ZooKeeper has started running.
Created 03-27-2018 04:46 PM
I see you state "my hdfs in datanode could not work".
Please show us or explain what you tried to do, what you expected, and what actually happened.
What helped you conclude there was something wrong with HDFS?
Cheers,
Ben
Created 03-27-2018 06:14 PM
Yes, as I wrote in my previous replies, I updated my JDK version and restarted the cluster.
But the restart crashed, so on my master node "dbs01" I manually stopped the server, agent, and server-db services from the terminal using the "stop" command.
Then I wanted to restart the server, agent, and server-db services from the terminal, but cloudera-scm-server-db could not start because of "The CM is using external DB".
Fortunately, with your colleague's help I restarted the above three services on "dbs01" and regained access to the CM web page. During this period I did nothing on the HDFS datanodes, such as "dbs02".
Then I used the CM web page to restart the whole cluster. However, it failed when starting the HDFS datanode "dbs02". I looked up the status on dbs02; the information is shown in my previous reply. The error in the log is
HTTP ERROR 502 Problem accessing /cmf/process/195/logs. Reason: BAD_GATEWAY
while it is starting, and
HTTP ERROR 500 Problem accessing /cmf/role/46/logs. Reason: INTERNAL_SERVER_ERROR
when the start fails.
Those are the details. If you need more, please contact me. Thanks for your help!
Created 03-29-2018 05:50 AM
I've checked the agent's log file /var/log/cloudera-scm-agent.log (dbs02).
The error is No route to host.
It's a common error; I just needed to check the firewall and turn it off.
So my problem is now solved. Thanks to all of you!
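A quick way to confirm a `No route to host` diagnosis is to test the TCP connection from the agent host to the server. This is a minimal sketch, assuming bash and the coreutils `timeout` command are available. `can_connect` is a hypothetical helper, and port 7182 is an assumption for the port the agent uses to reach Cloudera Manager; the agent's own configuration has the actual value.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT opens within
# 3 seconds. Uses bash's /dev/tcp redirection, so it needs bash installed.
can_connect() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage from dbs02, with the master host name from this thread
# (7182 is an assumed port, see above):
# can_connect dbs01 7182 || echo "cannot reach dbs01:7182, check the firewall"
```

If the connection succeeds only after the firewall is stopped, opening the required ports is a narrower fix than leaving the firewall off entirely.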