Hi All,
The Cloudera Manager UI doesn't start at all. I tried opening localhost:7180 and 127.0.0.1:7180; neither worked.
I am opening a new issue because the exception for cloudera-scm-server is different in my case and none of the solutions in the previous thread really matched.
Cloudera Manager worked until yesterday, and I applied no updates in between; today its UI does not start at all.
I don't understand why the issue started suddenly today. 😞
I checked other threads in this community and tried the following steps over several attempts, but cloudera-scm-server still fails to stay started:
1. Checked Cloudera-scm-server status:
[root@quickstart cloudera-scm-server]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
2. Removed the pid file:
rm /var/run/cloudera-scm-server.pid
3. Tried to stop cloudera-scm-server-db, but got an error:
[cloudera@quickstart ~]$ sudo service cloudera-scm-server-db stop
cloudera-scm-server-db: unrecognized service
4. One of the threads suggested that PostgreSQL might not be running, so I checked /etc/rc.d/init.d/ for postgresql.
But it's not there!
[root@quickstart ~]# service postgresql start
postgresql: unrecognized service
5. I checked /etc/hosts in case the loopback entry (127.0.0.1 localhost) was missing, but the entry is already there.
6. Checked whether port 7180 is listening with the command:
netstat -na | grep 7180
But I get no output here.
7. I checked /var/log/cloudera-scm-server/cloudera-scm-server.out for errors and found the exception below:
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Exception in thread "MainThread" javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: Could not open connection
    at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1387)
    at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310)
    at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1397)
    at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:62)
    at com.cloudera.enterprise.AbstractWrappedEntityManager.beginForRollbackAndReadonly(AbstractWrappedEntityManager.java:89)
    at com.cloudera.cmf.persist.CmfEntityManager.beginForRollbackAndReadonly(CmfEntityManager.java:347)
    at com.cloudera.server.cmf.Main.initializeCustomQueryCache(Main.java:705)
    at com.cloudera.server.cmf.Main.<init>(Main.java:307)
    at com.cloudera.server.cmf.Main.main(Main.java:216)
Caused by: org.hibernate.exception.JDBCConnectionException: Could not open connection
    at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:132)
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:221)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:157)
    at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:67)
    at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:160)
    at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1426)
    at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:59)
    ... 5 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 108,228 milliseconds ago. The last packet sent successfully to the server was 1 milliseconds ago.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1036)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3427)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3327)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3814)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2435)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2582)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2526)
    at com.mysql.jdbc.ConnectionImpl.setTransactionIsolation(ConnectionImpl.java:5133)
    at com.mchange.v2.c3p0.impl.NewProxyConnection.setTransactionIsolation(NewProxyConnection.java:701)
    at org.hibernate.service.jdbc.connections.internal.C3P0ConnectionProvider.getConnection(C3P0ConnectionProvider.java:86)
    at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:292)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:214)
    ... 10 more
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2914)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3337)
    ... 20 more
I do not understand the precise meaning of this exception or the root cause of the issue.
NOTE: Internet access is working on the VM.
Please help by suggesting a solution.
Created 07-02-2019 01:04 AM
Cloudera team, your help is badly needed on this issue; I am stuck because of it.
Please let me know if any details are needed.
Thanks in advance.
Created 07-02-2019 10:32 AM
Hi @manu_009 ,
From the log error snippet:
Exception in thread "MainThread" javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: Could not open connection
It looks like Cloudera Manager cannot reach its database. Is the timestamp of the log file /var/log/cloudera-scm-server/cloudera-scm-server.out up to date? If so, I suggest checking the database status:
1. Open the file /etc/cloudera-scm-server/db.properties
2. Examine the file and note the database details
3. Try to log in to the database using those credentials and see if the DB connection is working.
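A minimal sketch of steps 1 to 3 in shell, assuming the database type in db.properties is mysql and the mysql command-line client is installed. The helper names db_prop and test_cm_db_login are made up for illustration, and the com.cloudera.cmf.db.* key prefix is what this file is expected to use.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; adjust the path if your db.properties lives elsewhere.

# Print the value of one com.cloudera.cmf.db.<key> property.
# $1 = key suffix (host, name, user, password), $2 = path to db.properties
db_prop() {
  sed -n "s/^com\.cloudera\.cmf\.db\.$1=//p" "$2"
}

# Run a trivial query with the same credentials Cloudera Manager uses.
test_cm_db_login() {
  local props=${1:-/etc/cloudera-scm-server/db.properties}
  mysql -h "$(db_prop host "$props")" \
        -u "$(db_prop user "$props")" \
        -p"$(db_prop password "$props")" \
        -e 'SELECT 1;' "$(db_prop name "$props")"
}
```

Calling test_cm_db_login as root should return a single row if the login works. Passing the password with -p on the command line makes it visible to other users through ps, so this is probably only acceptable on a test VM.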
Thanks and hope this helps,
Li
Li Wang, Technical Solution Manager
Created 07-02-2019 11:06 AM
I have seen this issue when I updated a CM embedded postgresql Database without stopping all the services that were using that DB.
Try killing the pid. Sometimes once you kill it, another pid # replaces it as the dead pid. In those cases, I have rebooted the host. This fixed it for me. I only deal with testing clusters, not production, so this might not be your best course of action.
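Before killing anything, it may help to check whether the pid recorded in the pid file still belongs to a live process. A rough sketch, assuming the pid file is /var/run/cloudera-scm-server.pid as in the original post; pid_file_state is a hypothetical helper name, and it should be run as root so that kill -0 is allowed to probe the process.

```shell
#!/usr/bin/env bash
# Report whether a pid file is missing, points at a live process, or is stale.
# $1 = path to the pid file
pid_file_state() {
  local file=$1 pid
  [ -f "$file" ] || { echo missing; return; }
  pid=$(tr -d '[:space:]' < "$file")
  # Anything that is not a plain number cannot be a valid pid.
  case $pid in
    ''|*[!0-9]*) echo stale; return ;;
  esac
  # kill -0 sends no signal; it only checks that the process exists.
  if kill -0 "$pid" 2>/dev/null; then
    echo running
  else
    echo stale
  fi
}

# Example (as root):
#   pid_file_state /var/run/cloudera-scm-server.pid
```

A "stale" result matches the "dead but pid file exists" status: the file is left over and there is no process to kill.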
Tina
Created 07-02-2019 11:51 PM
Hi @lwang,
Thanks for your reply.
Yes, the timestamp of the log is up to date.
I re-checked db.properties. It contains the properties below:
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=localhost
com.cloudera.cmf.db.name=cm
com.cloudera.cmf.db.user=cm
com.cloudera.cmf.db.setupType=EXTERNAL
com.cloudera.cmf.db.password=cloudera
With the above credentials, mysql is able to connect. No issues there.
My default DB is MySQL; I have never installed PostgreSQL, nor have I upgraded the CDH setup.
I have configured a single-node cluster.
The last time Cloudera Manager worked, the services in use were:
HDFS, Hive, ZooKeeper, Spark, Impala.
What can the issue be then?
Please help!
Created 07-03-2019 02:37 PM
Hi @manu_009 ,
I would suggest these steps to troubleshoot it:
1) Make a backup of /etc/cloudera-scm-server/db.properties.
2) Run the command below to test the connection to the cm database:
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql cm cm
The above command will recreate the db.properties file.
3) See whether step 2 works. If not, please tell us what the error is.
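For step 1, a timestamped copy avoids overwriting an earlier backup if the steps are repeated. This is only a sketch; backup_file is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Copy a file to <file>.bak.<timestamp>, preserving mode and ownership,
# and print the path of the copy.
# $1 = file to back up
backup_file() {
  local src=$1
  local dst="$src.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$src" "$dst" && printf '%s\n' "$dst"
}

# Example (as root), followed by the connection test from step 2:
#   backup_file /etc/cloudera-scm-server/db.properties
#   /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cm cm
```

If step 2 leaves db.properties in a bad state, the printed backup path can be copied back over it.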
Thanks,
Li
Li Wang, Technical Solution Manager
Created 07-04-2019 06:57 AM
Hi @lwang,
I am petrified and fearful now. When I look into the location
/opt/cloudera/
there is no "cm" directory there!
What I find there are the directories below:
[root@quickstart cloudera]# pwd
/opt/cloudera
[root@quickstart cloudera]# ls
csd parcel-cache parcel-repo parcels
[root@quickstart cloudera]#
What could be the real issue behind this?
Please help!
Created 07-10-2019 07:44 AM
Hi @manu_009 ,
Sorry for my late reply; I was out of the office for a few days.
It looks like you are using an older CM version (5.x), where the scm_prepare_database.sh script resides in a different location.
Please run this command instead:
/usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm
The documentation to reference is here:
https://www.cloudera.com/documentation/enterprise/5-16-x/topics/prepare_cm_database.html
Thanks,
Li
Li Wang, Technical Solution Manager
Created 07-14-2019 03:16 AM
Hey @lwang,
Thanks for the reply.
I ran the command as below:
[root@quickstart ~]# /usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
log4j:ERROR Could not find value for key log4j.appender.A
log4j:ERROR Could not instantiate appender named "A".
[2019-07-14 15:30:28,099] INFO 0[main] - com.cloudera.enterprise.dbutil.DbCommandExecutor.testDbConnection(DbCommandExecutor.java) - Successfully connected to database.
All done, your SCM database is configured correctly!
[root@quickstart ~]#
[root@quickstart ~]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
But my problem is still there after this!
I have no way to understand what causes this issue 😞
Please help!
Created 07-10-2019 08:43 AM
Hello @manu_009 ,
Before performing any steps, let's step back and look at the situation you presented:
(1)
You had a working cluster for a long time, changed nothing, and all of a sudden you are faced with a situation where Cloudera Manager won't start.
(2)
The stack trace you provided (without context) indicates a problem where the MySQL driver is not receiving a response from the MySQL server:
The last packet successfully received from the server was 108,228 milliseconds ago. The last packet sent successfully to the server was 1 milliseconds ago.
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2914)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3337)
... 20 more
I imagine there is more to this wrapped exception, but the above suffices in confirming that CM is sending requests out through the connection to the MySQL database, but it is not receiving replies as expected.
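When a trace is long or has been flattened into a single line, the nested causes are hard to pick out. A rough sketch that prints only the "Caused by:" chain; cause_chain is a made-up helper name, and it assumes each stack frame looks like " at package.Class.method(...)".

```shell
#!/usr/bin/env bash
# Print each "Caused by:" link of a Java stack trace on its own line.
# Reads the trace from stdin and drops the stack frames that follow each cause.
cause_chain() {
  awk '{
    # Split the line at every "Caused by: "; parts[1] is whatever came before.
    n = split($0, parts, "Caused by: ")
    for (i = 2; i <= n; i++) {
      msg = parts[i]
      # Cut the message at its first stack frame, if one is on the same line.
      sub(/ at [A-Za-z_$][A-Za-z0-9_.$<>]*\(.*/, "", msg)
      print "Caused by: " msg
    }
  }'
}

# Example:
#   cause_chain < /var/log/cloudera-scm-server/cloudera-scm-server.out
```

The last line printed is the innermost cause, which is usually the most specific one (here, the EOFException from the MySQL driver).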
(3)
- From this, we know that TCP is OK in the sense that the client (CM) can open a connection with the server (mysql).
- Since you are not using postgres, do not worry about trying the cloudera-scm-server-db thing (that's only for the embedded db and you are not using it... the result you got is expected as the package was correctly not installed)
NEXT STEPS:
(1)
Verify startup problem in /var/log/cloudera-scm-server/cloudera-scm-server.log
While cloudera-scm-server.out may have good information, most of the server logging is in cloudera-scm-server.log and it will have some context as well.
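One way to pull the most recent problems out of that log is sketched below. It assumes the log level appears as an uppercase word such as ERROR surrounded by spaces, which is a guess at the log layout rather than something confirmed in this thread; recent_errors is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Show the last few ERROR/FATAL lines and "Caused by:" lines of a log,
# prefixed with their line numbers.
# $1 = log file, $2 = number of matching lines to show (default 20)
recent_errors() {
  grep -n -E ' (ERROR|FATAL) |Caused by:' "$1" | tail -n "${2:-20}"
}

# Example:
#   recent_errors /var/log/cloudera-scm-server/cloudera-scm-server.log 40
```

The line numbers make it easier to open the log at the right place and read the surrounding context.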
(2)
Since the evidence presented points to there being a problem outside of CM and the MySQL driver connection, it stands to reason we focus our efforts there. Since the Server is MySQL, we should check to see if there are any connections open from CM to MySQL and if there are queries that are running on MySQL as the "cm" user on the "cm" db.
I recommend:
- Run:
service cloudera-scm-server stop
- Verify that no processes are returned (other than the grep command itself) when running: ps aux |grep cmf.Main
- Log into the MySQL server and run: SHOW FULL PROCESSLIST;
This will show you a list of running queries.
Check whether any of them belong to "cm".
- If there are any, they should not be there, since CM is stopped. Stale or hung queries can cause the type of behavior you are seeing, so we can kill off any queries being run by the cm user on the cm db.
- Run this to kill any leftover queries: KILL <ID>
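The check-and-kill steps could be scripted roughly as follows. This assumes a MySQL account that can see other users' connections (root here) and reads information_schema.PROCESSLIST, which should list the same connections as SHOW FULL PROCESSLIST. kill_statements is a hypothetical helper that only prints the statements so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Turn a list of MySQL connection IDs (one per line on stdin)
# into KILL statements; non-numeric lines are ignored.
kill_statements() {
  awk '/^[0-9]+$/ { print "KILL " $1 ";" }'
}

# Example, with Cloudera Manager stopped:
#   mysql -u root -p -N -B \
#     -e "SELECT ID FROM information_schema.PROCESSLIST WHERE USER='cm' AND DB='cm'" \
#     | kill_statements
# Review the output, then paste the statements into a mysql session.
```

Printing instead of executing is deliberate: on a shared MySQL server it is worth confirming that every listed connection really is a leftover from CM.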
(3)
If there are no "hung" or long-running CM queries, and CM still does not start (with the same exceptions), then we may need to look at the connection over the wire with tcpdump/WireShark.
I'm hoping, though, that the issue you are seeing is related to a situation with some "hung" queries. You mentioned you could connect to MySQL outside of CM, so that supports the theory that this is a CM-specific DB problem.