Cloudera Manager UI doesn't start, cloudera-scm-server dead and pid file exists

Explorer

Hi All,

 

The Cloudera Manager UI doesn't start at all. I tried opening localhost:7180 and 127.0.0.1:7180; neither worked.

I am opening a new issue because the exception for cloudera-scm-server is different in my case, and none of the solutions in the previous threads matched.

Cloudera Manager worked until yesterday, and I applied no updates in between. Today its UI does not start at all.

I don't understand why the issue started suddenly today. 😞

I checked other threads in this community and tried the following steps over several attempts, but cloudera-scm-server still fails to stay up:

 

1. Checked Cloudera-scm-server status:

[root@quickstart cloudera-scm-server]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists

2. Removed the pid file:

 rm /var/run/cloudera-scm-server.pid

3. Tried to stop cloudera-scm-server-db, but that failed:

 

[cloudera@quickstart ~]$ sudo service cloudera-scm-server-db stop
cloudera-scm-server-db: unrecognized service

4. One of the threads suggested that postgresql might not be running, so I checked /etc/rc.d/init.d/ for postgresql.

  But it's not there!

  [root@quickstart ~]# service postgresql start
   postgresql: unrecognized service

 

5. I checked /etc/hosts in case the loopback entry (127.0.0.1 localhost) was missing, but the entry is already there.

6. Checked whether anything is listening on port 7180 with:

    netstat -na | grep 7180

    But I get no output here.

7. I checked /var/log/cloudera-scm-server/cloudera-scm-server.out for errors and found the exception below:

 

JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Exception in thread "MainThread" javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: Could not open connection
	at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1387)
	at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310)
	at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1397)
	at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:62)
	at com.cloudera.enterprise.AbstractWrappedEntityManager.beginForRollbackAndReadonly(AbstractWrappedEntityManager.java:89)
	at com.cloudera.cmf.persist.CmfEntityManager.beginForRollbackAndReadonly(CmfEntityManager.java:347)
	at com.cloudera.server.cmf.Main.initializeCustomQueryCache(Main.java:705)
	at com.cloudera.server.cmf.Main.<init>(Main.java:307)
	at com.cloudera.server.cmf.Main.main(Main.java:216)
Caused by: org.hibernate.exception.JDBCConnectionException: Could not open connection
	at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:132)
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
	at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:221)
	at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:157)
	at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:67)
	at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:160)
	at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1426)
	at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:59)
	... 5 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 108,228 milliseconds ago.  The last packet sent successfully to the server was 1 milliseconds ago.
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
	at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1036)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3427)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3327)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3814)
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2435)
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2582)
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2526)
	at com.mysql.jdbc.ConnectionImpl.setTransactionIsolation(ConnectionImpl.java:5133)
	at com.mchange.v2.c3p0.impl.NewProxyConnection.setTransactionIsolation(NewProxyConnection.java:701)
	at org.hibernate.service.jdbc.connections.internal.C3P0ConnectionProvider.getConnection(C3P0ConnectionProvider.java:86)
	at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:292)
	at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:214)
	... 10 more
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
	at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2914)
	at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3337)
	... 20 more

I do not understand the precise meaning of this exception or the root cause of the issue.

NOTE: Internet access is working on the VM.

Please help suggest a solution.

 

 

 

 

 

 


Explorer

Cloudera team, I badly need your help with this issue; I am stuck because of it.

Please let me know if any details are needed.

Thanks in advance.

Guru

Hi @manu_009 ,

 

From the log error snippet:

Exception in thread "MainThread" javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: Could not open connection

It looks like Cloudera Manager is unable to reach its database. Is the timestamp of the log file /var/log/cloudera-scm-server/cloudera-scm-server.out up to date? If so, I suggest checking the database status.

 

1. Open the file /etc/cloudera-scm-server/db.properties

2. Examine the file and note the database details (type, host, database name, user, password)

3. Try to log in to the database with those credentials and see whether the connection works.
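
The three steps above can be sketched as a small shell helper. This is only an illustration, not a Cloudera tool: the function names get_prop and mysql_command_from_props are made up for this example, and it assumes db.properties holds plain key=value lines with the com.cloudera.cmf.db.* keys and a MySQL backend. It prints the mysql client command rather than running it. The --protocol=TCP option is included because the mysql client normally uses the local Unix socket for "localhost", while CM's JDBC driver connects over TCP.

```shell
# Print the value of one key from a simple key=value properties file.
# Comment lines are skipped; if the key appears twice, the last value wins.
get_prop() {
    awk -F= -v k="$2" '
        /^[ \t]*#/ { next }
        $1 == k    { sub(/^[^=]*=/, ""); val = $0 }
        END        { print val }
    ' "$1"
}

# Build (but do not run) a mysql client command from db.properties.
# Assumption: the host entry is a bare host name with no ":port" suffix.
mysql_command_from_props() {
    local file="${1:-/etc/cloudera-scm-server/db.properties}"
    local host user name
    host=$(get_prop "$file" com.cloudera.cmf.db.host)
    user=$(get_prop "$file" com.cloudera.cmf.db.user)
    name=$(get_prop "$file" com.cloudera.cmf.db.name)
    # A bare -p makes the client prompt for the password;
    # the last argument is the database name.
    printf 'mysql --protocol=TCP -h %s -u %s -p %s\n' "$host" "$user" "$name"
}
```

Calling mysql_command_from_props with no argument reads /etc/cloudera-scm-server/db.properties; copy the printed command into a shell and enter the password from the same file when prompted.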

 

Thanks and hope this helps,

Li

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

Super Collaborator

I have seen this issue when I updated a CM embedded postgresql Database without stopping all the services that were using that DB.

 

Try killing the pid.  Sometimes once you kill it, another pid # replaces it as the dead pid.   In those cases, I have rebooted the host.  This fixed it for me.   I only deal with testing clusters, not production, so this might not be your best course of action.

 

Tina

Explorer

Hi @lwang,

Thanks for your reply.

Yes, the log timestamp is up to date.

I re-checked db.properties. It contains the properties below:

 

com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=localhost
com.cloudera.cmf.db.name=cm
com.cloudera.cmf.db.user=cm
com.cloudera.cmf.db.setupType=EXTERNAL
com.cloudera.cmf.db.password=cloudera

 

With the above credentials, the mysql client is able to connect. No issues there.

My default db is mysql; I have never installed postgresql or upgraded the CDH setup.

I have configured a single-node cluster.

When Cloudera Manager last worked, the services in use were:

HDFS, Hive, Zookeeper, Spark, Impala.

 

What can be the issue then?

Please help!

Guru

 

Hi @manu_009 ,

 

 

I suggest these steps to troubleshoot it:

1) Make a backup of /etc/cloudera-scm-server/db.properties

2) Run the command below to test the connection to the cm database:

 

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql cm cm

https://www.cloudera.com/documentation/enterprise/6/6.2/topics/prepare_cm_database.html#cmig_topic_5...

 

Note that the above command recreates the db.properties file.

3) See whether step 2 works. If not, please tell us the error.

 

Thanks,

Li

Li Wang, Technical Solution Manager



Explorer

Hi @lwang,

 

I am petrified now. When I look in:

/opt/cloudera/

there is no "cm" directory!

These are the directories I find there:

 

[root@quickstart cloudera]# pwd
/opt/cloudera
[root@quickstart cloudera]# ls
csd    parcel-cache      parcel-repo    parcels
[root@quickstart cloudera]#

 

What could be the real issue behind this?

Please help!

Guru

Hi @manu_009 ,

 

Sorry for my late reply; I was out of the office for a few days.

It looks like you are using an older CM version (5.x), where the scm_prepare_database.sh script resides in a different location.

 

Please run this command instead:

 

/usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm

The documentation for reference is here:

https://www.cloudera.com/documentation/enterprise/5-16-x/topics/prepare_cm_database.html

 

Thanks,

Li

Li Wang, Technical Solution Manager



Explorer

Hey @lwang,

 

Thanks for the reply.

I ran the command as below:

 

[root@quickstart ~]# /usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm
Enter SCM password: 
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
log4j:ERROR Could not find value for key log4j.appender.A
log4j:ERROR Could not instantiate appender named "A".
[2019-07-14 15:30:28,099] INFO     0[main] - com.cloudera.enterprise.dbutil.DbCommandExecutor.testDbConnection(DbCommandExecutor.java) - Successfully connected to database.
All done, your SCM database is configured correctly!
[root@quickstart ~]#
[root@quickstart ~]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists

But my problem is still there after this!

I cannot work out what is causing this issue 😞

Please help!

Master Guru

Hello @manu_009 ,

 

Before performing any steps, let's step back and look at the situation you presented:

 

(1)

 

You had a working cluster for a long time, changed nothing, and all of a sudden you are faced with a situation where Cloudera Manager won't start.

 

(2)

 

The stack trace you provided (without context) indicates a problem where the MySQL driver is not receiving a response from the MySQL server:

 

The last packet successfully received from the server was 108,228 milliseconds ago. The last packet sent successfully to the server was 1 milliseconds ago.

 

Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2914)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3337)
... 20 more

 

I imagine there is more to this wrapped exception, but the above suffices in confirming that CM is sending requests out through the connection to the MySQL database, but it is not receiving replies as expected.

 

(3)

 

- From this, we know that TCP is OK in the sense that the client (CM) can open a connection with the server (mysql).

- Since you are not using postgres, do not worry about trying the cloudera-scm-server-db thing (that's only for the embedded db and you are not using it... the result you got is expected as the package was correctly not installed)

 

NEXT STEPS:

 

(1)

 

Verify the startup problem in /var/log/cloudera-scm-server/cloudera-scm-server.log

 

While cloudera-scm-server.out may have good information, most of the server logging is in cloudera-scm-server.log and it will have some context as well.
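
As a rough sketch of how to pull the relevant lines out of that log: the helper below prints the most recent lines that mention ERROR, an exception, or a "Caused by:" cause, each prefixed with its line number so the surrounding context can be looked up. The function name recent_errors is made up for this example, and the search pattern is an assumption about what the log contains, not a documented CM log format.

```shell
# Print the last N lines of a log that mention an error or exception,
# each prefixed with its line number in the file.
recent_errors() {
    local log="${1:-/var/log/cloudera-scm-server/cloudera-scm-server.log}"
    local count="${2:-20}"
    grep -nE 'ERROR|Exception|Caused by:' "$log" | tail -n "$count"
}
```

For example, recent_errors "" 50 is not valid because the first argument must be a path or omitted entirely; call recent_errors with no arguments for the default log, or recent_errors /path/to/log 50 for another file and a larger window.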

 

(2)

 

Since the evidence presented points to there being a problem outside of CM and the MySQL driver connection, it stands to reason we focus our efforts there.  Since the Server is MySQL, we should check to see if there are any connections open from CM to MySQL and if there are queries that are running on MySQL as the "cm" user on the "cm" db.

 

I recommend:

- Stop CM:

 service cloudera-scm-server stop

 Then verify that no processes are returned when running: ps aux | grep '[c]mf.Main' (the brackets keep grep from matching its own process)

- Log into the mysql server and run: SHOW FULL PROCESSLIST;

 This will show you a list of open connections and running queries

 Check if there are any "cm" ones

- There should not be any, since CM is stopped. Stale or hung queries can cause the type of behavior you are seeing, so kill off any queries still being run by the cm user on the cm db.

- Run this for each leftover query: KILL <ID>;
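
The processlist check can also be scripted. The sketch below is an illustration only: the function name kill_statements_for is made up, and it assumes the tab-separated column order that SHOW FULL PROCESSLIST prints in the mysql client's batch mode (Id, User, Host, db, Command, Time, State, Info). It only prints KILL statements for review and never runs them.

```shell
# Read tab-separated SHOW FULL PROCESSLIST rows on stdin and print one
# KILL statement for every row owned by the given user on the given db.
# Column order assumed: Id, User, Host, db, Command, Time, State, Info.
kill_statements_for() {
    awk -F'\t' -v u="$1" -v d="$2" '$2 == u && $4 == d { print "KILL " $1 ";" }'
}

# Usage, run as a MySQL user other than cm (root here is an assumption)
# so the session doing the listing is not matched itself:
#   mysql -u root -p -B -N -e 'SHOW FULL PROCESSLIST' | kill_statements_for cm cm
```

Here -B selects tab-separated batch output and -N drops the header row. Review the printed statements and run only the ones that really belong to leftover CM connections.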

 

(3)

 

If there are no "hung" or long-running CM queries, and CM still does not start (with the same exceptions), then we may need to look at the connection over the wire with tcpdump/Wireshark.

 

I'm hoping, though, that the issue you are seeing is related to a situation with some "hung" queries.  You mentioned you could connect to MySQL outside of CM, so that supports the theory that this is a CM-specific DB problem.