Expert Contributor
Posts: 162
Registered: ‎07-29-2013

cloudera-scm-server 4.8.1. consumes 200% CPU, there are thread locks. How to debug it?

Hi, we have been using Cloudera for more than 2 years. Now we are in trouble.

cloudera-scm-server runs on a dedicated VM. It has 2 CPUs and 8 GB RAM. It has enough disk space.
It's impossible to access the web UI. A restart doesn't help.
 
 
Here is a sample from jstack:
Deadlock Detection:
 
Can't print deadlocks:null
Thread 4188: (state = BLOCKED)
 
Locked ownable synchronizers:
    - None
 
Thread 4161: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run() @bci=34, line=534 (Interpreted frame)
 
Locked ownable synchronizers:
    - None
 
Thread 4160: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run() @bci=34, line=534 (Interpreted frame)
 
Locked ownable synchronizers:
    - None
 
Thread 4159: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run() @bci=34, line=534 (Interpreted frame)
 
Locked ownable synchronizers:
    - None
 
Thread 3810: (state = BLOCKED)
 - com.cloudera.server.cmf.descriptor.DescriptorGeneratingCache.getDescriptorAndHashString(com.cloudera.cmf.persist.CmfEntityManager, com.cloudera.cmf.service.ServiceHandlerRegistry) @bci=0, line=75 (Interpreted frame)
 - com.cloudera.server.web.cmf.DescriptorController.getScmDescriptorJson(java.lang.String) @bci=47, line=67 (Interpreted frame)
 - sun.reflect.GeneratedMethodAccessor482.invoke(java.lang.Object, java.lang.Object[]) @bci=40 (Interpreted frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=25 (Compiled frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=161, line=597 (Compiled frame)
 - org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(java.lang.reflect.Method, java.lang.Object, org.springframework.web.context.request.NativeWebRequest, org.springframework.ui.ExtendedModelMap) @bci=331, line=176 (Interpreted frame)
 - org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse, java.lang.Object) @bci=57, line=436 (Interpreted frame)
 - org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse, java.lang.Object) @bci=143, line=424 (Interpreted frame)
 - org.springframework.web.servlet.DispatcherServlet.doDispatch(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse) @bci=279, line=790 (Interpreted frame)
 - org.springframework.web.servlet.DispatcherServlet.doService(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse) @bci=231, line=719 (Interpreted frame)
 - org.springframework.web.servlet.FrameworkServlet.processRequest(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse) @bci=111, line=669 (Interpreted frame)
 - org.springframework.web.servlet.FrameworkServlet.doGet(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse) @bci=3, line=574 (Interpreted frame)
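
A dump like the one above is easier to read once threads are grouped by state and top stack frame. The following is a minimal sketch, assuming the dump uses the same "Thread N: (state = ...)" layout shown here; the function name `summarize_jstack` and the `<no frames>` label are made up for illustration and are not part of any Cloudera or JDK tool.

```python
import re
from collections import Counter

# Matches header lines such as "Thread 4161: (state = BLOCKED)".
THREAD_HEADER = re.compile(r"^Thread (\d+): \(state = (\w+)\)")


def summarize_jstack(text):
    """Count threads by (state, top stack frame) in a dump laid out as above."""
    counts = Counter()
    current_state = None  # state of a thread whose top frame has not been seen yet
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = THREAD_HEADER.match(line)
        if header:
            if current_state is not None:
                counts[(current_state, "<no frames>")] += 1
            current_state = header.group(2)
        elif line.startswith("Locked ownable synchronizers"):
            # The frame list is over; a thread with no frames ends up here.
            if current_state is not None:
                counts[(current_state, "<no frames>")] += 1
                current_state = None
        elif current_state is not None and line.startswith("- "):
            # The first frame after the header is the top of the stack.
            frame = line[2:].split("(")[0]
            counts[(current_state, frame)] += 1
            current_state = None
    if current_state is not None:
        counts[(current_state, "<no frames>")] += 1
    return counts


# Hypothetical usage, with the dump saved to a file named jstack.txt:
#   with open("jstack.txt") as f:
#       for (state, frame), n in summarize_jstack(f.read()).most_common(10):
#           print(n, state, frame)
```

Many threads sharing one top frame (for example the `DescriptorGeneratingCache.getDescriptorAndHashString` frame above) would point at the lock or resource they are all waiting on.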
 
 
Expert Contributor
Posts: 162
Registered: ‎07-29-2013

Re: cloudera-scm-server 4.8.1. consumes 200% CPU, there are thread locks. How to debug it?

Here is the cloudera-scm-server log:

2014-06-08 04:14:22,144 WARN [1182853375@scm-web-81:spi.SqlExceptionHelper@143] SQL Error: 0, SQLState: null
2014-06-08 04:14:40,873 ERROR [1182853375@scm-web-81:spi.SqlExceptionHelper@144] An attempt by a client to checkout a Connection has timed out.
2014-06-08 04:14:40,873 ERROR [1182853375@scm-web-81:impl.ManagerDaoBase@97] Unable to determine Cloudera Manager URL
javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Could not open connection
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1377)
at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1300)
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1387)
at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:62)
at com.cloudera.enterprise.AbstractWrappedEntityManager.beginForRollbackAndReadonly(AbstractWrappedEntityManager.java:85)
at com.cloudera.cmf.persist.CmfEntityManager.beginForRollbackAndReadonly(CmfEntityManager.java:332)
at com.cloudera.cmf.persist.DatabaseExecutor.execTaskWithLocalCmfEntityManager(DatabaseExecutor.java:37)
at com.cloudera.cmf.persist.DatabaseExecutor.execTask(DatabaseExecutor.java:60)
at com.cloudera.cmf.persist.DatabaseExecutor.execReadonlyTask(DatabaseExecutor.java:79)
at com.cloudera.api.dao.impl.ManagerDaoBase.fetchCmURL(ManagerDaoBase.java:89)
at com.cloudera.api.dao.impl.ManagerDaoBase.initialize(ManagerDaoBase.java:109)
at com.cloudera.api.dao.impl.ManagerDaoBase.createProxy(ManagerDaoBase.java:120)
at com.cloudera.api.dao.impl.ScmDAOFactory.newRoleConfigGroupManager(ScmDAOFactory.java:157)
at com.cloudera.api.v3.impl.RoleConfigGroupsResourceImpl.readRoleConfigGroups(RoleConfigGroupsResourceImpl.java:54)


... 91 more
Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@24a09e41 -- timeout at awaitAvailable()
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1317)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:557)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
... 95 more
2014-06-08 04:16:06,446 WARN [1182853375@scm-web-81:api.ApiExceptionMapper@141] Unexpected exception.
java.lang.NullPointerException
at com.cloudera.api.dao.impl.ManagerDaoBase.initialize(ManagerDaoBase.java:110)
at com.cloudera.api.dao.impl.ManagerDaoBase.createProxy(ManagerDaoBase.java:120)
at com.cloudera.api.dao.impl.ScmDAOFactory.newRoleConfigGroupManager(ScmDAOFactory.java:157)
at com.cloudera.api.v3.impl.RoleConfigGroupsResourceImpl.readRoleConfigGroups(RoleConfigGroupsResourceImpl.java:54)
at sun.reflect.GeneratedMethodAccessor720.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)

Cloudera Employee
Posts: 508
Registered: ‎07-30-2013

Re: cloudera-scm-server 4.8.1. consumes 200% CPU, there are thread locks. How to debug it?

The CM server log is complaining about your database connection. Is your database up and running?
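
One quick way to check this from the Cloudera Manager host is a plain TCP probe. This is only a sketch: `can_connect` is a hypothetical helper, and the host and port in the usage comment are placeholders for whatever your CM database actually uses.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage; replace the placeholders with your database host and port:
#   print(can_connect("db-host.example.com", 5432))
```

Note that a successful probe only shows that the database is listening. The log above reports a timeout while checking out a connection from the c3p0 pool, which can also happen when the database is reachable but every pooled connection is busy.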

Expert Contributor
Posts: 162
Registered: ‎07-29-2013

Re: cloudera-scm-server 4.8.1. consumes 200% CPU, there are thread locks. How to debug it?

Yes, of course. I can add roles to any service. I get this deadlock problem when I try to delete a role from the HDFS1 service. There is no such problem with other services. I can easily reproduce this issue by trying to delete the HDFS1 service gateway from any host.
