Created 01-10-2025 12:05 AM
Facing issue while
2025-01-10 10:59:40,491 INFO avro-servlet-hb-processor-0:com.cloudera.server.common.AgentAvroServlet: (1 skipped) AgentAvroServlet: heartbeat processing stats: average=1310ms, min=17ms, max=73273ms.
2025-01-10 10:59:40,499 ERROR agentServer-57:com.cloudera.server.common.AgentAvroServlet: Error processing Avro request
org.eclipse.jetty.io.EofException
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:810)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:528)
at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:915)
at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:987)
at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:284)
at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:268)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:833)
at com.cloudera.enterprise.SafeAvroHttpTransceiver.writeBuffers(SafeAvroHttpTransceiver.java:117)
at com.cloudera.server.common.HttpConnectorServer$FunctionsImpl.write(HttpConnectorServer.java:123)
at com.cloudera.server.common.HttpConnectorServer$FunctionsImpl.write(HttpConnectorServer.java:110)
at com.cloudera.server.common.AgentAvroServlet.doPost(AgentAvroServlet.java:92)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:665)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at com.cloudera.server.common.MovingStats$1.get(MovingStats.java:32)
at com.cloudera.server.common.MovingStats$1.get(MovingStats.java:29)
at com.cloudera.server.common.MovingStats.measure(MovingStats.java:41)
at com.cloudera.server.common.MovingStats.measure(MovingStats.java:29)
at com.cloudera.server.common.MonitoringThreadPool$RunnableImpl.run(MonitoringThreadPool.java:135)
at com.cloudera.server.common.BoundedQueuedThreadPool$2.run(BoundedQueuedThreadPool.java:94)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
at sun.nio.ch.IOUtil.write(IOUtil.java:148)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504)
at java.nio.channels.SocketChannel.write(SocketChannel.java:502)
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:273)
... 48 more
2025-01-10 10:59:40,501 WARN agentServer-57:org.eclipse.jetty.server.HttpChannel: /
java.lang.IllegalStateException: ABORTED
at org.eclipse.jetty.server.HttpChannelState.sendError(HttpChannelState.java:896)
at org.eclipse.jetty.server.Response.sendError(Response.java:471)
at com.cloudera.server.common.AgentAvroServlet.logAndSuppressException(AgentAvroServlet.java:113)
at com.cloudera.server.common.AgentAvroServlet.doPost(AgentAvroServlet.java:94)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:665)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at com.cloudera.server.common.MovingStats$1.get(MovingStats.java:32)
at com.cloudera.server.common.MovingStats$1.get(MovingStats.java:29)
at com.cloudera.server.common.MovingStats.measure(MovingStats.java:41)
at com.cloudera.server.common.MovingStats.measure(MovingStats.java:29)
at com.cloudera.server.common.MonitoringThreadPool$RunnableImpl.run(MonitoringThreadPool.java:135)
at com.cloudera.server.common.BoundedQueuedThreadPool$2.run(BoundedQueuedThreadPool.java:94)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
at java.lang.Thread.run(Thread.java:748)
2025-01-10 10:59:43,039 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333691
2025-01-10 10:59:50,077 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333691
2025-01-10 10:59:54,001 WARN CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Aborting command 1546333758 (ZkInit) because timeout value 90 seconds exceeded.
2025-01-10 10:59:54,002 INFO CommandPusher-1:com.cloudera.cmf.service.AbstractOneOffRoleCommand: Aborting 'ZkInit' command (1546333758) on service DbService{id=1546333542, name=zookeeper} role DbRole{id=1546333582, name=zookeeper-SERVER-5786924a54d072890728fdea10be12e0, hostName=prw-cld-trial.localdomain}.
2025-01-10 10:59:54,002 ERROR CommandPusher-1:com.cloudera.cmf.model.DbCommand: Command 1546333758(ZkInit) has completed. finalstate:CANCELLED, success:false, msg:Aborted command
2025-01-10 10:59:54,002 INFO CommandPusher-1:com.cloudera.cmf.command.components.CommandStorage: Invoked delete temp files for command:DbCommand{id=1546333758, name=ZkInit, role=zookeeper-SERVER-5786924a54d072890728fdea10be12e0} at dir:/var/lib/cloudera-scm-server/temp/commands/1546333758
2025-01-10 10:59:55,615 WARN CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Aborting command 1546333760 (Format) because timeout value 90 seconds exceeded.
2025-01-10 10:59:55,616 INFO CommandPusher-1:com.cloudera.cmf.service.AbstractOneOffRoleCommand: Aborting 'Format' command (1546333760) on service DbService{id=1546333556, name=hdfs} role DbRole{id=1546333607, name=hdfs-NAMENODE-5786924a54d072890728fdea10be12e0, hostName=prw-cld-trial.localdomain}.
2025-01-10 10:59:55,616 ERROR CommandPusher-1:com.cloudera.cmf.model.DbCommand: Command 1546333760(Format) has completed. finalstate:CANCELLED, success:false, msg:Aborted command
2025-01-10 10:59:55,616 INFO CommandPusher-1:com.cloudera.cmf.command.components.CommandStorage: Invoked delete temp files for command:DbCommand{id=1546333760, name=Format, role=hdfs-NAMENODE-5786924a54d072890728fdea10be12e0} at dir:/var/lib/cloudera-scm-server/temp/commands/1546333760
2025-01-10 10:59:56,575 ERROR CommandPusher-1:com.cloudera.cmf.model.DbCommand: Command 1546333691(First Run) has completed. finalstate:FINISHED, success:false, msg:Completed only 2/4 steps. First failure: Completed only 0/1 steps. First failure: Command (Initialize (1546333758)) has failed
2025-01-10 10:59:56,577 INFO CommandPusher-1:com.cloudera.cmf.command.components.CommandStorage: Invoked delete temp files for command:DbCommand{id=1546333691, name=First Run} at dir:/var/lib/cloudera-scm-server/temp/commands/1546333691
2025-01-10 11:00:03,075 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1818ms: no GCs detected.
2025-01-10 11:00:04,677 INFO com.cloudera.cmf.scheduler-1_Worker-1:com.cloudera.cmf.service.ServiceHandlerRegistry: Executing Global command GlobalPoolsRefresh BasicCmdArgs{scheduleId=1, scheduledTime=2025-01-10T08:00:00.000Z}.
2025-01-10 11:00:05,799 INFO com.cloudera.cmf.scheduler-1_Worker-1:com.cloudera.cmf.scheduler.CommandDispatcherJob: Skipping scheduled command 'GlobalPoolsRefresh' since it is a noop.
2025-01-10 11:00:38,311 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (64 skipped) Cleaned up
2025-01-10 11:00:39,534 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (64 skipped) Synced up
2025-01-10 11:02:13,589 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 88421ms: GC pool 'ParNew' had collection(s): count=1 time=0ms, GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=88347ms
2025-01-10 11:02:13,937 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (2 skipped) Synced up
2025-01-10 11:02:13,946 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (3 skipped) Cleaned up
2025-01-10 11:02:14,807 INFO avro-servlet-hb-processor-0:com.cloudera.server.common.AgentAvroServlet: (4 skipped) AgentAvroServlet: heartbeat processing stats: average=1824ms, min=17ms, max=92124ms.
2025-01-10 11:02:16,753 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1646ms: GC pool 'ParNew' had collection(s): count=2 time=817ms
2025-01-10 11:02:27,708 INFO scm-web-371:com.cloudera.server.cmf.descriptor.DescriptorFragmentsCache: Not able to acquire lock for 10000 milliseconds when getting fragment configDefaults.
2025-01-10 11:03:00,658 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 26866ms: GC pool 'ParNew' had collection(s): count=1 time=0ms, GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=26804ms
The services are HDFS, YARN, and ZooKeeper, plus the Cloudera configurations.
Created 01-13-2025 03:08 AM
Hello @Riyadbank,
Thank you for reaching out to the community.
1.) First, consider increasing the heap memory for the CM server in the /etc/default/cloudera-scm-server file, then restart cloudera-scm-server. Your log shows long garbage-collection pauses, for example:
2025-01-10 11:03:00,658 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 26866ms: GC pool 'ParNew' had collection(s): count=1 time=0ms, GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=26804ms
2.) Regarding the actual issue: are the hosts heartbeating fine? Do you see any errors in the stderr logs for those individual services?
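To judge how often the CM server stalls, the JvmPauseMonitor lines in the server log can be summarized with a small script. The following is only a rough sketch in Python: the function names `parse_pauses` and `long_pauses` and the 10-second threshold are made up for illustration, and the regular expressions assume the exact line format seen in the log posted above. The sample lines are copied from that log.

```python
import re

# Assumed format: "<date> <time>,<ms> <LEVEL> JvmPauseMonitor:... paused approximately <N>ms: <detail>"
PAUSE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \w+ JvmPauseMonitor:"
    r".*?paused approximately (?P<ms>\d+)ms: (?P<detail>.*)$"
)
# Per-collector detail, e.g. "GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=88347ms"
GC_RE = re.compile(
    r"GC pool '(?P<pool>[^']+)' had collection\(s\): count=\d+ time=(?P<ms>\d+)ms"
)


def parse_pauses(lines):
    """Return one dict per JvmPauseMonitor line: timestamp, pause length, GC time per pool."""
    pauses = []
    for line in lines:
        m = PAUSE_RE.match(line.strip())
        if not m:
            continue  # not a pause line (stack trace, other logger, etc.)
        gc_ms = {g.group("pool"): int(g.group("ms")) for g in GC_RE.finditer(m.group("detail"))}
        pauses.append({"timestamp": m.group("ts"), "pause_ms": int(m.group("ms")), "gc_ms": gc_ms})
    return pauses


def long_pauses(pauses, threshold_ms=10000):
    """Keep only pauses at or above the (hypothetical) threshold."""
    return [p for p in pauses if p["pause_ms"] >= threshold_ms]


# Sample lines taken from the log in this thread.
SAMPLE = [
    "2025-01-10 11:00:03,075 INFO JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): "
    "paused approximately 1818ms: no GCs detected.",
    "2025-01-10 11:02:13,589 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): "
    "paused approximately 88421ms: GC pool 'ParNew' had collection(s): count=1 time=0ms, "
    "GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=88347ms",
    "2025-01-10 11:03:00,658 WARN JvmPauseMonitor:com.cloudera.enterprise.debug.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): "
    "paused approximately 26866ms: GC pool 'ParNew' had collection(s): count=1 time=0ms, "
    "GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=26804ms",
]

if __name__ == "__main__":
    for p in long_pauses(parse_pauses(SAMPLE)):
        print(p["timestamp"], p["pause_ms"], "ms", p["gc_ms"])
```

Pauses dominated by ConcurrentMarkSweep time, like the two long ones here, are consistent with heap pressure on the CM server. Note that the longest pause is close to the 90-second timeout after which the ZkInit and Format commands were aborted.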