Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

New Contributor

Thrift server started as: /usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 20G --num-executors 20 --executor-cores 12 --hiveconf hive.server2.thrift.port=10001

1. Cached the table in Spark memory with CACHE TABLE bo_5years.

2. Ran SELECT * FROM bo_5years from beeline.
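For anyone trying to reproduce this, the two steps above can be sketched as a single beeline call. This is only a sketch: the host in the JDBC URL is a placeholder I made up (the port is the one from the start command), and it assumes beeline is on the PATH and no authentication is required.

```shell
# Hypothetical host; port 10001 comes from hive.server2.thrift.port above.
JDBC_URL="jdbc:hive2://localhost:10001"

# The two statements from steps 1 and 2.
CACHE_SQL="CACHE TABLE bo_5years"
SELECT_SQL="SELECT * FROM bo_5years"

if command -v beeline >/dev/null 2>&1; then
  # Run both statements in one session; -e may be given more than once.
  beeline -u "$JDBC_URL" -e "$CACHE_SQL" -e "$SELECT_SQL"
else
  # No beeline on this machine: just show what would be run.
  echo "beeline -u $JDBC_URL -e \"$CACHE_SQL\" -e \"$SELECT_SQL\""
fi
```

Note that an unrestricted SELECT * returns every row to the client through the Thrift server, so it is a heavy query on a large table.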

Error in logs: 
15/12/04 16:03:54 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x5e66418a, /10.105.167.206:53903 => /10.105.164.205:60270] EXCEPTION: java.lang.OutOfMemoryError: Java heap space)
java.lang.OutOfMemoryError: Java heap space 
        at java.lang.Object.clone(Native Method) 
        at akka.util.CompactByteString$.apply(ByteString.scala:410) 
        at akka.util.ByteString$.apply(ByteString.scala:22) 
        at akka.remote.transport.netty.TcpHandlers$class.onMessage(TcpSupport.scala:45) 
        at akka.remote.transport.netty.TcpServerHandler.onMessage(TcpSupport.scala:57) 
        at akka.remote.transport.netty.NettyServerHelpers$class.messageReceived(NettyHelpers.scala:43) 
        at akka.remote.transport.netty.ServerHandler.messageReceived(NettyTransport.scala:180) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) 
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) 
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) 
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
        at java.lang.Thread.run(Thread.java:745) 
15/12/04 16:03:56 ERROR ErrorMonitor: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-7] shutting down ActorSystem [sparkDriver] 
java.lang.OutOfMemoryError: Java heap space 
        at org.spark_project.protobuf.ByteString.copyFrom(ByteString.java:192) 
        at org.spark_project.protobuf.CodedInputStream.readBytes(CodedInputStream.java:324) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:3030) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:2980) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3073) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3068) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:993) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:927) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1049) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1044) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:241) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:175) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:279) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:274) 
        at org.spark_project.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:141) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:176) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:188) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:193) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.parseFrom(WireFormats.java:409) 
        at akka.remote.transport.AkkaPduProtobufCodec$.decodeMessage(AkkaPduCodec.scala:181) 
        at akka.remote.EndpointReader.akka$remote$EndpointReader$$tryDecodeMessageAndAck(Endpoint.scala:995) 
        at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:928) 
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465) 
        at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415) 
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) 
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
1 ACCEPTED SOLUTION

New Contributor

This was resolved by removing the --executor-cores argument that was passed when starting the Thrift server. Executor memory and the number of executors can be increased or decreased based on data volume.
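As a rough sketch, the start command from the question with only --executor-cores dropped would look like the following. The memory and executor values are simply the ones from the original post, not recommendations, and the variable names are mine.

```shell
THRIFT_SCRIPT=/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh

# Values from the original post; tune these to the data volume.
EXECUTOR_MEMORY=20G
NUM_EXECUTORS=20

# Same arguments as before, without --executor-cores.
START_ARGS="--master yarn-client --executor-memory $EXECUTOR_MEMORY --num-executors $NUM_EXECUTORS --hiveconf hive.server2.thrift.port=10001"

if [ -x "$THRIFT_SCRIPT" ]; then
  # Intentionally unquoted so the arguments are split into separate words.
  "$THRIFT_SCRIPT" $START_ARGS
else
  # Script not present on this machine: print the command instead.
  echo "$THRIFT_SCRIPT $START_ARGS"
fi
```

With the flag omitted, each executor falls back to the default core count for the deployment mode instead of the 12 requested originally.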

4 REPLIES

New Contributor

Thanks for the post, Neeraj. We do cache data in the use case above, so a certain percentage of memory is needed for Spark persistence. We also tried executors with 40G of memory and ran into the same issue.

Mentor

@Gagan Singh are you still having this issue? Can you post your solution? Otherwise please accept the answer to close out the thread.
