
Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

New Contributor

Thrift server started as:

/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 20G --num-executors 20 --executor-cores 12 --hiveconf hive.server2.thrift.port=10001

1. Cached the data in Spark memory using CACHE TABLE bo_5years.

2. Ran SELECT * FROM bo_5years from beeline (see the sketch below).
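
For reference, a minimal sketch of how the two steps above can be run against the Thrift server from beeline; <thrift-host> is a placeholder for the host the Thrift server runs on, and the port matches the hive.server2.thrift.port set above. Both statements are issued in the same session so the cached table is reused:

beeline -u jdbc:hive2://<thrift-host>:10001
CACHE TABLE bo_5years;
SELECT * FROM bo_5years;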

Error in logs: 
15/12/04 16:03:54 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x5e66418a, /10.105.167.206:53903 => /10.105.164.205:60270] EXCEPTION: java.lang.OutOfMemoryError: Java heap 
space) 
java.lang.OutOfMemoryError: Java heap space 
        at java.lang.Object.clone(Native Method) 
        at akka.util.CompactByteString$.apply(ByteString.scala:410) 
        at akka.util.ByteString$.apply(ByteString.scala:22) 
        at akka.remote.transport.netty.TcpHandlers$class.onMessage(TcpSupport.scala:45) 
        at akka.remote.transport.netty.TcpServerHandler.onMessage(TcpSupport.scala:57) 
        at akka.remote.transport.netty.NettyServerHelpers$class.messageReceived(NettyHelpers.scala:43) 
        at akka.remote.transport.netty.ServerHandler.messageReceived(NettyTransport.scala:180) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) 
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) 
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) 
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
        at java.lang.Thread.run(Thread.java:745) 
15/12/04 16:03:56 ERROR ErrorMonitor: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-7] shutting down ActorSystem [sparkDriver] 
java.lang.OutOfMemoryError: Java heap space 
        at org.spark_project.protobuf.ByteString.copyFrom(ByteString.java:192) 
        at org.spark_project.protobuf.CodedInputStream.readBytes(CodedInputStream.java:324) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:3030) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:2980) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3073) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3068) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:993) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:927) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1049) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1044) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:241) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:175) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:279) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:274) 
        at org.spark_project.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:141) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:176) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:188) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:193) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.parseFrom(WireFormats.java:409) 
        at akka.remote.transport.AkkaPduProtobufCodec$.decodeMessage(AkkaPduCodec.scala:181) 
        at akka.remote.EndpointReader.akka$remote$EndpointReader$$tryDecodeMessageAndAck(Endpoint.scala:995) 
        at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:928) 
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465) 
        at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415) 
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) 
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
1 ACCEPTED SOLUTION

New Contributor

This got resolved by removing the --executor-cores argument passed while starting the Thrift server. Memory and the number of executors can be increased or decreased based on data volume.
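
For clarity, a sketch of the start command from the question with the --executor-cores flag dropped, as described; the memory and executor count are just the values used above and should be tuned to the data volume:

/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 20G --num-executors 20 --hiveconf hive.server2.thrift.port=10001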


4 REPLIES

Master Mentor

New Contributor

Thanks Neeraj for the post. We do cache data in the use case above, so a certain percentage would be needed for Spark persistence. Tried the executors with 40G memory as well and still ran into the issue.
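
For reference, in Spark 1.x the share of executor memory reserved for cached/persisted data is controlled by spark.storage.memoryFraction (0.6 by default), which can be passed through --conf when starting the Thrift server. The command below is only a sketch using the 40G figure mentioned above, not a tested configuration:

/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 40G --num-executors 20 --conf spark.storage.memoryFraction=0.6 --hiveconf hive.server2.thrift.port=10001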

Master Mentor

@Gagan Singh, are you still having this issue? Can you post your solution? Otherwise, please accept the answer to close out the thread.

New Contributor

This got resolved by removing the --executor-cores argument passed while starting the Thrift server. Memory and the number of executors can be increased or decreased based on data volume.