
Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline


New Contributor

The Thrift server was started as:

/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 20G --num-executors 20 --executor-cores 12 --hiveconf hive.server2.thrift.port=10001
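
(For scale, that command asks YARN for 20 executors × 20G = 400G of executor heap and 20 × 12 = 240 cores, before per-executor memory overhead.)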

1. Cached data in Spark memory using CACHE TABLE bo_5years

2. Ran SELECT * FROM bo_5years from beeline (sketched below)
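
In beeline terms, the two steps look roughly like this (a sketch; the Thrift Server host name is hypothetical, the port matches the start command above):

    # connect to the Spark Thrift Server over JDBC (host is hypothetical)
    beeline -u "jdbc:hive2://thrift-server-host:10001"

    -- pull the full table into executor memory, then scan it back through the server
    CACHE TABLE bo_5years;
    SELECT * FROM bo_5years;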

Error in logs: 
15/12/04 16:03:54 WARN DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x5e66418a, /10.105.167.206:53903 => /10.105.164.205:60270] EXCEPTION: java.lang.OutOfMemoryError: Java heap space) 
java.lang.OutOfMemoryError: Java heap space 
        at java.lang.Object.clone(Native Method) 
        at akka.util.CompactByteString$.apply(ByteString.scala:410) 
        at akka.util.ByteString$.apply(ByteString.scala:22) 
        at akka.remote.transport.netty.TcpHandlers$class.onMessage(TcpSupport.scala:45) 
        at akka.remote.transport.netty.TcpServerHandler.onMessage(TcpSupport.scala:57) 
        at akka.remote.transport.netty.NettyServerHelpers$class.messageReceived(NettyHelpers.scala:43) 
        at akka.remote.transport.netty.ServerHandler.messageReceived(NettyTransport.scala:180) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) 
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) 
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) 
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) 
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) 
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) 
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
        at java.lang.Thread.run(Thread.java:745) 
15/12/04 16:03:56 ERROR ErrorMonitor: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-7] shutting down ActorSystem [sparkDriver] 
java.lang.OutOfMemoryError: Java heap space 
        at org.spark_project.protobuf.ByteString.copyFrom(ByteString.java:192) 
        at org.spark_project.protobuf.CodedInputStream.readBytes(CodedInputStream.java:324) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:3030) 
        at akka.remote.WireFormats$SerializedMessage.<init>(WireFormats.java:2980) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3073) 
        at akka.remote.WireFormats$SerializedMessage$1.parsePartialFrom(WireFormats.java:3068) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:993) 
        at akka.remote.WireFormats$RemoteEnvelope.<init>(WireFormats.java:927) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1049) 
        at akka.remote.WireFormats$RemoteEnvelope$1.parsePartialFrom(WireFormats.java:1044) 
        at org.spark_project.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:241) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.<init>(WireFormats.java:175) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:279) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer$1.parsePartialFrom(WireFormats.java:274) 
        at org.spark_project.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:141) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:176) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:188) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:193) 
        at org.spark_project.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) 
        at akka.remote.WireFormats$AckAndEnvelopeContainer.parseFrom(WireFormats.java:409) 
        at akka.remote.transport.AkkaPduProtobufCodec$.decodeMessage(AkkaPduCodec.scala:181) 
        at akka.remote.EndpointReader.akka$remote$EndpointReader$$tryDecodeMessageAndAck(Endpoint.scala:995) 
        at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:928) 
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465) 
        at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415) 
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) 
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
1 ACCEPTED SOLUTION

Re: Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

New Contributor

This was resolved by removing the --executor-cores argument passed when starting the Thrift server. Memory and the number of executors can be increased or decreased based on data volume.
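
For illustration, the revised start command is just the original minus that flag (memory and executor count kept at the original values; tune them for your data volume):

    /usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh --master yarn-client --executor-memory 20G --num-executors 20 --hiveconf hive.server2.thrift.port=10001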

4 REPLIES

Re: Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

Re: Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

New Contributor

Thanks, Neeraj, for the post. We do cache data in the use case above, so a certain percentage of memory would be needed for Spark persistence. We tried executors with 40G memory as well and still ran into the issue.
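
If the cache share of executor memory needs tuning, one knob to consider (my assumption, not something established in this thread) is the legacy Spark 1.x setting spark.storage.memoryFraction, passed at startup:

    # fraction of each executor's heap reserved for cached tables
    # (legacy Spark 1.x memory manager; default is 0.6, 0.7 here is illustrative)
    --conf spark.storage.memoryFraction=0.7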

Re: Spark Thrift Server goes down/runs into OOM after some time when running large jobs from Tableau or beeline

Mentor

@Gagan Singh, are you still having this issue? Can you post your solution? Otherwise, please accept the answer to close out the thread.
