Need Spark Thrift Server design documentation: STS hangs about 2 hours after starting

Explorer

Hi all,

I am running Spark Thrift Server (STS) on YARN in client mode with 50 executor nodes. At first I set -Xmx25g for the driver, and STS ran for about 30 minutes and then hung. I increased it to -Xmx40g, and STS ran for about 1 hour before hanging. With -Xmx56g, STS ran for about 2 hours and then hung again. I cannot keep increasing the JVM heap. In all cases I did not see any out-of-memory exception in the log file. It seems that whenever I increase the driver's JVM heap, STS uses most of it. I dumped the JVM heap and saw that the SparkSession objects are the biggest objects (one of them is about 10 GB, the others are about 4-6 GB). I don't understand why the SparkSession objects are so large. Please help with the following:

1) Are there any suggestions to help me resolve this issue?

2) Is there anywhere I can read more about how STS works? I need a document like https://cwiki.apache.org/confluence/display/Hive/Design#Design-HiveArchitecture to understand how STS processes a query from a client, because driver memory seems to keep increasing as more people connect to STS and run queries.

3) How should I size the memory to configure for my driver in STS?

Thank you very much,

14 REPLIES

Super Collaborator

@anobi Did you try setting spark.sql.thriftServer.incrementalCollect to true?

I am not running multiple queries at a time, so maybe that is why I'm not seeing this. Try decreasing the number of simultaneous sessions after setting incrementalCollect to true.
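For reference, here is a minimal sketch of how that setting could be passed when launching STS on YARN in client mode. It only builds and prints the command line; it does not start anything. The script path (relative to SPARK_HOME) and the memory values are placeholders, and spark.driver.maxResultSize is an extra setting not mentioned above: it caps the total size of serialized results per Spark action, so it may help bound what a single query pulls into the driver.

```python
import shlex


def build_sts_command(driver_memory="8g", max_result_size="2g",
                      incremental_collect=True):
    """Build the argument list for launching Spark Thrift Server.

    start-thriftserver.sh accepts the usual spark-submit options,
    so settings are passed with --conf key=value.
    The default sizes here are placeholders, not recommendations.
    """
    conf = {
        # Stream result rows to the client instead of collecting them all at once.
        "spark.sql.thriftServer.incrementalCollect": str(incremental_collect).lower(),
        # Assumption: capping per-action result size is wanted in this setup.
        "spark.driver.maxResultSize": max_result_size,
    }
    cmd = [
        "sbin/start-thriftserver.sh",  # placeholder path relative to SPARK_HOME
        "--master", "yarn",
        "--deploy-mode", "client",
        "--driver-memory", driver_memory,
    ]
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection.
    print(shlex.join(build_sts_command(driver_memory="25g")))
```

Passing the driver size through --driver-memory keeps the heap setting in one place next to the other options.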

Explorer

Yes @tsharma, I already set it. About 20 people use STS to query data every day. Do you know how to restrict the number of simultaneous sessions, or the maximum memory used per query?

Super Collaborator

Good to hear that, anobi. I could not find a way to restrict the number of sessions to a particular value.

However, you can set spark.sql.hive.thriftServer.singleSession to true. All JDBC/ODBC connections then share a single session, which doesn't scale very well.

Please run spark.conf.getAll(); you may find other properties related to the number of sessions.
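A small sketch of how the property list could be narrowed down once it has been retrieved. The helper name and the sample pairs are made up for illustration; with PySpark available, the pairs could come from spark.sparkContext.getConf().getAll(), which returns only explicitly set properties, so defaults will not show up.

```python
def session_related(pairs, keywords=("thrift", "session")):
    """Keep the (key, value) pairs whose key mentions any keyword.

    Matching is case-insensitive; the result is sorted by key.
    """
    return sorted(
        (key, value)
        for key, value in pairs
        if any(word in key.lower() for word in keywords)
    )


# Illustrative sample only. With PySpark, something like this could be used:
#   pairs = spark.sparkContext.getConf().getAll()
sample = [
    ("spark.sql.hive.thriftServer.singleSession", "false"),
    ("spark.sql.thriftServer.incrementalCollect", "true"),
    ("spark.executor.instances", "50"),
]

for key, value in session_related(sample):
    print(f"{key} = {value}")
```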

Also, please accept or upvote any answers that helped you.

Thank You

Explorer

Thank you very much for your response, @tsharma. I do not use Ambari for STS. I will follow your suggestions.

Explorer

Hi @tsharma,

Thank you very much for your support. I changed the memory to -Xmx64g and it seems to have resolved my issue. My STS has been running for about 27 hours now. I will keep monitoring to see whether the problem is resolved permanently. Previously I set -Xmx25g, then 40g, then 56g, but STS ran for a while and then hung. I still do not know how to calculate the memory needed for STS. I have about 20 simultaneous users.
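On the sizing question, one rough way to reason about it is a back-of-envelope estimate built from what a heap dump shows. This is not a formula from the Spark documentation. It assumes the driver heap is dominated by the retained size of the live sessions, plus a fixed base for the driver itself and some headroom for garbage collection. The function name, the base of 4 GB, the 25% headroom, and the sample sizes are all assumptions for illustration.

```python
def estimate_driver_heap_gb(retained_gb, base_gb=4.0, headroom=0.25):
    """Rough driver heap estimate in GB.

    retained_gb: retained sizes (GB) of the session objects seen in a heap dump.
    base_gb:     assumed fixed footprint of the driver itself.
    headroom:    assumed fraction added on top as a GC/safety margin.
    """
    if base_gb < 0 or headroom < 0 or any(size < 0 for size in retained_gb):
        raise ValueError("sizes and headroom must be non-negative")
    return (base_gb + sum(retained_gb)) * (1 + headroom)


# Hypothetical dump: one session around 10 GB and three between 4 and 6 GB.
observed = [10, 6, 5, 4]
print(f"{estimate_driver_heap_gb(observed):.2f} GB")
```

Repeating the heap dump at peak usage and feeding in the sizes actually observed would give a more realistic number than any fixed per-user figure.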