Member since: 11-12-2017
Posts: 8
Kudos Received: 1
Solutions: 0
11-17-2017
04:40 AM
Hi @tsharma, Thank you very much for your support. I changed the memory to -Xmx=64g and it seems to have resolved my issue. My STS has been running for about 27 hours now. I will keep monitoring to see if the problem is permanently resolved. I previously set -Xmx=25G, then 40G, then 56G, but STS would run for a while and then hang. I still do not know how to calculate the memory needed for STS. I have about 20 simultaneous users.
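For reference, this is roughly how I start STS with the new heap setting (a sketch only; the script path and executor count are from my setup and may differ for others):

    ./sbin/start-thriftserver.sh \
      --master yarn \
      --driver-memory 64g \
      --num-executors 50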
11-17-2017
04:34 AM
1 Kudo
Yes @tsharma, I already set it. I have about 20 people using STS to query data every day. Do you know how to restrict the number of simultaneous sessions or the maximum memory used per query?
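One thing I plan to try, but have not verified for STS, is the HiveServer2 Thrift worker-thread limit in the hive-site.xml that STS reads, which should cap concurrent connections (the value below is just an example); I have not found any per-query memory cap yet:

    <property>
      <name>hive.server2.thrift.max.worker.threads</name>
      <value>20</value>
    </property>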
11-16-2017
11:44 AM
Thank you very much for your response @tsharma. I do not use Ambari for STS. I will follow your suggestions.
11-16-2017
11:41 AM
Thank you very much for your response @tsharma. I do not use HDP for my STS. I will follow your suggestion. I am wondering how you calculated the memory needed for your cluster. Do you have any guidelines, please? As you can see in my log message above, I already set the memory to 48G but it seems to use all of it, and if I increase it, it uses all of the memory again ([Eden: 0.0B(2432.0M)->0.0B(2432.0M) Survivors: 0.0B->0.0B Heap: 47.5G(48.0G)->47.5G(48.0G)]) Thanks,
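My reading of that line, assuming the usual before -> after (committed size in parentheses) notation of the G1 log, is:

    Eden: 0.0B(2432.0M) -> 0.0B(2432.0M)   (the young generation is empty, nothing to evacuate)
    Heap: 47.5G(48.0G)  -> 47.5G(48.0G)    (occupancy stays at 47.5G of the 48G cap, so the collection reclaims almost nothing)

so increasing -Xmx only seems to delay the point where the heap fills up again.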
11-16-2017
02:10 AM
Hi all and @tsharma, I didn't see an OOM exception in the STS log file. However, when I added "-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintTenuringDistribution", I saw this message in the GC log file: "G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded" (please see the detailed message below). It seems the memory is not enough, but when I increase -Xmx, STS only works a little longer and then hangs again. Back to my previous questions:
1) What is kept in driver memory? Why is it so large (48G), and larger still if I increase -Xmx? As @tsharma said, STS is only a gateway. I am using client mode (not cluster mode).
2) How can I size the memory that needs to be configured for my driver in STS?
3) I need a document like https://cwiki.apache.org/confluence/display/Hive/Design#Design-HiveArchitecture to understand how STS processes queries from clients, because it seems that driver memory keeps increasing as more people connect to and query STS.
Thank you,
My GC log:

    2017-11-16T08:32:23.876+0700: 46776.282: [GC pause (G1 Evacuation Pause) (young)
    Desired survivor size 167772160 bytes, new threshold 15 (max 15)
     46776.282: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 0, predicted base time: 13.63 ms, remaining time: 186.37 ms, target pause time: 200.00 ms]
     46776.282: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 ms]
     46776.282: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 0 regions, survivors: 0 regions, old: 0 regions, predicted pause time: 13.63 ms, target pause time: 200.00 ms]
     46776.289: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 97.41 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
    , 0.0069342 secs]
       [Parallel Time: 3.6 ms, GC Workers: 33]
          [GC Worker Start (ms): Min: 46776282.1, Avg: 46776282.4, Max: 46776282.7, Diff: 0.6]
          [Ext Root Scanning (ms): Min: 1.6, Avg: 2.0, Max: 3.2, Diff: 1.7, Sum: 64.4]
          [SATB Filtering (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
             [Processed Buffers: Min: 0, Avg: 0.0, Max: 1, Diff: 1, Sum: 1]
          [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Object Copy (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 1.3]
          [Termination (ms): Min: 0.0, Avg: 1.0, Max: 1.1, Diff: 1.1, Sum: 34.4]
          [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4]
          [GC Worker Total (ms): Min: 2.7, Avg: 3.0, Max: 3.3, Diff: 0.6, Sum: 100.5]
          [GC Worker End (ms): Min: 46776285.4, Avg: 46776285.4, Max: 46776285.4, Diff: 0.1]
       [Code Root Fixup: 0.7 ms]
       [Code Root Purge: 0.0 ms]
       [Clear CT: 0.6 ms]
       [Other: 2.0 ms]
          [Choose CSet: 0.0 ms]
          [Ref Proc: 1.1 ms]
          [Ref Enq: 0.0 ms]
          [Redirty Cards: 0.6 ms]
          [Humongous Reclaim: 0.0 ms]
          [Free CSet: 0.0 ms]
       [Eden: 0.0B(2432.0M)->0.0B(2432.0M) Survivors: 0.0B->0.0B Heap: 47.5G(48.0G)->47.5G(48.0G)]
     [Times: user=0.10 sys=0.00, real=0.01 secs]
     46776.290: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 32 bytes]
     46776.290: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432 bytes, attempted expansion amount: 33554432 bytes]
     46776.290: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
    2017-11-16T08:32:23.884+0700: 46776.290: [Full GC (Allocation Failure)
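In case it helps anyone else, this is roughly where I added those flags (a sketch only; the conf file location and GC log path are just examples from my setup):

    # spark-defaults.conf read by spark-submit / start-thriftserver.sh
    spark.driver.extraJavaOptions  -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintTenuringDistribution -Xloggc:/var/log/spark/sts-driver-gc.log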
11-14-2017
11:24 AM
@tsharma: thank you. I will try to configure my system following your suggestion.
11-14-2017
08:46 AM
@tsharma: thank you very much for your response. Based on your suggestion, I googled and found that this parameter seems to be meant for Spark Standalone mode (https://spark.apache.org/docs/2.0.2/spark-standalone.html). My application is running on YARN. Should I still configure this parameter? Thanks,
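From what I have read so far (please correct me if this is wrong), on YARN the resources are usually controlled with Spark properties rather than the standalone-mode settings, roughly like this (the values here are just placeholders):

    spark.driver.memory      64g
    spark.executor.instances 50
    spark.executor.memory    8g
    spark.executor.cores     4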
11-13-2017
02:36 PM
Hi all, I am running Spark Thrift Server on YARN, in client mode with 50 executor nodes. First I set -Xmx=25g for the driver; STS ran for about 30 minutes and then hung. After that I increased the driver to -Xmx=40G; STS ran for about 1 hour and then hung. I then increased it to -Xmx=56G; STS ran for about 2 hours and then hung again. I cannot keep increasing the JVM heap. In all cases, I did not see any out-of-memory exception in the log file. It seems that whenever I increased the JVM heap on the driver, STS consumed most of it. I dumped the JVM heap and saw that SparkSession objects are the biggest objects (one of them is about 10G, the others are about 4-6G). I don't understand why the SparkSession objects are that large. Please:
1) Is there any suggestion to help me resolve my issue?
2) Is there anywhere I can read more about how STS works? I need a document like https://cwiki.apache.org/confluence/display/Hive/Design#Design-HiveArchitecture to understand how STS processes queries from clients, because it seems that driver memory keeps increasing as more people connect to and query STS.
3) How can I size the memory that needs to be configured for my driver in STS?
Thank you very much,
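For completeness, this is roughly how I took the heap dump (the PID and output path are placeholders; jmap ships with the JDK):

    jmap -dump:live,format=b,file=/tmp/sts-driver.hprof <driver-pid>

I then opened the .hprof file in a heap analyzer (Eclipse MAT / VisualVM) to see which objects retain the most memory.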
Labels:
- Apache Hive
- Apache Spark