<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark Job long GC pauses in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-Job-long-GC-pauses/m-p/282805#M210208</link>
    <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/23013"&gt;w@leed&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for replying.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tested the job with all three collectors: ParallelGC, CMS, and G1GC.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tested the following options with G1GC:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;SPAN&gt;-XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps&lt;/SPAN&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and with CMS:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;SPAN&gt;-XX:+UseConcMarkSweepGC&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;-XX:+PrintGCTimeStamps&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseParNewGC&amp;nbsp;&amp;nbsp;-XX:+CMSConcurrentMTEnabled&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;-XX:ParallelCMSThreads=10&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;-XX:ConcGCThreads=8 -XX:ParallelGCThreads=16&lt;/SPAN&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;With G1GC defaults, I could see the following: &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Desired survivor size 1041235968 bytes, new threshold 5 (max 15)&lt;BR /&gt;[PSYoungGen: 1515304K-&amp;gt;782022K(3053056K)] 2750361K-&amp;gt;2017087K(6371840K), 1.5875321 secs] [Times: user=4.72 sys=0.74, real=1.59 secs]&lt;BR /&gt;Heap after GC invocations=9 (full 3):&lt;BR /&gt;PSYoungGen total 3053056K, used 782022K [0x0000000580000000, 0x000000068ef80000, 0x0000000800000000)&lt;BR /&gt;eden space 2270720K, 0% used 
[0x0000000580000000,0x0000000580000000,0x000000060a980000)&lt;BR /&gt;from space 782336K, 99% used [0x000000065f380000,0x000000068ef31ab0,0x000000068ef80000)&lt;BR /&gt;to space 1016832K, 0% used [0x0000000612d80000,0x0000000612d80000,0x0000000650e80000)&lt;BR /&gt;ParOldGen total 3318784K, used 1235064K [0x0000000080000000, 0x000000014a900000, 0x0000000580000000)&lt;BR /&gt;object space 3318784K, 37% used [0x0000000080000000,0x00000000cb61e318,0x000000014a900000)&lt;BR /&gt;Metaspace used 55055K, capacity 55638K, committed 55896K, reserved 1097728K&lt;BR /&gt;class space used 7049K, capacity 7207K, committed 7256K, reserved 1048576K&lt;BR /&gt;}&lt;BR /&gt;{Heap before GC invocations=10 (full 3):&lt;BR /&gt;PSYoungGen total 3053056K, used 3052742K [0x0000000580000000, 0x000000068ef80000, 0x0000000800000000)&lt;BR /&gt;eden space 2270720K, 100% used [0x0000000580000000,0x000000060a980000,0x000000060a980000)&lt;BR /&gt;from space 782336K, 99% used [0x000000065f380000,0x000000068ef31ab0,0x000000068ef80000)&lt;BR /&gt;to space 1016832K, 0% used [0x0000000612d80000,0x0000000612d80000,0x0000000650e80000)&lt;BR /&gt;ParOldGen total 3318784K, used 1235064K [0x0000000080000000, 0x000000014a900000, 0x0000000580000000)&lt;BR /&gt;object space 3318784K, 37% used [0x0000000080000000,0x00000000cb61e318,0x000000014a900000)&lt;BR /&gt;Metaspace used 55108K, capacity 55702K, committed 55896K, reserved 1097728K&lt;BR /&gt;class space used 7049K, capacity 7207K, committed 7256K, reserved 1048576K&lt;BR /&gt;42.412: [GC (Allocation Failure)&lt;BR /&gt;Desired survivor size 1653080064 bytes, new threshold 4 (max 15)&lt;BR /&gt;[PSYoungGen: 3052742K-&amp;gt;1016800K(3422720K)] 4287807K-&amp;gt;2985385K(6741504K), 4.0304873 secs] [Times: user=11.87 sys=1.77, real=4.03 secs]&lt;BR /&gt;Heap after GC invocations=10 (full 3):&lt;BR /&gt;PSYoungGen total 3422720K, used 1016800K [0x0000000580000000, 0x0000000727a80000, 0x0000000800000000)&lt;BR /&gt;eden space 2405888K, 0% used 
[0x0000000580000000,0x0000000580000000,0x0000000612d80000)&lt;BR /&gt;from space 1016832K, 99% used [0x0000000612d80000,0x0000000650e78240,0x0000000650e80000)&lt;BR /&gt;to space 1614336K, 0% used [0x00000006c5200000,0x00000006c5200000,0x0000000727a80000)&lt;BR /&gt;ParOldGen total 3318784K, used 1968584K [0x0000000080000000, 0x000000014a900000, 0x0000000580000000)&lt;BR /&gt;object space 3318784K, 59% used [0x0000000080000000,0x00000000f8272318,0x000000014a900000)&lt;BR /&gt;Metaspace used 55108K, capacity 55702K, committed 55896K, reserved 1097728K&lt;BR /&gt;class space used 7049K, capacity 7207K, committed 7256K, reserved 1048576K&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Across all the collectors, the only difference I could see was a delayed full GC.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I am now considering changing the YoungGen size. I will update if I see a difference.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;On a parallel note:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;1. I also saw that some objects in memory persist across GC cycles, for example scala.Tuple2 and java.lang.Long.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;2. These are Java RDDs.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Regards&lt;/SPAN&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 13 Nov 2019 00:45:14 GMT</pubDate>
    <dc:creator>PARTOMIA09</dc:creator>
    <dc:date>2019-11-13T00:45:14Z</dc:date>
  </channel>
</rss>

