<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217918#M179827</link>
    <description>&lt;P&gt;Hey &lt;A rel="user" href="https://community.cloudera.com/users/17036/priyal.html" nodeid="17036"&gt;@priyal patel&lt;/A&gt;!&lt;BR /&gt;Do you know how much is set for:&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;spark.driver.memoryOverhead&lt;BR /&gt;spark.executor.memoryOverhead&lt;/PRE&gt;&lt;P&gt;Also, would you mind sharing your OOM error?&lt;/P&gt;</description>
    <pubDate>Mon, 11 Jun 2018 21:21:20 GMT</pubDate>
    <dc:creator>vmurakami</dc:creator>
    <dc:date>2018-06-11T21:21:20Z</dc:date>
    <item>
      <title>Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217915#M179824</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have created HDP 2.6 on AWS with a master node (m4.2xlarge) and 4 worker nodes (m4.xlarge).
I want to process a 4 GB log file using a Spark job, but I am getting the below error while executing it: &lt;/P&gt;&lt;BLOCKQUOTE&gt;Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3236) &lt;/BLOCKQUOTE&gt;&lt;P&gt;I have configured the spark-env.sh file on the master node:&lt;/P&gt;&lt;P&gt;SPARK_EXECUTOR_MEMORY="5G"&lt;/P&gt;&lt;P&gt;
SPARK_DRIVER_MEMORY="5G"&lt;/P&gt;&lt;P&gt;but it throws the same error.
I also configured the worker nodes with those settings and increased the Java heap size for the Hadoop client, ResourceManager, NodeManager, and YARN, but the Spark job is still aborted.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Fri, 08 Jun 2018 18:11:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217915#M179824</guid>
      <dc:creator>priyal</dc:creator>
      <dc:date>2018-06-08T18:11:22Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217916#M179825</link>
      <description>&lt;P&gt;Hey &lt;A rel="user" href="https://community.cloudera.com/users/17036/priyal.html" nodeid="17036"&gt;@priyal patel&lt;/A&gt;!&lt;BR /&gt;Could you share your spark-submit parameters?&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jun 2018 02:04:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217916#M179825</guid>
      <dc:creator>vmurakami</dc:creator>
      <dc:date>2018-06-09T02:04:19Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217917#M179826</link>
      <description>&lt;P&gt;Hi, &lt;A rel="user" href="https://community.cloudera.com/users/79158/vmurakami.html" nodeid="79158"&gt;@Vinicius Higa Murakami&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I want to process a 4 GB file, so I have configured the executor memory to 10 GB and the number of executors to 10 in the spark-env.sh file. Here are the spark-submit parameters:&lt;/P&gt;&lt;BLOCKQUOTE&gt;./bin/spark-submit --class org.apache.TransformationOper --master local[2] /root/spark/TransformationOper.jar /Input/error.log&lt;/BLOCKQUOTE&gt;&lt;P&gt;I also tried setting the configuration manually using the below spark-submit parameters:&lt;/P&gt;&lt;BLOCKQUOTE&gt;./bin/spark-submit  --driver-memory 5g --num-executors 10  --executor-memory 10g --class org.apache.TransformationOper --master local[2] /root/spark/TransformationOper.jar&lt;BR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;Setting the master to yarn-cluster still gives the OutOfMemoryError.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Jun 2018 14:07:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217917#M179826</guid>
      <dc:creator>priyal</dc:creator>
      <dc:date>2018-06-11T14:07:20Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217918#M179827</link>
      <description>&lt;P&gt;Hey &lt;A rel="user" href="https://community.cloudera.com/users/17036/priyal.html" nodeid="17036"&gt;@priyal patel&lt;/A&gt;!&lt;BR /&gt;Do you know how much is set for:&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;spark.driver.memoryOverhead&lt;BR /&gt;spark.executor.memoryOverhead&lt;/PRE&gt;&lt;P&gt;Also, would you mind sharing your OOM error?&lt;/P&gt;</description>
      <pubDate>Mon, 11 Jun 2018 21:21:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217918#M179827</guid>
      <dc:creator>vmurakami</dc:creator>
      <dc:date>2018-06-11T21:21:20Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217919#M179828</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/17036/priyal.html" nodeid="17036"&gt;@priyal patel&lt;/A&gt; First make sure you know if OOM is happening on driver or in executor. You can find this by looking at the logs. To test I suggest you increase the --driver-memory to 10g or even 20g and see what happens. Also try running on yarn-client mode instead of yarn-cluster. If OOM error comes on the sdtout of spark-submit you will know the driver is running out of memory. Else you can check the yarn logs -applicationId &amp;lt;appId&amp;gt; to see what happened on the executor side.&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;&lt;P&gt;*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jun 2018 00:26:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217919#M179828</guid>
      <dc:creator>falbani</dc:creator>
      <dc:date>2018-06-12T00:26:09Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217920#M179829</link>
      <description>&lt;P&gt;Hi, &lt;A rel="user" href="https://community.cloudera.com/users/79158/vmurakami.html" nodeid="79158"&gt;@Vinicius Higa Murakami&lt;/A&gt;, &lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048"&gt;@Felix Albani&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I have set spark.yarn.driver.memoryOverhead=1 GB, spark.yarn.executor.memoryOverhead=1 GB, and spark_driver_memory=12 GB. I have set the storage level to MEMORY_AND_DISK_SER.&lt;/P&gt;&lt;P&gt;The Hadoop cluster configuration is: 1 master node (r3.xlarge) and 1 worker node (m4.xlarge).&lt;/P&gt;&lt;P&gt;
Here are the spark-submit parameters:&lt;/P&gt;&lt;BLOCKQUOTE&gt;
./bin/spark-submit  --driver-memory 12g --executor-cores 2 --num-executors 3  --executor-memory 3g  --class org.apache.TransformationOper --master yarn-cluster /spark/TransformationOper.jar&lt;BR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;The Spark job entered the running state, but it has been executing for the last hour and has still not completed. &lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 20:39:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217920#M179829</guid>
      <dc:creator>priyal</dc:creator>
      <dc:date>2018-06-13T20:39:16Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217921#M179830</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/17036/priyal.html" nodeid="17036"&gt;@priyal patel&lt;/A&gt; Increasing driver memory seems to help then. If OOM issue is no longer happening then I recommend you open a separate thread for the performance issue. On any case to see why is taking long you can check the Spark UI and see what job/task is taking time and on which node. Then you can also review the logs for more information yarn logs -applicationId &amp;lt;appId&amp;gt;&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;&lt;P&gt;*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jun 2018 20:51:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217921#M179830</guid>
      <dc:creator>falbani</dc:creator>
      <dc:date>2018-06-13T20:51:15Z</dc:date>
    </item>
    <item>
      <title>Re: Spark job aborted due to java.lang.OutOfMemoryError: Java heap space</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217922#M179831</link>
      <description>&lt;P&gt;Hi, &lt;A href="https://community.hortonworks.com/users/11048/falbani.html"&gt;@Felix Albani&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I set the driver memory to 20 GB. I tried using the below spark-submit parameters:&lt;/P&gt;&lt;P&gt;./bin/spark-submit --driver-memory 20g --executor-cores 3 --num-executors 20 --executor-memory 2g --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.yarn.driver.memoryOverhead=1024 --class org.apache.TransformationOper --master yarn-cluster /home/hdfs/priyal/spark/TransformationOper.jar&lt;/P&gt;&lt;P&gt;The cluster configuration is: 1 master node (r3.xlarge) and 1 worker node (r3.xlarge): 4 vCPUs, 30 GB memory, 40 GB storage.&lt;/P&gt;&lt;P&gt;I am still getting the same issue: the Spark job is in the running state and YARN memory is 95% used.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Jun 2018 14:44:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-job-aborted-due-to-java-lang-OutOfMemoryError-Java/m-p/217922#M179831</guid>
      <dc:creator>priyal</dc:creator>
      <dc:date>2018-06-14T14:44:27Z</dc:date>
    </item>
  </channel>
</rss>