<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: ExecutorLostFailure  Reason: Container killed by YARN for exceeding memory limits in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/ExecutorLostFailure-Reason-Container-killed-by-YARN-for/m-p/82579#M21249</link>
    <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/6698"&gt;@srowen&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am using CDH 5.15.1 and running the spark-submit to train the model and save the prediction dataframe of the model to HDFS. I am facing this errors when I am trying to save the dataframe to HDFS,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;2018-11-19 11:17:33 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Executor heartbeat timed out after 149836 ms
2018-11-19 11:17:33 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Executor heartbeat timed out after 149836 ms
2018-11-19 11:18:07 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Container container_1542123439491_0080_01_000004 exited from explicit termination request.
2018-11-19 11:18:07 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Container container_1542123439491_0080_01_000004 exited from explicit termination request.&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have also tried using the spark.yarn.executor.memoryOverhead which I have set that to 10% of the executor-memory mentioned in my spark-submit and still I am seeing this errors. Do you have any suggestions for this issue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Spark-Submit Command&lt;/U&gt;:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;spark-submit-with-zoo.sh --master yarn --deploy-mode cluster --num-executors 8 --executor-cores 16 --driver-memory 300g --executor-memory 400g Main_Final_auc.py 256&lt;/P&gt;</description>
    <pubDate>Mon, 19 Nov 2018 17:40:24 GMT</pubDate>
    <dc:creator>lee46</dc:creator>
    <dc:date>2018-11-19T17:40:24Z</dc:date>
  </channel>
</rss>

