Contributor
Posts: 43
Registered: 03-04-2015

ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

Hi

I am using Cloudera 5.7.0 and running a Spark Streaming application with Kafka that does some OpenCV processing.

Some of my containers are killed by YARN with the reason below:
ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 3.1 GB of 3 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead

I am using the configuration below:
spark-submit --num-executors 20 --executor-memory 2g --executor-cores 2 --conf spark.yarn.executor.memoryOverhead=1000

How can I solve this issue?

Regards
Prateek

Cloudera Employee
Posts: 481
Registered: 08-11-2014

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

This means the JVM took more memory than YARN thought it should. Usually this means you need to allocate more overhead, so that more memory is requested from YARN for the same size of JVM heap. See the spark.yarn.executor.memoryOverhead option, which defaults to 10% of the specified executor memory. Increase it.
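
For example, taking the command from the original post, the overhead could be raised from 1000 MB to something like 2 GB (the 2048 value below is only an assumption to illustrate; tune it until containers stop being killed):

spark-submit --num-executors 20 --executor-memory 2g --executor-cores 2 --conf spark.yarn.executor.memoryOverhead=2048 <rest of your command>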

New Contributor
Posts: 3
Registered: 11-22-2017

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits


Hey,

I am having the same issue.

 

Spark 1.6

Cloudera Express 5.7.1


ExecutorLostFailure (executor 60 exited caused by one of the running tasks)
Reason: Container killed by YARN for exceeding memory limits. 1.5 GB of 1.5 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.

 

I see your solution but cannot find where that setting is in CM.
Can you please point me to where that option is in the Cloudera Manager UI?

 

Thanks,

 

Marcin

Cloudera Employee
Posts: 481
Registered: 08-11-2014

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

This has nothing to do with CM. It has to do with your app's memory configuration. The relevant settings are right there in the error.

New Contributor
Posts: 3
Registered: 11-22-2017

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

Okay,

 

So how can I increase the overhead in Jupyter Notebook?

I am not using spark-submit for this job.

And how can I find out what the current overhead settings are?

 

Thanks!

Cloudera Employee
Posts: 481
Registered: 08-11-2014

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

I'm not sure how you would do that. We support spark-submit and the Workbench, not Jupyter. It's clear how to configure spark-submit, and you configure the Workbench with spark-defaults.conf. You can see your Spark job's configuration in its UI, on the Environment tab.
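
For spark-defaults.conf that is a single line, e.g.:

spark.yarn.executor.memoryOverhead 1024

If the notebook creates its own SparkContext, the same property can usually be set on a SparkConf before the context starts. This is only a minimal sketch under that assumption (the 1024 MB value is illustrative, not a recommendation):

from pyspark import SparkConf, SparkContext

# Assumed setup: the notebook builds its own SparkContext (not a supported Cloudera workflow).
# The overhead value is in MB; 1024 here is just an example.
conf = SparkConf().set("spark.yarn.executor.memoryOverhead", "1024")
sc = SparkContext(conf=conf)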

New Contributor
Posts: 3
Registered: 11-22-2017

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

Thanks!

 

spark-submit script fixed the problem!

New Contributor
Posts: 1
Registered: 11-19-2018

Re: ExecutorLostFailure Reason: Container killed by YARN for exceeding memory limits

Hi @srowen

 

I am using CDH 5.15.1 and running spark-submit to train a model and save its prediction DataFrame to HDFS. I am facing these errors when I try to save the DataFrame to HDFS:

 

2018-11-19 11:17:33 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Executor heartbeat timed out after 149836 ms
2018-11-19 11:18:07 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Container container_1542123439491_0080_01_000004 exited from explicit termination request.

 

I have also tried setting spark.yarn.executor.memoryOverhead to 10% of the executor memory from my spark-submit, and I am still seeing these errors. Do you have any suggestions for this issue?

 

Spark-Submit Command:

spark-submit-with-zoo.sh --master yarn --deploy-mode cluster --num-executors 8 --executor-cores 16 --driver-memory 300g --executor-memory 400g Main_Final_auc.py 256
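
For reference, that overhead setting would normally appear as an explicit flag before the application file; the 40960 value below (10% of the 400g executor memory, in MB) is only an assumed illustration of what is described above:

spark-submit-with-zoo.sh --master yarn --deploy-mode cluster --num-executors 8 --executor-cores 16 --driver-memory 300g --executor-memory 400g --conf spark.yarn.executor.memoryOverhead=40960 Main_Final_auc.py 256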
