About dziduchm

lee46 · ‎11-19-2018

Hi @srowen I am using CDH 5.15.1 and running the spark-submit to train the model and save the prediction dataframe of the model to HDFS. I am facing this errors when I am trying to save the dataframe to HDFS, 2018-11-19 11:17:33 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Executor heartbeat timed out after 149836 ms 2018-11-19 11:17:33 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Executor heartbeat timed out after 149836 ms 2018-11-19 11:18:07 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Container container_1542123439491_0080_01_000004 exited from explicit termination request. 2018-11-19 11:18:07 ERROR YarnClusterScheduler:70 - Lost executor 2 on gworker6.vcse.lab: Container container_1542123439491_0080_01_000004 exited from explicit termination request. I have also tried using the spark.yarn.executor.memoryOverhead which I have set that to 10% of the executor-memory mentioned in my spark-submit and still I am seeing this errors. Do you have any suggestions for this issue? Spark-Submit Command: spark-submit-with-zoo.sh --master yarn --deploy-mode cluster --num-executors 8 --executor-cores 16 --driver-memory 300g --executor-memory 400g Main_Final_auc.py 256

Online	Offline
Last Visited	‎12-15-2017 08:25 AM

Member Since	‎11-22-2017 04:54 AM
Last Visited	‎12-15-2017 08:25 AM
Posts	3

Cloudera Community

Re: ExecutorLostFailure Reason: Container killed ...