Support Questions

ChocoChoco · 06-16-2021

Hello

I have a problem with my Spark app running in cluster mode. The app has 2 containers: one driver and one executor.

When the executor container is killed (by an error or anything else), my app neither gets another container nor kills the current attempt to restart the job. The job is like a zombie: the driver doesn't notice the lost executor and continues without any error.

Maybe I missed something in the YARN conf or the spark-submit params.

Have you ever run into this situation?

ChocoChoco · 06-16-2021

I tested without dynamic allocation, and in that case a new container is spawned.

Is there an option for dynamic allocation similar to

--num-executors

but setting a minimum, something like this?

--min-num-executors

Setting spark.streaming.dynamicAllocation.minExecutors=1 does not seem to work.
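
For reference, spark-submit has no --min-num-executors flag as far as I know; with dynamic allocation the floor is normally set through --conf properties. The sketch below only assembles such a command line so the settings are easy to compare. The helper build_submit_args, the file name my_app.py, and the default values are hypothetical. It also assumes that the spark.streaming.dynamicAllocation.* properties apply only to DStream jobs that enable the streaming allocation manager, and that the streaming and core mechanisms should not be enabled together. Whether any of this fixes the behaviour described above is not confirmed in this thread.

```python
import shlex


def build_submit_args(app, min_executors=1, max_executors=4, streaming=False):
    """Build a spark-submit argument list that sets an executor floor.

    Hypothetical helper: `app` and the default values are illustrative only.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")

    # Assumption: DStream jobs use a separate property prefix, while other
    # jobs use the core spark.dynamicAllocation.* properties.
    if streaming:
        prefix = "spark.streaming.dynamicAllocation"
    else:
        prefix = "spark.dynamicAllocation"

    conf = {
        f"{prefix}.enabled": "true",
        f"{prefix}.minExecutors": str(min_executors),
        f"{prefix}.maxExecutors": str(max_executors),
    }
    if streaming:
        # Assumption: core dynamic allocation must be off when the
        # streaming-specific one is on.
        conf["spark.dynamicAllocation.enabled"] = "false"
    else:
        # Core dynamic allocation on YARN usually relies on the external
        # shuffle service.
        conf["spark.shuffle.service.enabled"] = "true"

    args = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# Print the assembled command for a non-streaming job.
print(shlex.join(build_submit_args("my_app.py")))
```

Passing streaming=True switches to the spark.streaming.dynamicAllocation.* prefix, which makes it easy to check which of the two sets of properties the job is actually being submitted with.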

Cloudera Community

When Container is killed, Yarn/Spark doesn't give me a new container and my app is like a Zombie