We have a Spark 2.2 job written in Scala, running on a YARN cluster, that does the following:
The following configuration fails with java.lang.OutOfMemoryError: Java heap space:
However, this job works reliably if we remove spark.executor.memory entirely, which gives each executor Spark's default of 1g of RAM.
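To confirm what the executors are actually being given, we print the effective setting from inside the job. The sketch below is only illustrative (the object and app names are ours, not part of the real job); it just reads spark.executor.memory from the running session and falls back to the documented 1g default when the key is unset.

```scala
import org.apache.spark.sql.SparkSession

object CheckExecutorMemory {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("check-executor-memory") // hypothetical app name, for illustration only
      .getOrCreate()

    // If spark.executor.memory is not set explicitly, Spark uses its 1g default.
    val executorMemory = spark.conf
      .getOption("spark.executor.memory")
      .getOrElse("1g (Spark default)")

    println(s"Effective spark.executor.memory: $executorMemory")

    spark.stop()
  }
}
```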
This job also fails if we do any of the following:
Can anyone help me understand why giving the job more memory and more executors leads to failures with OutOfMemory?