What is to be stored in memory overhead in Spark?


I am working with large data volumes in Spark. Recently I hit an Executor Lost exception, which was resolved by increasing the executor memoryOverhead. Can anyone help me understand how memoryOverhead works internally and what exactly the memory assigned to it is used for? I understand the equation used to derive memoryOverhead, but it is still a black box to me which objects are stored in this memory. Do those objects belong to user-defined classes or to Spark's own classes? On the first attempt the executors are lost and the task fails, while on the second attempt the tasks complete successfully. Why does this memory depend on my data volume? Relying on automatic resubmission is not a solution, even though it is part of what makes Spark good. Also, please let me know how to reduce these objects, if they are the only cause of the problem.
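For reference, the equation mentioned above can be sketched as follows. This is only a rough illustration and assumes the documented defaults for `spark.executor.memoryOverhead` (called `spark.yarn.executor.memoryOverhead` in older releases): 10% of executor memory with a floor of 384 MiB. The function names are hypothetical helpers, not part of any Spark API, and the exact factor may differ by Spark version or cluster configuration.

```python
# Rough sketch of the default executor memory overhead rule.
# Assumptions (not taken from the cluster in question):
#   - overhead factor is 0.10 (older Spark releases used a smaller factor)
#   - minimum overhead is 384 MiB
MIN_OVERHEAD_MIB = 384
OVERHEAD_FACTOR = 0.10


def default_memory_overhead_mib(executor_memory_mib: int) -> int:
    """Overhead used when no explicit memoryOverhead value is set."""
    return max(MIN_OVERHEAD_MIB, int(executor_memory_mib * OVERHEAD_FACTOR))


def container_request_mib(executor_memory_mib: int, overhead_mib=None) -> int:
    """Approximate container size requested per executor: heap plus overhead."""
    if overhead_mib is None:
        overhead_mib = default_memory_overhead_mib(executor_memory_mib)
    return executor_memory_mib + overhead_mib


if __name__ == "__main__":
    for heap in (2048, 8192):
        print(heap, default_memory_overhead_mib(heap), container_request_mib(heap))
```

Under these assumptions, raising memoryOverhead explicitly only enlarges the container requested from the resource manager; the executor heap itself stays the same size.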
