05-04-2016 05:16 PM
Thanks. What blows my mind is this statement from the article:

OVERHEAD = max(SPECIFIED_MEMORY * 0.07, 384M)

If I'm allocating 8 GB for memoryOverhead, then OVERHEAD = 567 MB! What is YARN using the other 7.5 GB for?
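To make the arithmetic concrete, here is a minimal Scala sketch of that max() rule (variable names are illustrative, not Spark internals, and the exact rounding Spark applies may differ slightly):

// Sketch of the quoted overhead formula; names are illustrative.
val specifiedMemoryMb = 8 * 1024                          // 8 GB expressed in MB
val overheadMb = math.max(specifiedMemoryMb * 0.07, 384)  // max(7% of memory, 384 MB)
// overheadMb comes out to roughly 573 MB, on the order of the ~567 MB figure above;
// the 7% rule alone accounts for well under 1 GB of the 8 GB requested.
println(f"default overhead = $overheadMb%.0f MB")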
05-04-2016 04:44 PM
Does anyone know exactly what spark.yarn.executor.memoryOverhead is used for and why it may be using up so much space? If I could, I would love to have a peek inside this stack. Spark's description is as follows:

"The amount of off-heap memory (in megabytes) to be allocated per executor. This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc. This tends to grow with the executor size (typically 6-10%)."

The problem I'm having is that when running Spark queries on large datasets (> 5 TB), I am required to set the executor memoryOverhead to 8 GB, otherwise the job throws an exception and dies. What is being stored in this container that it needs 8 GB per container? I've also noticed that this error doesn't occur in standalone mode, because it doesn't use YARN.

Note: my configuration for this job is:

executor memory = 15G
executor cores = 5
yarn.executor.memoryOverhead = 8GB
max executors = 60
offHeap.enabled = false
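For reference, a hedged Scala sketch of how those settings might be expressed as a SparkConf; the property names are the Spark-on-YARN configuration keys I am assuming correspond to the values above, and the app name and the use of spark.dynamicAllocation.maxExecutors for "max executors" are my assumptions:

import org.apache.spark.SparkConf

// Sketch only; property names assumed to match the settings listed above.
val conf = new SparkConf()
  .setAppName("large-dataset-query")                    // illustrative name
  .set("spark.executor.memory", "15g")
  .set("spark.executor.cores", "5")
  .set("spark.yarn.executor.memoryOverhead", "8192")    // value is specified in MB
  .set("spark.dynamicAllocation.maxExecutors", "60")    // assumed mapping for "max executors"
  .set("spark.memory.offHeap.enabled", "false")
// With these values each executor asks YARN for roughly 15 GB heap + 8 GB overhead,
// i.e. about 23 GB per container.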
Labels:
- Apache Spark