I'm running a Spark application on YARN in cluster mode.
I have 250 GB of memory available in my one-node cluster and am trying to fully utilize it.
I've set the following for each executor:
spark.executor.memory 28672M (= 28G )
spark.yarn.executor.memoryOverhead 2048 (approx 7%)
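For context, these are ordinary Spark configuration properties passed at submission time, roughly along these lines (a sketch only; the jar name below is just a placeholder):

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.executor.memory=28672M \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  my-app.jar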
I expected, when monitoring with "top", to see each executor using the allocated memory. However, I found that the resident memory in use is ~10GB and the virtual memory is ~30GB.
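(The numbers come from watching the executor process in top, something like the following, where the PID is a placeholder for the actual executor process, and reading the RES and VIRT columns:)

top -p <executor_pid>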
The YARN NodeManager log (/var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-host.log) says the same:
2018-03-21 15:09:49,838 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 279993 for container-id container_e09_1521632768675_0005_01_000007: 11.0 GB of 32 GB physical memory used; 31.5 GB of 67.2 GB virtual memory used
I repeated this a few times with various spark.executor.memory settings. In all my tests, the resident memory in use was less than 40% of my setting, as in the example above.
How can I utilize the entire 32 GB that I've allocated?