
Worker uses more RAM than it should

Rising Star


We have the following servers acting as workers:

2 * 6 cores (24 threads), 64 GB RAM, based on Ambari 2.6.1.5.

Our process uses approximately 1 GB. For example, when I submit 100 workers with the settings:

spark-submit ..... --executor-memory 2gb

the total RAM used is 302 GB (roughly 100 * 3), because the RAM usage per executor is 3 GB. Because of that I can't fully use all the computation power: 3 * 24 > 60 (I set the per-node limit to 60 GB). What did I miss?
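For context, a hypothetical reconstruction of the submit being described (the real command is elided above, so the jar name and the flags other than --executor-memory are placeholders):

# hypothetical reconstruction; your_app.jar and most flags are placeholders
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 100 \
  --executor-cores 1 \
  --executor-memory 2g \
  your_app.jar
# per node: 24 threads * 2 GB requested = 48 GB, which would fit under the 60 GB limit,
# but 24 * ~3 GB actually allocated = ~72 GB, which does not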

Both answers helped; each improved the RAM usage.

1 ACCEPTED SOLUTION


@ilia kheifets

The difference may come from yarn.scheduler.minimum-allocation-mb, Spark memory overhead, and the JVM. For more information you may want to read the following article: https://blog.csdn.net/oufuji/article/details/50387104

HTH
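A rough sketch of that arithmetic, assuming Spark's default executor memory overhead of max(384 MB, 10% of the executor memory) and a yarn.scheduler.minimum-allocation-mb of 1024 MB (a common default):

# how a 2 GB executor request can turn into a 3 GB YARN container
EXECUTOR_MB=2048
OVERHEAD_MB=$(( EXECUTOR_MB / 10 > 384 ? EXECUTOR_MB / 10 : 384 ))   # default overhead: max(384 MB, 10%)
REQUEST_MB=$(( EXECUTOR_MB + OVERHEAD_MB ))                          # 2048 + 384 = 2432
MIN_ALLOC_MB=1024                                                    # assumed scheduler minimum
# YARN normalizes every container request up to a multiple of the scheduler minimum
CONTAINER_MB=$(( (REQUEST_MB + MIN_ALLOC_MB - 1) / MIN_ALLOC_MB * MIN_ALLOC_MB ))
echo "${CONTAINER_MB} MB per executor container"                     # prints 3072, i.e. the ~3 GB observed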



6 REPLIES


Rising Star

Setting yarn.scheduler.minimum-allocation-mb to a smaller value improved the allocated memory by 30%.

Super Collaborator

You may not be accounting for the driver RAM. Spark creates a driver process to act as a "parent" from which the executor processes are spawned as separate YARN containers. You are specifying the executor memory as 2 GB, but you did not specify the driver's memory limit.

By default, the driver is allocated 1 GB of RAM, which explains your numbers.

https://spark.apache.org/docs/latest/configuration.html
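A minimal sketch of capping the driver explicitly; the jar name and the flags other than --driver-memory are placeholders, not from the original post:

# set the driver memory explicitly instead of relying on the 1 GB default
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 100 \
  --executor-memory 2g \
  --driver-memory 512m \
  your_app.jar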

Rising Star

I have set it to 512M and it works.

When I tried to go lower, for example 128, I got this error:

java.lang.IllegalArgumentException: System memory 119537664 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.

Super Collaborator

It looks like the driver needs almost 500 MB of memory just to start (the 471859200 bytes in that error is Spark's built-in minimum of 450 MB). It sounds like your goal is to utilize all the CPU that your nodes carry, so you will have to either change the way your application works (to reduce the driver RAM) or reduce the executor memory so that you can use all of the threads your cluster offers.
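One illustrative combination for the second option (shrinking the executors), assuming yarn.scheduler.minimum-allocation-mb stays at 1024 MB and the default 384 MB overhead applies; the jar name is a placeholder:

# 1664 MB heap + 384 MB overhead = 2048 MB, so each container rounds to exactly 2 GB
# 24 containers per node * 2 GB = 48 GB, comfortably under the 60 GB node limit
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 100 \
  --executor-cores 1 \
  --executor-memory 1664m \
  --driver-memory 512m \
  your_app.jar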

Rising Star

It uses 300-1200 MB, but you are right, it is CPU heavy, and I am trying to maximize the processing power.