Why do we need to specify executor cores for Spark Applications?
Labels:
- Apache Spark
- Apache YARN
Created 11-04-2018 08:52 PM
Hi,
I need to know why we specify executor cores for a Spark application running on YARN. Let's say we have a cluster with the following specs:
Number of worker nodes: 4 (50 GB memory and 8 cores each)
One master node with 50 GB memory and 8 cores.
Now let's consider:
a) If I specify num_executors = 8 for an application and don't set executor-cores, will each executor use all of a node's cores?
b) Each node has 8 cores. If I specify executor_cores = 4, does that mean the cores used by a single executor must not exceed 4, even though each node has 8 in total?
c) What are the criteria for choosing executor_cores for a Spark application?
Created 11-05-2018 12:25 PM
Each executor can run 1 or more threads to perform parallel computation. When running on YARN the default is 1. The number of threads can be increased with the command-line parameter --executor-cores.
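As a sketch, a hypothetical spark-submit invocation that sets this flag might look like the following (the class name, jar, and memory value are illustrative placeholders, not taken from this thread):

```shell
# Placeholder application jar, class name, and memory setting.
spark-submit \
  --master yarn \
  --num-executors 8 \
  --executor-cores 4 \
  --executor-memory 10G \
  --class com.example.MyApp \
  my-app.jar
```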
a) If I specify num_executors = 8 for an application and don't set executor-cores, will each executor use all of a node's cores?
When running on YARN the default is 1, so each executor will use only 1 core unless you set this parameter.
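A quick sketch of what that default implies for the scenario above: 8 executors with 1 core each gives 8 concurrently running tasks in total, regardless of how many cores each node has.

```shell
# With --executor-cores unset on YARN, each executor gets exactly 1 core.
num_executors=8
executor_cores=1                             # YARN default
echo $((num_executors * executor_cores))     # total concurrent task slots -> 8
```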
b) Each node has 8 cores. If I specify executor_cores = 4, does that mean the cores used by a single executor must not exceed 4, even though each node has 8 in total?
The assignment of cores is static, not dynamic, and it remains the same for the lifetime of the application. If you set executor cores to 4, each executor will start and run with 4 cores/threads for parallel computation.
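A side effect worth noting for the cluster described above: the per-executor core count also bounds how many executors YARN can pack onto one node. A minimal sketch of that arithmetic:

```shell
# With 8 cores per worker node and executor-cores set to 4,
# at most 2 such executors can fit on a single node (ignoring memory limits).
cores_per_node=8
executor_cores=4
echo $((cores_per_node / executor_cores))    # executors that fit per node -> 2
```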
c) What are the criteria for choosing executor_cores for a Spark application?
Increasing executor cores affects performance. You need to take into consideration the number of virtual cores available on each node; my recommendation is not to go above 4 in most cases.
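Putting the pieces together for the 4-node cluster in the question, a rough sizing sketch might look like this, assuming we reserve 1 core per node for the OS and Hadoop daemons (a common rule of thumb, not something stated in the thread):

```shell
# Rough executor-count sketch for 4 worker nodes with 8 cores each.
nodes=4
cores_per_node=8
executor_cores=4                             # kept at or below 4, as recommended
usable=$((cores_per_node - 1))               # cores left per node after reserving 1 -> 7
per_node=$((usable / executor_cores))        # whole executors that fit per node -> 1
echo $((nodes * per_node))                   # cluster-wide executor count -> 4
```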
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
