This applies to YARN mode. The --num-executors flag defines the number of executors, which really defines the total number of executor JVMs (YARN containers) that will be launched for the application. The --executor-cores flag defines how many CPU cores are available to each of those executors, i.e., how many tasks each executor can run concurrently. Given that, the answer is the first: you will get 5 total executors.
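As a concrete sketch of that scenario (the class name and jar are hypothetical placeholders), the submission might look like this:

    # Hypothetical example: 5 executors with 4 cores each
    # => 5 * 4 = 20 tasks can run in parallel across the cluster.
    spark-submit \
      --master yarn \
      --num-executors 5 \
      --executor-cores 4 \
      --executor-memory 4g \
      --class com.example.MyApp \
      myapp.jar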
@Kirk Haslbeck Good question, and thanks for the diagrams. Here are some more details to consider.
It is a good point that each JVM-based executor can run multiple "cores" (task threads) in a multi-threaded environment. There are benefits to running multiple cores inside a single executor (a single JVM): you take advantage of the node's multi-core processing power, and you amortize the per-JVM overhead across more tasks. After all, each executor JVM has to start up and initialize certain data structures before it can begin running tasks.
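To make that trade-off concrete (all numbers here are illustrative, and the class/jar names are placeholders), both submissions below give the application 20 task slots, but the first starts 5 JVMs while the second starts 20, paying the startup and memory overhead 20 times:

    # Hypothetical option A: 5 executors x 4 cores = 20 slots, 5 JVMs to start
    spark-submit --master yarn --num-executors 5 --executor-cores 4 \
      --class com.example.MyApp myapp.jar

    # Hypothetical option B: 20 executors x 1 core = 20 slots, 20 JVMs to start
    spark-submit --master yarn --num-executors 20 --executor-cores 1 \
      --class com.example.MyApp myapp.jar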
From the Spark docs, we configure the number of cores using these parameters:
spark.driver.cores = Number of cores to use for the driver process
spark.executor.cores = The number of cores to use on each executor
You also want to watch out for this parameter, which can be used to limit the total cores used by Spark across the cluster (i.e., not per worker); note that per the Spark docs it applies to standalone and Mesos coarse-grained modes, not YARN:
spark.cores.max = the maximum amount of CPU cores to request for the application from across the cluster (not from each machine)
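As a quick sketch of capping total cores (the standalone master URL and jar are hypothetical), Spark will then schedule at most 16 cores for this application across the whole cluster, regardless of how many machines it spans:

    # Hypothetical standalone-mode example: cap the application at 16 cores total.
    spark-submit --master spark://master-host:7077 \
      --conf spark.cores.max=16 \
      --class com.example.MyApp myapp.jar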
Finally, here is a description from Databricks, aligning the terms "cores" and "slots":
"We're using the term "slots" here to indicate threads available to perform parallel work for Spark. Spark documentation often refers to these threads as "cores", which is a confusing term, as the number of slots available on a particular machine does not necessarily have any relationship to the number of physical CPU cores on that machine."
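One hedged way to see the slot count Spark actually reports (the master and sizes below are illustrative) is sc.defaultParallelism, which on YARN and standalone clusters is typically the total executor cores, with a floor of 2:

    # Hypothetical check: print the total task slots ("cores") the context reports.
    # spark-shell predefines sc; piping a statement to it runs it and exits.
    echo 'println(sc.defaultParallelism)' | \
      spark-shell --master yarn --num-executors 5 --executor-cores 4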