Hello all, In Hadoop MapReduce, the number of mappers created depends by default on the number of input splits. For example, if your input file is 192 MB and one block is 64 MB, there will be 3 input splits, and therefore 3 mappers. In the same way, I would like to know: in Spark, if I submit an application to a standalone cluster (a sort of pseudo-distributed setup) to process 750 MB of input data, how many executors will be created?
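The split arithmetic above can be sketched in a few lines of Python. This is a simplified model that only divides the file size by the block size and rounds up; the helper name `num_input_splits` is hypothetical, and real Hadoop input formats apply extra rules (for example, a small tolerance on the last split) that are ignored here.

```python
import math

def num_input_splits(file_size_mb, block_size_mb=64):
    """Approximate number of input splits: file size / block size, rounded up.

    Simplified model (assumption): one split per block, no split-size overrides.
    """
    return math.ceil(file_size_mb / block_size_mb)

print(num_input_splits(192))  # the 192 MB example from the question
print(num_input_splits(750))  # the 750 MB input asked about
```

Under this model the 192 MB file gives 3 splits and the 750 MB file gives 12. In Spark that figure corresponds to input partitions (and hence tasks), which is a separate question from how many executors are started.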
@Saravanan Selvam, In YARN mode you can control the total number of executors for an application with the --num-executors option.
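However, if you do not explicitly specify --num-executors for a Spark application in YARN mode, Spark does not derive the executor count from the input size. With static allocation it falls back to the default value of spark.executor.instances, which is 2.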
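As a minimal sketch of how the executor count is passed on the command line, the Python helper below assembles a spark-submit invocation for YARN. The function name `build_submit_command` and the application path `my_app.py` are hypothetical; `--master yarn` and `--num-executors` are the real spark-submit options.

```python
def build_submit_command(app_path, num_executors=None):
    """Build a spark-submit argument list for YARN mode.

    If num_executors is None, the flag is omitted and Spark uses its
    own default instead of an explicit executor count.
    """
    cmd = ["spark-submit", "--master", "yarn"]
    if num_executors is not None:
        cmd += ["--num-executors", str(num_executors)]
    cmd.append(app_path)  # hypothetical application file
    return cmd

# Explicitly request 4 executors for a placeholder application.
print(" ".join(build_submit_command("my_app.py", num_executors=4)))
```

The resulting list can be handed to `subprocess.run` on a machine where spark-submit is installed; the value 4 is only an illustration.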
Spark also has a feature called dynamic resource allocation. It lets a Spark application scale the set of cluster resources allocated to it up and down based on the workload. This way you can make sure the application does not hold on to resources it is not using.
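A rough sketch of what enabling dynamic allocation might look like is shown below. The configuration keys are standard Spark properties, but the helper `to_conf_args` is hypothetical and the minimum and maximum values are placeholders, not recommendations. It assumes the external shuffle service is available on the cluster.

```python
# Illustrative settings (assumed values); tune min/max for your cluster.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "10",
    # Lets executors be removed without losing their shuffle files.
    "spark.shuffle.service.enabled": "true",
}

def to_conf_args(conf):
    """Turn a dict of Spark properties into spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args

print(" ".join(["spark-submit", "--master", "yarn"] + to_conf_args(dynamic_allocation_conf)))
```

With these properties set, Spark requests more executors when tasks are backlogged and releases idle ones, staying within the configured bounds.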