
How to tune a Spark job (for execution time and cluster utilization)

Contributor

Hi Team,

 

I would appreciate it if someone could guide me on how to set Spark memory for a Spark job, where each job should use only 1%-2% of the cluster's memory. Please share the math/logic for calculating this from the cluster node details below:

 

#1 How many worker nodes does the cluster currently have?
> NodeManagers: 166
> DataNodes: 159

#2 How many cores per node do we currently have?
> 64 cores

#3 How much RAM (GB) per node do we currently have?
> 503 GB

 

 

==== Wanted to calculate for the Spark job ====

#1 driver-memory

#2 executor-memory

#3 driver-cores

#4 executor-cores

#5 num-executors

========================

Please also suggest any additional parameters that help tune a Spark job for execution time and cluster utilization.

1 ACCEPTED SOLUTION

Master Collaborator

Use the following tool to generate the number of executors:

https://rangareddy.github.io/SparkConfigurationGenerator/

 

To calculate driver memory/executor memory, start with 1g, 2g, 4g, 8g, and so on; executor-cores can be set to 3-5, and the number of executors will depend on how much data you are processing.
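
To make the "1%-2% of cluster memory" target from the question concrete, here is a rough sketch of the commonly used sizing math applied to the node specs above. The per-node reservations (1 core and ~1 GB for the OS and Hadoop daemons), the ~10% executor memory overhead, and the 5 cores per executor are rules of thumb, not values taken from this cluster's actual YARN configuration, so treat the printed numbers as a starting point rather than a final configuration:

```python
# Rough sizing sketch for the cluster described above:
# 166 NodeManagers, 64 cores and 503 GB RAM per node.
# Adjust the reservations to match yarn.nodemanager.resource.memory-mb
# and yarn.nodemanager.resource.cpu-vcores on your nodes.

NODES = 166
CORES_PER_NODE = 64
RAM_PER_NODE_GB = 503

EXECUTOR_CORES = 5                       # 3-5 cores per executor, as suggested above
usable_cores = CORES_PER_NODE - 1        # leave 1 core for the OS / NodeManager
executors_per_node = usable_cores // EXECUTOR_CORES           # 12

usable_ram_gb = RAM_PER_NODE_GB - 1      # leave ~1 GB for the OS / daemons
ram_per_executor_slot = usable_ram_gb / executors_per_node    # ~41.8 GB
executor_memory_gb = int(ram_per_executor_slot * 0.9)         # ~10% for memory overhead -> ~37 GB

# Cap the job at ~1% of total cluster memory, as requested in the question.
cluster_ram_gb = NODES * RAM_PER_NODE_GB                      # ~83,498 GB
budget_gb = 0.01 * cluster_ram_gb                             # ~835 GB for a 1% job
num_executors = int(budget_gb // (executor_memory_gb / 0.9))  # count overhead against the budget

print(f"--executor-cores  {EXECUTOR_CORES}")
print(f"--executor-memory {executor_memory_gb}g")
print(f"--num-executors   {num_executors}")
print("--driver-cores    2   # the driver rarely needs more unless you collect large results")
print("--driver-memory   4g  # start small (2g-4g) and grow only if the driver runs out of memory")
```

The resulting values can be passed as spark-submit flags as printed, or set through the equivalent properties (spark.driver.memory, spark.executor.memory, spark.executor.cores, spark.executor.instances) if you configure the job through SparkConf.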

 
