Hi Team,
I would appreciate guidance on how to size the Spark memory settings for a Spark job so that each job uses only about 1%-2% of the cluster's memory. Please share the math/logic for calculating this from the cluster details below (I have also added a rough sketch of my own attempt after the parameter list; please correct it if it is wrong):
#1 How many worker nodes does the cluster have currently?
> NodeManagers: 166
> DataNodes: 159
#2 How many cores per node do we have currently?
> 64 cores
#3 How much RAM per node do we have currently?
> 503 GB
==== Parameters to calculate for the Spark job ====
#1 driver-memory
#2 executor-memory
#3 driver-cores
#4 executor-cores
#5 num-executors
========================
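Below is my own rough attempt at the math, written as a small Python sketch so the logic is easy to check. The reserved cores/RAM per node, the 5 cores per executor, the ~10% memory overhead, and the 1.5% target are all my own assumptions, so please correct any of them if they are wrong:

# Rough sizing sketch (my own attempt, not verified) -- assumed values are marked.
import math

# --- cluster facts from above ---
node_managers   = 166
cores_per_node  = 64
ram_per_node_gb = 503

# --- assumptions (please correct if wrong) ---
reserved_cores_per_node = 1      # assumed: leave 1 core for OS / Hadoop daemons
reserved_ram_per_node   = 8      # assumed: leave 8 GB for OS / Hadoop daemons
executor_cores          = 5      # assumed rule of thumb for HDFS client throughput
overhead_fraction       = 0.10   # assumed executor memory overhead, ~10% of heap
target_cluster_fraction = 0.015  # target: ~1.5% of total cluster memory per job

usable_cores_per_node = cores_per_node - reserved_cores_per_node       # 63
usable_ram_per_node   = ram_per_node_gb - reserved_ram_per_node        # 495 GB

executors_per_node = usable_cores_per_node // executor_cores           # 12
ram_per_executor   = usable_ram_per_node / executors_per_node          # ~41 GB per container
executor_memory_gb = math.floor(ram_per_executor / (1 + overhead_fraction))  # ~37 GB heap

total_cluster_ram = node_managers * ram_per_node_gb                    # ~83,498 GB
target_job_ram    = total_cluster_ram * target_cluster_fraction        # ~1,252 GB
num_executors     = math.floor(target_job_ram / ram_per_executor)      # ~30

print(f"--driver-memory   {executor_memory_gb}g  # assumed: same size as one executor")
print(f"--driver-cores    {executor_cores}")
print(f"--executor-memory {executor_memory_gb}g")
print(f"--executor-cores  {executor_cores}")
print(f"--num-executors   {num_executors}")

With these assumptions it comes out to roughly 30 executors of ~37 GB heap / 5 cores each, which should be around 1.5% of the cluster's total memory, but I am not sure this is the right way to think about it.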
Please also suggest any additional parameters that would help tune the Spark job in terms of execution time and cluster utilization.