
how to configure spark submit configurations based on File size


Contributor

Hi All,

We know there are formulas available to determine a Spark job's executor memory, number of executors, and executor cores based on the cluster's available resources (a rough sketch of the kind of calculation I mean follows the cases below). Is there any formula to calculate the same based on data size as well?

case 1: what is the configuration if data size < 5 GB?

case 2: what is the configuration if 5 GB < data size < 10 GB?

case 3: what is the configuration if 10 GB < data size < 15 GB?

case 4: what is the configuration if 15 GB < data size < 25 GB?

case 5: what is the configuration if data size > 25 GB?
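
For reference, this is the kind of cluster-resource calculation I mean. The node count, cores, and memory below are just example numbers, not our actual cluster:

```python
# Rule-of-thumb executor sizing from cluster resources.
# Hypothetical cluster: 6 nodes, 16 cores and 64 GB RAM per node -- adjust to your own.

nodes = 6
cores_per_node = 16
mem_per_node_gb = 64

# Leave 1 core and 1 GB per node for the OS / Hadoop daemons.
usable_cores_per_node = cores_per_node - 1            # 15
usable_mem_per_node_gb = mem_per_node_gb - 1           # 63

executor_cores = 5                                     # commonly cited sweet spot for HDFS throughput
executors_per_node = usable_cores_per_node // executor_cores    # 3

# Take ~7% off for spark.executor.memoryOverhead.
raw_mem_per_executor = usable_mem_per_node_gb / executors_per_node   # 21 GB
executor_memory_gb = int(raw_mem_per_executor * 0.93)                # ~19 GB

# Reserve one executor slot for the YARN ApplicationMaster.
num_executors = nodes * executors_per_node - 1         # 17

print(num_executors, executor_cores, executor_memory_gb)
```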

Cheers,

MJ

1 REPLY

Re: how to configure spark submit configurations based on File size

Super Mentor

@Manikandan Jeyabal

It is hard to provide exact values based on the data size alone.

However, you can refer to the following articles to understand executor memory, core, and resource optimization:

https://community.hortonworks.com/articles/42803/spark-on-yarn-executor-resource-allocation-optimiz....

https://dzone.com/articles/apache-spark-on-yarn-resource-planning
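
That said, as a very rough starting point rather than an exact formula, you can back into an executor count from the number of input partitions. The sketch below assumes ~128 MB partitions, roughly 4 tasks per executor core, and 5 cores per executor; all of those are assumptions to tune for your workload, and executor memory still has to come from the cluster-resource calculation described in the articles above.

```python
# Rough sketch: suggest a starting executor count from input data size.
# Assumptions (tune these): ~128 MB per partition, ~4 tasks per executor core,
# 5 cores per executor.

def suggest_executors(data_size_gb, executor_cores=5, partition_mb=128, tasks_per_core=4):
    partitions = max(1, int(data_size_gb * 1024 / partition_mb))
    # Enough cores to work through the partitions in a few waves of tasks.
    cores_needed = max(executor_cores, partitions // tasks_per_core)
    return max(1, cores_needed // executor_cores)

for size_gb in (5, 10, 15, 25):
    print(size_gb, "GB ->", suggest_executors(size_gb), "executors")
```

Whatever number this suggests, cap it at what your YARN queue can actually grant; the resource-allocation side is what the linked articles walk through.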