How to decide the number of executors and memory when submitting a Spark job to YARN


I have a Spark job, and when I submit it I request X executors and Y memory. However, someone else is using the same cluster and also wants to run several jobs at the same time, each with X executors and Y memory, and neither of us knows about the other. In this case, how should the number of executors and the memory for our Spark job be calculated?


Expert Contributor

Hi @HDave,

You should use the YARN queue manager to make sure that every user or group of users has a guaranteed minimum of resources when they submit a job.
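As a rough illustration of what "a minimum of resources per group" means in practice, here is a minimal Python sketch that generates Capacity Scheduler properties for two queues. The queue names (queue_a, queue_b) and the 50/50 split are assumptions made up for the example, not values from this thread; the property names follow the standard yarn.scheduler.capacity.* pattern. Treat it as a starting point to adapt, not a recommended configuration.

```python
# Hypothetical queue names and percentage shares; adjust to your cluster.
queue_shares = {"queue_a": 50, "queue_b": 50}


def capacity_scheduler_properties(shares):
    """Build Capacity Scheduler properties that give each queue a guaranteed share."""
    # The capacities of the queues directly under root must add up to 100.
    if sum(shares.values()) != 100:
        raise ValueError("queue capacities under root must add up to 100")

    props = {"yarn.scheduler.capacity.root.queues": ",".join(shares)}
    for name, percent in shares.items():
        prefix = f"yarn.scheduler.capacity.root.{name}"
        # Guaranteed minimum share of the cluster for this queue.
        props[f"{prefix}.capacity"] = str(percent)
        # Assumption: let a queue borrow idle capacity up to the whole cluster.
        props[f"{prefix}.maximum-capacity"] = "100"
    return props


if __name__ == "__main__":
    for key, value in capacity_scheduler_properties(queue_shares).items():
        print(f"{key}={value}")
```

The printed key=value pairs would go into capacity-scheduler.xml (or the equivalent screen in your cluster manager). With two queues defined this way, each group keeps its guaranteed share even when the other group is submitting jobs.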


Expert Contributor

When submitting a Spark job to YARN, start with less memory and more executors, which gives more parallelism than more memory and fewer executors. The right values depend on the size of the data to be processed and the operations to be performed on it.

As you mentioned, if user A submits a Spark job with X memory and Y executors and user B wants to do the same, both users will get their initial resources if they have separate queues, Queue A and Queue B.

But if there is only one queue and both users submit jobs, scheduling becomes FIFO and the other user's job will be left waiting in the queue. For how this works, see the Capacity Scheduler documentation.
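To make the "more, smaller executors" advice concrete, here is a minimal Python sketch of one way to size a job against the share of the cluster your queue guarantees, and then build the matching spark-submit command. Everything numeric in it is an assumption for illustration: the cluster shape, the 50% queue share, 2 cores per executor, 1 core and 1 GB per node left for the OS and daemons, one container kept for the ApplicationMaster, and roughly 10% of container memory set aside for Spark's memory overhead. The queue name queue_a and the file my_job.py are hypothetical. The flags themselves (--queue, --num-executors, --executor-cores, --executor-memory) are standard spark-submit options on YARN.

```python
import shlex


def size_executors(nodes, cores_per_node, mem_gb_per_node, queue_fraction,
                   cores_per_executor=2, overhead_fraction=0.10):
    """Rough heuristic: returns (num_executors, executor_cores, executor_memory_gb)."""
    # Assumption: leave 1 core and 1 GB per node for the OS and Hadoop daemons,
    # then take only the fraction of the cluster that our queue guarantees.
    usable_cores = (cores_per_node - 1) * nodes * queue_fraction
    usable_mem_gb = (mem_gb_per_node - 1) * nodes * queue_fraction

    containers = int(usable_cores // cores_per_executor)
    if containers < 2:
        raise ValueError("not enough cores for one executor plus the ApplicationMaster")

    # Keep one container's worth of resources for the ApplicationMaster.
    num_executors = containers - 1

    # Each YARN container holds executor memory plus overhead, so shrink the
    # heap request to leave room for the overhead.
    container_mem_gb = usable_mem_gb / containers
    executor_memory_gb = int(container_mem_gb / (1 + overhead_fraction))
    if executor_memory_gb < 1:
        raise ValueError("not enough memory per executor")

    return num_executors, cores_per_executor, executor_memory_gb


def spark_submit_command(queue, num_executors, executor_cores,
                         executor_memory_gb, app):
    """Build a spark-submit command line that targets a specific YARN queue."""
    args = [
        "spark-submit",
        "--master", "yarn",
        "--queue", queue,
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", f"{executor_memory_gb}g",
        app,
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes, 16 cores and 64 GB each, queue share 50%.
    n, c, m = size_executors(nodes=4, cores_per_node=16,
                             mem_gb_per_node=64, queue_fraction=0.5)
    print(spark_submit_command("queue_a", n, c, m, "my_job.py"))
```

Sizing against the queue's guaranteed share rather than the whole cluster means the job should still get its executors when another user submits to a different queue at the same time. The resulting numbers are only a first guess; tune them after looking at how the job actually behaves on your data.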

My case is the same as your question:

But how do I run the Spark Thrift server in a different queue? Right now I'm starting the Spark Thrift server from the Ambari UI.
