Support Questions
Find answers, ask questions, and share your expertise

How to decide the number of executors and memory when submitting a Spark job to YARN


Explorer

I have a Spark job, and when submitting it I specify X executors and Y memory. However, someone else is also using the same cluster and wants to run several jobs at the same time, also with X executors and Y memory, and neither of us knows about the other. In this case, how should the number of executors and the memory for our Spark job be calculated?
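As a starting point before worrying about other users, executor counts are usually derived from the cluster's resources. The sketch below applies a common rule of thumb (about 5 cores per executor, one core and some memory per node reserved for the OS, one executor's worth reserved for the Application Master, ~10% memory headroom for YARN overhead). The cluster dimensions are illustrative assumptions, not numbers from this thread:

```shell
# Hypothetical cluster: 4 worker nodes, 16 cores and 64 GB RAM each.
NODES=4
CORES_PER_NODE=16
MEM_PER_NODE_GB=64

# Leave 1 core per node for the OS / Hadoop daemons.
USABLE_CORES=$(( (CORES_PER_NODE - 1) * NODES ))

# Rule of thumb: ~5 cores per executor for good HDFS throughput.
CORES_PER_EXECUTOR=5
TOTAL_EXECUTORS=$(( USABLE_CORES / CORES_PER_EXECUTOR ))

# Reserve one executor slot for the YARN Application Master.
NUM_EXECUTORS=$(( TOTAL_EXECUTORS - 1 ))
EXECUTORS_PER_NODE=$(( TOTAL_EXECUTORS / NODES ))

# Split usable memory per node among its executors, keeping ~10% headroom
# for YARN memory overhead (spark.yarn.executor.memoryOverhead).
MEM_PER_EXECUTOR_GB=$(( (MEM_PER_NODE_GB - 1) / EXECUTORS_PER_NODE * 90 / 100 ))

echo "num-executors=$NUM_EXECUTORS executor-cores=$CORES_PER_EXECUTOR executor-memory=${MEM_PER_EXECUTOR_GB}G"

# The resulting submission would then look roughly like:
# spark-submit --master yarn --num-executors $NUM_EXECUTORS \
#   --executor-cores $CORES_PER_EXECUTOR --executor-memory ${MEM_PER_EXECUTOR_GB}G app.py
```

For the 4-node example this yields 11 executors with 5 cores and 18 GB each; the point is the method, not the exact numbers.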

3 REPLIES

Re: How to decide the number of executors and memory when submitting a Spark job to YARN

Expert Contributor

Hi @HDave,

You should use the YARN Queue Manager to make sure that every user or group of users has a guaranteed minimum of resources when they submit a job.

https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.3.0/bk_ambari-views/content/using_the_capacity_...
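As a sketch of what such a setup looks like in the Capacity Scheduler's configuration (capacity-scheduler.xml), two queues could each be guaranteed a share of the cluster. The queue names and percentages here are illustrative assumptions, not values from this thread:

```xml
<!-- Sketch: two child queues under root, each guaranteed 50% capacity.
     Queue names and capacities are illustrative assumptions. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>queueA,queueB</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.queueA.capacity</name>
  <value>50</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.queueB.capacity</name>
  <value>50</value>
</property>
```

With elasticity (maximum-capacity above the guaranteed share), either queue can borrow idle capacity from the other when the cluster is not busy.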

Michel

Re: How to decide the number of executors and memory when submitting a Spark job to YARN

Expert Contributor
@HDave

When submitting a Spark job to YARN, start with less memory and more executors, which gives more parallelism than more memory and fewer executors. Beyond that, it all depends on the size of the data you are going to process and the operations you will run on it. As you mentioned, if user A submits a Spark job with X memory and Y executors, and user B wants to do the same, both users will get their initial resources as long as they submit to separate queues (Queue A and Queue B).

But if there is only one queue and both users submit jobs to it, scheduling becomes FIFO and the second user will be WAITING in the queue. You can read about how the Capacity Scheduler works here: Capacity Scheduler
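The two-queue scenario above can be sketched as two submissions that each name their own queue via `--queue`. The queue names, resource sizes, and script names below are assumptions for illustration; the queues must already exist in the scheduler configuration. The function only prints the command rather than running it, since this is just a sketch of the flags involved:

```shell
# Sketch: each user targets their own Capacity Scheduler queue, so
# neither is starved. Queue names "queueA"/"queueB" are assumptions.
submit_to_queue() {
  # Print the spark-submit command instead of executing it; this
  # illustrates the flags, it does not launch a real job.
  echo "spark-submit --master yarn --queue $1 --num-executors $2 --executor-memory $3 $4"
}

submit_to_queue queueA 10 4G user_a_job.py
submit_to_queue queueB 10 4G user_b_job.py
```

If both users instead omit `--queue`, both jobs land in the default queue and contend under its scheduling policy.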


Re: How to decide the number of executors and memory when submitting a Spark job to YARN

I see your question the same as my case: https://community.hortonworks.com/questions/118802/run-spark-thrift-servers-with-different-yarn-queu...

But how can I run the Spark Thrift Server with a different queue? Currently I am starting the Spark Thrift Server from the Ambari UI.
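One hedged possibility, not confirmed in this thread: an Ambari-started Thrift Server picks up its YARN queue from the standard `spark.yarn.queue` Spark property, so setting it in the Thrift Server's Spark configuration section in Ambari (the section name varies by HDP version) and restarting the service should move it to another queue. The queue name below is an assumption:

```
# Assumed: set in Ambari under Spark > Configs, in the Thrift Server's
# spark-conf section, then restart the Spark Thrift Server.
spark.yarn.queue=thriftQueue
```

The named queue must already be defined in capacity-scheduler.xml, otherwise the application will be rejected at submission time.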
