Spark on YARN vs. standalone?

Explorer

I'm new to Spark and currently struggling to run my first spark-submit job. I initially configured Spark on YARN on a 4-node Cloudera cluster, but when I do so, I cannot see any master or worker roles. Furthermore, I cannot see any Spark instances except the History Server in the Cloudera Manager UI. I'm also not sure what to pass for the --master argument to spark-submit. If I use yarn-cluster, nothing happens and the job dies automatically, but if I use spark://master-node:7077, I get an error about not being able to find workers. I have to run start-all.sh manually, and even then it cannot find the workers.
What am I actually doing wrong here? Is it a Spark configuration issue with Spark on YARN, or something else?

1 ACCEPTED SOLUTION

Master Collaborator

Yes, when you run on YARN, you see the driver and executors as YARN containers. It is no longer a stand-alone service. You need to use master "yarn-client" or "yarn-cluster". yarn-client may be simpler to start. Have a look at http://spark.apache.org/docs/latest/cluster-overview.html
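
As a rough sketch of what that looks like on the command line: the main class and jar path below are placeholders, not something from this thread, so substitute your own. The small helper only prints the command so you can check it before running it. Note that on Spark 2.0 and later the "yarn-client" and "yarn-cluster" master values are deprecated in favor of `--master yarn --deploy-mode client` (or `cluster`).

```shell
#!/bin/sh
# Sketch only: APP_CLASS and APP_JAR are hypothetical placeholders.
APP_CLASS="com.example.MyApp"      # hypothetical main class of your application
APP_JAR="/path/to/my-app.jar"      # hypothetical path to your application jar

# Print the spark-submit command for a given YARN master value
# ("yarn-client" or "yarn-cluster", as suggested in the answer above).
build_submit_cmd() {
  echo "spark-submit --master $1 --class $APP_CLASS $APP_JAR"
}

# Show the command for yarn-client mode; copy and run it once it looks right.
build_submit_cmd yarn-client
```

Once the job is submitted, it should appear as a YARN application (for example in the ResourceManager UI or via `yarn application -list`) rather than as Spark master and worker roles.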

2 REPLIES

Contributor
Is it OK to see no Spark worker and master roles in CM?