
Spark on Yarn Vs Stand alone?



Explorer

I'm a newbie to Spark, and currently struggling to run my first spark-submit job. I initially configured Spark on YARN on a 4-node Cloudera cluster, but when I do so, I cannot see Master and Worker roles. Furthermore, I cannot see any Spark instances except the History Server in the Cloudera Manager UI, and I'm not sure what to pass for the --master argument when doing spark-submit. If I use yarn-cluster, nothing happens and the job dies automatically, but if I use spark://master-node:7077 there is an error about not being able to find workers. I have to run start-all.sh manually, and even when I do, it cannot find workers.
What am I actually doing wrong here? Is it a Spark configuration issue with Spark on YARN, or something else?

1 ACCEPTED SOLUTION


Re: Spark on Yarn Vs Stand alone?

Master Collaborator

Yes, when you run on YARN, you see the driver and executors as YARN containers; Spark is no longer a standalone service, so there are no Master or Worker roles. You need to use master "yarn-client" or "yarn-cluster"; yarn-client may be simpler to start with. Have a look at http://spark.apache.org/docs/latest/cluster-overview.html
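To make this concrete, a minimal spark-submit invocation in yarn-client mode might look like the sketch below. The SparkPi example class ships with Spark, but the jar path is an assumption (it varies by distribution and install location), and the executor settings are only illustrative:

```shell
# Submit the bundled SparkPi example in yarn-client mode.
# The driver runs on the local machine, which makes logs easier to follow;
# no Spark Master/Worker daemons are needed -- YARN allocates the containers.
# NOTE: adjust the jar path for your install; this one is a typical CDH layout.
spark-submit \
  --master yarn-client \
  --num-executors 2 \
  --executor-memory 1g \
  --class org.apache.spark.examples.SparkPi \
  /opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar 10
```

With `--master yarn-cluster` the driver itself also runs inside a YARN container, so its output goes to the YARN application logs (viewable with `yarn logs -applicationId <appId>`) rather than your terminal.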



Re: Spark on Yarn Vs Stand alone?

Contributor
Is it OK to see no Spark Worker and Master role in CM?