umb
Explorer
Posts: 8
Registered: 11-07-2014
Accepted Solution

Spark on YARN vs. standalone?

I'm a Spark newbie, currently struggling to run my first spark-submit job. I initially configured Spark on YARN on a 4-node Cloudera cluster, but with that setup I cannot see Master and Worker roles, and the only Spark instance visible in the Cloudera Manager UI is the History Server. I'm also not sure what to pass for the --master argument when running spark-submit. If I use yarn-cluster, nothing happens and the job dies on its own; if I use spark://master-node:7077, I get an error about not being able to find workers. I have to run start-all.sh manually, and even then it cannot find workers.
What am I actually doing wrong here? Is it a configuration issue with Spark on YARN, or something else?

Cloudera Employee
Posts: 366
Registered: 07-29-2013

Re: Spark on YARN vs. standalone?

Yes, when you run on YARN, the driver and executors appear as YARN containers; Spark is no longer running as a standalone service, so there are no Master or Worker roles to see. You need to use master "yarn-client" or "yarn-cluster"; yarn-client may be the simpler one to start with. Have a look at http://spark.apache.org/docs/latest/cluster-overview.html
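
As a minimal sketch (the class name, jar path, and arguments below are placeholders, not taken from this thread), a yarn-client submission looks roughly like this:

    spark-submit \
      --master yarn-client \
      --class com.example.MyApp \
      --num-executors 4 \
      /path/to/my-app.jar arg1 arg2

In yarn-client mode the driver runs on the machine you submit from, so its output shows up in your console, which makes a first job easier to debug. With yarn-cluster the driver itself runs inside a YARN container, and you have to pull its output from the YARN application logs (e.g. yarn logs -applicationId <app id>).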

Contributor
Posts: 38
Registered: 01-05-2015

Re: Spark on YARN vs. standalone?

Is it OK to see no Spark Worker and Master role in CM?