12-07-2014 10:55 PM
I'm new to Spark and currently struggling to run my first spark-submit job. I initially configured Spark on YARN on a 4-node Cloudera cluster, but when I do so, I cannot see the master and worker roles. Furthermore, I cannot see any Spark instances except the History Server in the Cloudera Manager UI. I'm also not sure what to pass for the --master argument to spark-submit. If I use yarn-cluster, nothing happens and the job dies automatically, but if I use spark://master-node:7077, I get an error about not being able to find workers. I have to run start-all.sh manually, and even then it cannot find workers.
What am I actually doing wrong here? Is it a Spark configuration issue with Spark on YARN, or something else?
12-08-2014 04:44 AM
Yes, when you run on YARN, the driver and executors run as YARN containers, so there are no separate master and worker roles to see. Spark is no longer a stand-alone service, and a spark://host:7077 URL only applies to a stand-alone master. You need to use master "yarn-client" or "yarn-cluster". yarn-client may be simpler to start with. Have a look at http://spark.apache.org/docs/latest/cluster-overview.html
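To make the choice of --master value concrete, here is a minimal sketch in Python that assembles a spark-submit command line for a YARN deployment. The helper name build_spark_submit and the application file my_job.py are made up for illustration; only the --master and --class flags and the yarn-client / yarn-cluster values come from spark-submit itself, as used in the Spark 1.x releases current at the time of this thread.

```python
import shlex

# The two master values that apply when Spark runs on YARN
# (Spark 1.x naming, as discussed in this thread).
VALID_YARN_MASTERS = ("yarn-client", "yarn-cluster")


def build_spark_submit(app_path, master="yarn-client", main_class=None, app_args=()):
    """Return the spark-submit argument list for a YARN deployment.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if master not in VALID_YARN_MASTERS:
        # A spark://host:7077 URL targets a stand-alone master, not YARN.
        raise ValueError("on YARN, master must be one of %s" % (VALID_YARN_MASTERS,))
    cmd = ["spark-submit", "--master", master]
    if main_class:
        # --class is only needed for JVM applications packaged as a jar.
        cmd += ["--class", main_class]
    cmd.append(app_path)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    # my_job.py is a placeholder application name.
    cmd = build_spark_submit("my_job.py")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the printed command on a gateway node whose Hadoop/YARN client configuration is in place should submit the job to YARN without needing start-all.sh, since no stand-alone master or workers are involved.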