Help Spark Master for Sandbox VM?

Contributor

I have downloaded the Sandbox VM and it is working fine. Now I need to create a SparkContext manually, as in the code below, but how do I find out which Spark master name to use here instead of "local"?

val conf = new SparkConf().setMaster("local").setAppName("My App")

1 ACCEPTED SOLUTION

Master Guru

Hi @Rajendra Vechalapu, you can omit setting the master in your source; see this example:

val conf = new SparkConf().setAppName("Spark Pi")
val spark = new SparkContext(conf)

You can then launch your application using spark-submit and provide the master there, using the "--master" and "--deploy-mode" options. Refer to the Spark programming guide for this and other useful hints.
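To make the idea concrete, here is a minimal sketch of a complete application that leaves the master out of the source. The object name SparkPi, the sample count, and the sampling logic are illustrative assumptions modelled on the usual Pi example, not code from the original post:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random

// Hypothetical object name, chosen to match the app name "Spark Pi".
object SparkPi {
  def main(args: Array[String]): Unit = {
    // No setMaster here: spark-submit passes the master via --master.
    val conf = new SparkConf().setAppName("Spark Pi")
    val sc = new SparkContext(conf)
    try {
      // Number of partitions; an optional first argument overrides the default.
      val slices = if (args.nonEmpty) args(0).toInt else 2
      val n = 100000 * slices // illustrative sample count
      // Count random points of the square [-1, 1] x [-1, 1] that fall inside the unit circle.
      val inside = sc.parallelize(1 to n, slices).map { _ =>
        val x = Random.nextDouble() * 2 - 1
        val y = Random.nextDouble() * 2 - 1
        if (x * x + y * y <= 1) 1 else 0
      }.reduce(_ + _)
      println(s"Pi is roughly ${4.0 * inside / n}")
    } finally {
      sc.stop() // release cluster resources even if the job fails
    }
  }
}
```

Packaged as a jar, a class like this would be started with spark-submit, using "--class SparkPi" plus the "--master" option, so the same build can run locally or on YARN without a code change.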

Edit: When you run spark-submit on the Sandbox, be sure to supply additional arguments for master, num-executors, driver-memory, executor-memory, and executor-cores as given below. Note that larger values for the last 4 arguments will not work on the Sandbox! You can try it with this example, which computes Pi in Python (run it as any user who has access to HDFS/YARN):

cd /usr/hdp/current/spark-client/ 
spark-submit --master yarn-client --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/src/main/python/pi.py 10

"--master yarn-cluster" works too. You can also set these 4 resource options in spark-env in Ambari. The entries are already there but commented out, and not all of them have the values shown here. See also the Spark guide on HDP.


3 REPLIES

Contributor

Thanks, it's working.

Explorer

If I want to run it from outside the VM, for example from Eclipse, what are the master IP and port? Thanks.