Created 04-09-2016 05:06 PM
I have downloaded the Sandbox VM and it is working fine. Now I need to create a SparkContext manually, as in the code below, but how do I know the Spark master name to use here instead of "local"?
val conf = new SparkConf().setMaster("local").setAppName("My App")
Created 04-10-2016 01:36 AM
Hi @Rajendra Vechalapu, you can omit setting master in your source, see this example:
val conf = new SparkConf().setAppName("Spark Pi")
val spark = new SparkContext(conf)
You can then launch your application using spark-submit and provide the master there using the "--master" and "--deploy-mode" options. Refer to the Spark programming guide for this and other useful hints.
Edit: When you run spark-submit on the Sandbox, be sure to supply additional arguments for master, num-executors, driver-memory, executor-memory, and executor-cores as given below. Note that larger values for the last four arguments will not work on the Sandbox! Follow (and you can also try) this example computing Pi in Python (as any user who has access to HDFS/YARN):
cd /usr/hdp/current/spark-client/
spark-submit --master yarn-client --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/src/main/python/pi.py 10
"--master yarn-cluster" works too. You can also set these four options in spark-env in Ambari. They are already there but commented out, and not all with the values used here. See also the Spark guide on HDP.
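If you prefer spark-env over command-line arguments, a sketch of the corresponding entries to uncomment and adjust (the values here just mirror the spark-submit example above; variable names follow the standard spark-env.sh template for YARN mode):

```shell
# Options read by Spark in YARN client mode (edit via Ambari > Spark > spark-env)
export SPARK_EXECUTOR_INSTANCES=1    # matches --num-executors
export SPARK_EXECUTOR_CORES=1        # matches --executor-cores
export SPARK_EXECUTOR_MEMORY=512m    # matches --executor-memory
export SPARK_DRIVER_MEMORY=512m      # matches --driver-memory
```

With these set, a plain "spark-submit --master yarn-client examples/src/main/python/pi.py 10" picks up the same sizing without repeating the arguments each time.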
Created 04-10-2016 03:26 PM
Thanks, it's working.
Created 04-24-2016 11:55 AM
If I want to run from outside the VM, for example from Eclipse, what are the master IP and port? Thanks.