I am new to Spark. I am trying to submit a Spark application from a Java program, and I am able to submit one to a Spark standalone cluster. What I actually want to achieve is submitting the job to a YARN cluster. I am able to connect to the YARN cluster by explicitly adding the Resource Manager property to the Spark config, as below.
I am able to submit the job to YARN through the spark-submit command, but what I am actually looking for is a way to do the same thing through the program. It would be great if you could give a template for this, preferably in Java.
If your code depends on other projects, you will need to package them alongside your application in order to distribute the code to a Spark cluster. To do this, create an assembly jar (or “uber” jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating assembly jars, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime. Once you have an assembled jar you can call the bin/spark-submit script as shown here while passing your jar.
For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files we recommend packaging them into a .zip or .egg.
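For the JVM case described above, here is a minimal sketch of how a Java program could assemble the spark-submit invocation for an assembled jar. It only builds the argument list; the class name SubmitCommand, the method build, and the paths and main class shown are hypothetical placeholders, and the --master yarn / --deploy-mode cluster form assumes Spark 2.x or later.

```java
import java.util.ArrayList;
import java.util.List;

public class SubmitCommand {
    // Builds the argument list for a spark-submit call on YARN.
    // sparkHome, mainClass and assemblyJar are caller-supplied placeholders.
    static List<String> build(String sparkHome, String mainClass, String assemblyJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sparkHome + "/bin/spark-submit");
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add("--master");
        cmd.add("yarn");
        cmd.add("--deploy-mode");
        cmd.add("cluster");
        cmd.add(assemblyJar); // the assembly ("uber") jar goes last
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        List<String> cmd = build("/opt/spark", "com.example.MyApp", "/path/to/my-app-assembly.jar");
        System.out.println(String.join(" ", cmd));
        // To actually launch it, one option is:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Spark also ships a launcher library (org.apache.spark.launcher.SparkLauncher) that wraps this kind of invocation; the sketch above avoids it only to stay free of extra dependencies.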
Thanks again! And sorry if I am asking too many questions here.
What I am actually looking for is this: as per the project requirements, I should not use the spark-submit script, so I am passing the cluster configuration through the Spark config as given below.
SparkConf sparkConfig = new SparkConf().setAppName("Example App of Spark on Yarn");
It is able to identify the Resource Manager, but it fails because it does not identify the file system,
even though I am setting the HDFS file system configuration as well:
sparkConfig.set("ha.zookeeper.quorum", "xxx:2181,xxxx:2181,xxxx:2181");
It assumes the local file system instead. The error I am getting in the Resource Manager is:
exited with exitCode: -1000 due to: File file:/tmp/spark-0e6626c2-d344-4cae-897f-934e3eb01d8f/__spark_libs__1448521825653017037.zip does not exist
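One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis: Spark copies a property into the Hadoop Configuration only when its key starts with spark.hadoop., so a Hadoop key set directly on SparkConf may never reach the file system layer. The sketch below shows the prefixing step using only the JDK. The names HadoopConfKeys and withSparkHadoopPrefix and the fs.defaultFS value are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HadoopConfKeys {
    // Spark strips this prefix and passes the rest of the key to Hadoop.
    static final String PREFIX = "spark.hadoop.";

    // Returns a copy of the given Hadoop properties with every key prefixed,
    // leaving keys that already carry the prefix unchanged.
    static Map<String, String> withSparkHadoopPrefix(Map<String, String> hadoopProps) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : hadoopProps.entrySet()) {
            String key = e.getKey();
            out.put(key.startsWith(PREFIX) ? key : PREFIX + key, e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> hadoopProps = new LinkedHashMap<>();
        // Hypothetical NameNode address; replace with the cluster's real value.
        hadoopProps.put("fs.defaultFS", "hdfs://namenode-host:8020");
        hadoopProps.put("ha.zookeeper.quorum", "xxx:2181,xxxx:2181,xxxx:2181");

        // With Spark on the classpath, each pair would go to sparkConfig.set(k, v).
        withSparkHadoopPrefix(hadoopProps)
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

If the local file: path in the error comes from a missing default file system, a setting such as spark.hadoop.fs.defaultFS would be the first one to try; placing the cluster's core-site.xml and yarn-site.xml on the application classpath is another route to the same result.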