<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Post Job to Spark via YARN from VM on a virtual network in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161485#M123864</link>
    <description>&lt;P&gt;You can choose either way by setting the --master and --deploy-mode arguments appropriately. With --master=yarn, the Spark executors run on the cluster; with --master=local[*], the executors run on the local machine. The Spark driver's location is then determined by the deploy mode: --deploy-mode=cluster runs the driver on the cluster, while --deploy-mode=client runs it on the client (the VM where the job is launched). &lt;/P&gt;&lt;P&gt;More info here: &lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/submitting-applications.html"&gt;http://spark.apache.org/docs/latest/submitting-applications.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 29 Jun 2016 04:46:09 GMT</pubDate>
    <dc:creator>phargis</dc:creator>
    <dc:date>2016-06-29T04:46:09Z</dc:date>
    <item>
      <title>Post Job to Spark via YARN from VM on a virtual network</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161482#M123861</link>
      <description>&lt;P&gt;I have a functioning Spark cluster that can accept jobs via YARN (i.e. I can start a job by running spark-submit and specifying the master as yarn-client).&lt;/P&gt;&lt;P&gt;Now, I have a virtual machine set up on a virtual network with this Spark cluster. On the VM, I am working with Titan DB, whose configuration lets us set spark.master. If I set it to local[*], everything runs well. However, if I set spark.master to yarn-client, I get the following error:&lt;/P&gt;&lt;PRE&gt;java.lang.IllegalStateException: java.lang.ExceptionInInitializerError
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:82)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:140)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:117)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:205)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:292)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1016)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:441)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:185)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:119)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:94)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:130)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:150)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:123)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:58)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:130)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:150)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:82)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.apache.tinkerpop.gremlin.console.Console.&amp;lt;init&amp;gt;(Console.groovy:144)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:215)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:303)
Caused by: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:80)
        ... 44 more
Caused by: java.lang.ExceptionInInitializerError
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1873)
        at org.apache.spark.storage.BlockManager.&amp;lt;init&amp;gt;(BlockManager.scala:105)
        at org.apache.spark.storage.BlockManager.&amp;lt;init&amp;gt;(BlockManager.scala:180)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:308)
        at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
        at org.apache.spark.SparkContext.&amp;lt;init&amp;gt;(SparkContext.scala:240)
        at org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:61)
        at org.apache.tinkerpop.gremlin.hadoop.process.computer.spark.SparkGraphComputer.lambda$submit$31(SparkGraphComputer.java:111)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
        at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: org.apache.spark.SparkException: Unable to load YARN support
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:211)
        at org.apache.spark.deploy.SparkHadoopUtil$.&amp;lt;init&amp;gt;(SparkHadoopUtil.scala:206)
        at org.apache.spark.deploy.SparkHadoopUtil$.&amp;lt;clinit&amp;gt;(SparkHadoopUtil.scala)
        ... 14 more
Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:207)
        ... 16 more
&lt;/PRE&gt;&lt;P&gt;Clearly, there is some configuration to do (I believe on the VM end). I suspect we will have to install a YARN client on the VM and have it communicate with the YARN ResourceManager on the Spark cluster, but I am not sure. Can you point me in the right direction or tell me how to configure the VM and/or Spark so I can successfully submit this job? I can provide any additional information that might be helpful.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jun 2016 01:34:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161482#M123861</guid>
      <dc:creator>zachkirsch</dc:creator>
      <dc:date>2016-06-29T01:34:40Z</dc:date>
    </item>
    <item>
      <title>Re: Post Job to Spark via YARN from VM on a virtual network</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161483#M123862</link>
      <description>&lt;P&gt;
	You probably need to install the spark-client on your VM, which will include all the proper JAR files and binaries needed to connect to YARN. There is also a chance that the version of Spark bundled with Titan DB was built specifically without YARN dependencies (to avoid duplicates). You can always rebuild your local Spark installation with YARN support, using the instructions here:&lt;/P&gt;&lt;P&gt;
	&lt;A href="http://spark.apache.org/docs/latest/building-spark.html"&gt;http://spark.apache.org/docs/latest/building-spark.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;
	For instance, here is a sample build command using Maven:&lt;/P&gt;
&lt;PRE&gt;build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
&lt;/PRE&gt;</description>
      <pubDate>Wed, 29 Jun 2016 02:03:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161483#M123862</guid>
      <dc:creator>phargis</dc:creator>
      <dc:date>2016-06-29T02:03:25Z</dc:date>
    </item>
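Besides having YARN-enabled Spark binaries on the VM, the client also needs to know where the cluster's ResourceManager lives; Spark reads this from the Hadoop configuration directory. A minimal sketch (the path below is hypothetical; point it at a directory holding core-site.xml and yarn-site.xml copied from the cluster, or wherever your distribution places them):

```shell
# Hypothetical path: a directory containing the cluster's
# core-site.xml and yarn-site.xml, copied to the VM.
export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR=/etc/hadoop/conf
```

With these set in the environment that launches the job, the Spark client on the VM can resolve the ResourceManager address when spark.master points at YARN.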
    <item>
      <title>Re: Post Job to Spark via YARN from VM on a virtual network</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161484#M123863</link>
      <description>&lt;P&gt;Thanks for the reply. One issue I'm having is that I'm not sure how to specify that the job should be distributed to the existing Spark cluster (rather than run only on the VM). &lt;/P&gt;</description>
      <pubDate>Wed, 29 Jun 2016 02:28:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161484#M123863</guid>
      <dc:creator>zachkirsch</dc:creator>
      <dc:date>2016-06-29T02:28:35Z</dc:date>
    </item>
    <item>
      <title>Re: Post Job to Spark via YARN from VM on a virtual network</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161485#M123864</link>
      <description>&lt;P&gt;You can choose either way by setting the --master and --deploy-mode arguments appropriately. With --master=yarn, the Spark executors run on the cluster; with --master=local[*], the executors run on the local machine. The Spark driver's location is then determined by the deploy mode: --deploy-mode=cluster runs the driver on the cluster, while --deploy-mode=client runs it on the client (the VM where the job is launched). &lt;/P&gt;&lt;P&gt;More info here: &lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/submitting-applications.html"&gt;http://spark.apache.org/docs/latest/submitting-applications.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jun 2016 04:46:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Post-Job-to-Spark-via-YARN-from-VM-on-a-virtual-network/m-p/161485#M123864</guid>
      <dc:creator>phargis</dc:creator>
      <dc:date>2016-06-29T04:46:09Z</dc:date>
    </item>
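For reference, the master/deploy-mode combinations described in the reply above can be sketched as spark-submit invocations (the JAR name and class here are hypothetical placeholders, not from this thread):

```shell
# Executors and driver both run on the YARN cluster
spark-submit --master yarn --deploy-mode cluster --class com.example.MyJob my-job.jar

# Executors run on the YARN cluster; driver runs on the VM that launches the job
spark-submit --master yarn --deploy-mode client --class com.example.MyJob my-job.jar

# Everything runs on the local machine, using all available cores
spark-submit --master 'local[*]' --class com.example.MyJob my-job.jar
```

These are sketches of the standard spark-submit interface; Spark versions contemporary with this thread also accepted the shorthand masters yarn-client and yarn-cluster.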
  </channel>
</rss>

