<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question spark2-submit using pyspark fails in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/spark2-submit-using-pyspark-fails/m-p/89751#M21611</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am upgrading from Spark 1.6.0 to Spark 2.1 on the CDH 5.10 platform. I am trying to run the spark2-submit command for a Python application, and it fails with the error below. It looks like a path property is expected while initializing and creating the SparkContext object, but it is not being set. Please suggest whether any specific configuration is missing or required for Spark 2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 182, in _do_init&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 249, in _initialize_context&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value&lt;BR /&gt;py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;BR /&gt;: java.lang.IllegalArgumentException: Can not create a Path from an empty string&lt;BR /&gt;at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)&lt;BR /&gt;at org.apache.hadoop.fs.Path.&amp;lt;init&amp;gt;(Path.java:135)&lt;BR /&gt;at org.apache.hadoop.fs.Path.&amp;lt;init&amp;gt;(Path.java:94)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:368)&lt;BR /&gt;at 
org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:481)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13.apply(Client.scala:629)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13.apply(Client.scala:627)&lt;BR /&gt;at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:627)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:874)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)&lt;BR /&gt;at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)&lt;BR /&gt;at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:171)&lt;BR /&gt;at org.apache.spark.SparkContext.&amp;lt;init&amp;gt;(SparkContext.scala:509)&lt;BR /&gt;at org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:58)&lt;BR /&gt;at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)&lt;BR /&gt;at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)&lt;BR /&gt;at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)&lt;BR /&gt;at java.lang.reflect.Constructor.newInstance(Constructor.java:526)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:236)&lt;BR /&gt;at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)&lt;BR /&gt;at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)&lt;BR /&gt;at py4j.GatewayConnection.run(GatewayConnection.java:214)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:745)&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 14:21:17 GMT</pubDate>
    <dc:creator>techsoln</dc:creator>
    <dc:date>2022-09-16T14:21:17Z</dc:date>
    <item>
      <title>spark2-submit using pyspark fails</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark2-submit-using-pyspark-fails/m-p/89751#M21611</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am upgrading from Spark 1.6.0 to Spark 2.1 on the CDH 5.10 platform. I am trying to run the spark2-submit command for a Python application, and it fails with the error below. It looks like a path property is expected while initializing and creating the SparkContext object, but it is not being set. Please suggest whether any specific configuration is missing or required for Spark 2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 118, in __init__&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 182, in _do_init&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 249, in _initialize_context&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__&lt;BR /&gt;File "/apps/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value&lt;BR /&gt;py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.&lt;BR /&gt;: java.lang.IllegalArgumentException: Can not create a Path from an empty string&lt;BR /&gt;at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)&lt;BR /&gt;at org.apache.hadoop.fs.Path.&amp;lt;init&amp;gt;(Path.java:135)&lt;BR /&gt;at org.apache.hadoop.fs.Path.&amp;lt;init&amp;gt;(Path.java:94)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:368)&lt;BR /&gt;at 
org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:481)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13.apply(Client.scala:629)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13.apply(Client.scala:627)&lt;BR /&gt;at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:627)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:874)&lt;BR /&gt;at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)&lt;BR /&gt;at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)&lt;BR /&gt;at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:171)&lt;BR /&gt;at org.apache.spark.SparkContext.&amp;lt;init&amp;gt;(SparkContext.scala:509)&lt;BR /&gt;at org.apache.spark.api.java.JavaSparkContext.&amp;lt;init&amp;gt;(JavaSparkContext.scala:58)&lt;BR /&gt;at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)&lt;BR /&gt;at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)&lt;BR /&gt;at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)&lt;BR /&gt;at java.lang.reflect.Constructor.newInstance(Constructor.java:526)&lt;BR /&gt;at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)&lt;BR /&gt;at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)&lt;BR /&gt;at py4j.Gateway.invoke(Gateway.java:236)&lt;BR /&gt;at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)&lt;BR /&gt;at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)&lt;BR /&gt;at py4j.GatewayConnection.run(GatewayConnection.java:214)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:745)&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 14:21:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark2-submit-using-pyspark-fails/m-p/89751#M21611</guid>
      <dc:creator>techsoln</dc:creator>
      <dc:date>2022-09-16T14:21:17Z</dc:date>
    </item>
    <item>
      <title>Re: spark2-submit using pyspark fails</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark2-submit-using-pyspark-fails/m-p/89815#M21612</link>
      <description>&lt;P&gt;This was failing because my Python executable was not packaged in .zip or .egg format. After creating the executable in .zip format, the job was accepted.&lt;/P&gt;</description>
      <pubDate>Wed, 01 May 2019 18:51:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark2-submit-using-pyspark-fails/m-p/89815#M21612</guid>
      <dc:creator>techsoln</dc:creator>
      <dc:date>2019-05-01T18:51:33Z</dc:date>
    </item>
  </channel>
</rss>