<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>pyspark using Spark 2.3 - Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280356#M208808</link>
    <description>&lt;P&gt;pyspark launches Spark 1.6 instead of Spark 2.3 on CDH 5.13.3, even with the Spark 2 config; resolved by pointing SPARK_HOME and PYTHONPATH at the Spark 2 parcel.&lt;/P&gt;</description>
    <pubDate>Wed, 16 Oct 2019 18:14:55 GMT</pubDate>
    <dc:creator>jeroenr</dc:creator>
    <dc:date>2019-10-16T18:14:55Z</dc:date>
    <item>
      <title>pyspark using Spark 2.3</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280356#M208808</link>
      <description>&lt;P&gt;We're on CM 5.14.2 with CDH 5.13.3.&lt;/P&gt;
&lt;P&gt;Both Spark 1.6 and Spark 2.3.3 are installed (some apps still use Spark 1.6, so we can't remove it yet).&lt;/P&gt;
&lt;P&gt;When I start pyspark with the config file for Spark 2, it still runs with Spark 1.6.&lt;/P&gt;
&lt;P&gt;e.g.&lt;/P&gt;
&lt;P&gt;pyspark --properties-file /etc/spark2/conf/spark-defaults.conf&lt;/P&gt;
&lt;P&gt;After the ASCII Spark logo it shows: version 1.6.0&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;In verbose mode the paths point to Spark 2:&lt;/P&gt;
&lt;P&gt;spark.yarn.jars,local:/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2/jars/*&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
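&lt;P&gt;For anyone debugging the same thing, a minimal way to check which install the pyspark command actually resolves to (a sketch; assumes the standard CDH parcel layout):&lt;/P&gt;
&lt;PRE&gt;# Which pyspark wrapper is first on the PATH, and where does it really live?
which pyspark
readlink -f $(which pyspark)

# spark-submit from the same install reports the Spark version it will launch
spark-submit --version&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;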
&lt;P&gt;Why is pyspark still referring to Spark 1.6?&lt;/P&gt;
&lt;P&gt;How can I force it to use Spark 2.3.3?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2019 18:14:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280356#M208808</guid>
      <dc:creator>jeroenr</dc:creator>
      <dc:date>2019-10-16T18:14:55Z</dc:date>
    </item>
    <item>
      <title>Re: pyspark using Spark 2.3</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280464#M208864</link>
      <description>&lt;P&gt;Hey,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you please try setting the SPARK_HOME environment variable to the Spark 2 location indicated by the readlink command? With SPARK_HOME pointing there, pyspark launches and shows Spark 2 as the version.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;With SPARK_HOME set to the Spark 2 lib folder, pyspark2 will then launch and show Spark 2.3.0.cloudera3 as the Spark version.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;
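&lt;P&gt;A quick way to verify (a sketch; assumes the parcel keeps the stock Spark layout under lib/spark2):&lt;/P&gt;&lt;PRE&gt;export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2

# Launch the parcel's own pyspark; the banner should now read 2.3.0.cloudera3
$SPARK_HOME/bin/pyspark&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please let me know if this helps.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Ankit.&lt;/P&gt;</description>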
      <pubDate>Thu, 17 Oct 2019 11:33:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280464#M208864</guid>
      <dc:creator>TonyStank</dc:creator>
      <dc:date>2019-10-17T11:33:30Z</dc:date>
    </item>
    <item>
      <title>Re: pyspark using Spark 2.3</title>
      <link>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280471#M208871</link>
      <description>&lt;P&gt;Thanks, that pointed me in the right direction.&lt;/P&gt;&lt;P&gt;For completeness: setting SPARK_HOME alone was not sufficient, as py4j was still missing. Setting PYTHONPATH as well fixed that:&lt;/P&gt;&lt;P&gt;export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2&lt;BR /&gt;export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now pyspark shows: version 2.3.0.cloudera3&lt;/P&gt;
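&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As a quick sanity check that the Python side can now see Spark 2 (a sketch; the py4j zip name must match what ships in your parcel):&lt;/P&gt;&lt;PRE&gt;export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH

# Both imports should succeed; pyspark should report the Spark 2 version
python -c 'import py4j, pyspark; print(pyspark.__version__)'&lt;/PRE&gt;</description>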
      <pubDate>Thu, 17 Oct 2019 12:47:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/pyspark-using-Spark-2-3/m-p/280471#M208871</guid>
      <dc:creator>jeroenr</dc:creator>
      <dc:date>2019-10-17T12:47:59Z</dc:date>
    </item>
  </channel>
</rss>