<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Version of Python of Pyspark for Spark2 and Zeppelin in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227022#M188882</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/69439/faradasa.html" nodeid="69439" target="_blank"&gt;@Sungwoo Park&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Try installing anaconda3 under /opt/anaconda3 instead of under /root (/root is normally readable only by root, so other service users may not be able to run a Python installed there). Then add the following configuration to your interpreter: &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="76467-screen-shot-2018-05-29-at-100214-am.png" style="width: 1679px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15317i14B7F3BEB6ED972F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="76467-screen-shot-2018-05-29-at-100214-am.png" alt="76467-screen-shot-2018-05-29-at-100214-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The result with this configuration is:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="76468-screen-shot-2018-05-29-at-100156-am.png" style="width: 2017px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15318i179B956823D5CD48/image-size/medium?v=v2&amp;amp;px=400" role="button" title="76468-screen-shot-2018-05-29-at-100156-am.png" alt="76468-screen-shot-2018-05-29-at-100156-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Important&lt;/STRONG&gt;: Since Zeppelin runs the spark2 interpreter in yarn-client mode by default, you need to make sure the Python you configure (/opt/anaconda3/bin/python3 if you install it as suggested above, rather than /root/anaconda3/bin/python3) is installed at the same path on the Zeppelin machine and on all cluster worker nodes.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Additional resources&lt;/STRONG&gt; &lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/content/supportkb/146508/how-to-use-alternate-python-version-for-spark-in-z.html" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.hortonworks.com/content/supportkb/146508/how-to-use-alternate-python-version-for-spark-in-z.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 01:22:29 GMT</pubDate>
    <dc:creator>falbani</dc:creator>
    <dc:date>2019-08-18T01:22:29Z</dc:date>
    <item>
      <title>Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227019#M188879</link>
      <description>&lt;P&gt;Hi.&lt;/P&gt;&lt;P&gt;I built a cluster with HDP Ambari version 2.6.1.5 and I am using anaconda3 as my Python interpreter.&lt;/P&gt;&lt;P&gt;I am having trouble changing the Python version used by Spark2 pyspark in Zeppelin.&lt;/P&gt;&lt;P&gt;When I check the Python version of Spark2 from the pyspark shell, it shows the output below, which looks correct to me.&lt;/P&gt;&lt;PRE&gt;      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.2.0.2.6.4.0-91
      /_/

Using Python version 3.6.4 (default, Jan 16 2018 18:10:19)
SparkSession available as 'spark'.
&amp;gt;&amp;gt;&amp;gt; import sys
&amp;gt;&amp;gt;&amp;gt; print (sys.path)
['', '/tmp/spark-14a0fb52-5fea-4c1f-bf6b-c0bd0c37eedf/userFiles-54205d05-fbf0-4ec1-b274-4c5a2b78e840', '/usr/hdp/current/spark2-client/python/lib/py4j-0.10.4-src.zip', '/usr/hdp/current/spark2-client/python', '/root', '/root/anaconda3/lib/python36.zip', '/root/anaconda3/lib/python3.6', '/root/anaconda3/lib/python3.6/lib-dynload', '/root/anaconda3/lib/python3.6/site-packages']
&amp;gt;&amp;gt;&amp;gt; print (sys.version)
3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0]
&amp;gt;&amp;gt;&amp;gt; exit()

&lt;/PRE&gt;&lt;P&gt;When I check the Python version of Spark2 from Zeppelin, it shows a different result, as below.&lt;/P&gt;&lt;PRE&gt;%spark2.pyspark
print(sc.version)
import sys
print(sys.version)
print()
print(sys.path)

2.2.0.2.6.4.0-91
2.7.5 (default, Aug  4 2017, 00:39:18) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]
()
['/tmp', u'/tmp/spark-75f5d1d5-fefa-4dc8-bc9b-c797dec106d7/userFiles-1c25cf01-7758-49dd-a1eb-f1fbd084e9af/py4j-0.10.4-src.zip', u'/tmp/spark-75f5d1d5-fefa-4dc8-bc9b-c797dec106d7/userFiles-1c25cf01-7758-49dd-a1eb-f1fbd084e9af/pyspark.zip', u'/tmp/spark-75f5d1d5-fefa-4dc8-bc9b-c797dec106d7/userFiles-1c25cf01-7758-49dd-a1eb-f1fbd084e9af', '/usr/hdp/current/spark2-client/python/lib/py4j-0.10.4-src.zip', '/usr/hdp/current/spark2-client/python', '/usr/hdp/current/spark2-client/python/lib/py4j-0.8.2.1-src.zip', '/usr/lib64/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib64/python2.7/site-packages/gtk-2.0', '/usr/lib/python2.7/site-packages']

&lt;/PRE&gt;&lt;P&gt;I have tried updating the Zeppelin interpreter settings as suggested in other questions and answers, for example:&lt;/P&gt;&lt;PRE&gt;export PYSPARK_PYTHON=/root/anaconda3/bin/python&lt;/PRE&gt;&lt;P&gt;I updated both zeppelin-env.sh and the interpreter setting via the Zeppelin GUI, but it didn't work.&lt;/P&gt;&lt;P&gt;I think the cause is that Zeppelin's Python path points to /usr/lib64/python2.7, which is the CentOS default, but I don't know how to fix it.&lt;/P&gt;&lt;P&gt;If you have any ideas about this problem, please let me know. Any advice would be appreciated.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 27 Apr 2018 18:11:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227019#M188879</guid>
      <dc:creator>faradasa</dc:creator>
      <dc:date>2018-04-27T18:11:24Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227020#M188880</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/69439/faradasa.html" nodeid="69439"&gt;@Sungwoo Park&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Have a look at this question; I think it will help you: &lt;A href="https://stackoverflow.com/questions/47198678/zeppelin-python-conda-and-python-sql-interpreters-do-not-work-without-adding-a" target="_blank"&gt;https://stackoverflow.com/questions/47198678/zeppelin-python-conda-and-python-sql-interpreters-do-not-work-without-adding-a&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Best regards, &lt;/P&gt;&lt;P&gt;Paul &lt;/P&gt;</description>
      <pubDate>Sat, 28 Apr 2018 16:49:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227020#M188880</guid>
      <dc:creator>paulhernandez</dc:creator>
      <dc:date>2018-04-28T16:49:11Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227021#M188881</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11244/paulhernandez.html" nodeid="11244"&gt;@Paul Hernandez&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thank you for your comment.&lt;/P&gt;&lt;P&gt;I checked the post you pointed me to, but I don't think changing the symlink in bin/ is a good idea.&lt;/P&gt;&lt;P&gt;It could cause problems for the Linux system.&lt;/P&gt;</description>
      <pubDate>Tue, 29 May 2018 13:11:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227021#M188881</guid>
      <dc:creator>faradasa</dc:creator>
      <dc:date>2018-05-29T13:11:22Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227022#M188882</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/69439/faradasa.html" nodeid="69439" target="_blank"&gt;@Sungwoo Park&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Try installing anaconda3 under /opt/anaconda3 instead of under /root (/root is normally readable only by root, so other service users may not be able to run a Python installed there). Then add the following configuration to your interpreter: &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="76467-screen-shot-2018-05-29-at-100214-am.png" style="width: 1679px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15317i14B7F3BEB6ED972F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="76467-screen-shot-2018-05-29-at-100214-am.png" alt="76467-screen-shot-2018-05-29-at-100214-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The result with this configuration is:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="76468-screen-shot-2018-05-29-at-100156-am.png" style="width: 2017px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/15318i179B956823D5CD48/image-size/medium?v=v2&amp;amp;px=400" role="button" title="76468-screen-shot-2018-05-29-at-100156-am.png" alt="76468-screen-shot-2018-05-29-at-100156-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Important&lt;/STRONG&gt;: Since Zeppelin runs the spark2 interpreter in yarn-client mode by default, you need to make sure the Python you configure (/opt/anaconda3/bin/python3 if you install it as suggested above, rather than /root/anaconda3/bin/python3) is installed at the same path on the Zeppelin machine and on all cluster worker nodes.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Additional resources&lt;/STRONG&gt; &lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/content/supportkb/146508/how-to-use-alternate-python-version-for-spark-in-z.html" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.hortonworks.com/content/supportkb/146508/how-to-use-alternate-python-version-for-spark-in-z.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 01:22:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227022#M188882</guid>
      <dc:creator>falbani</dc:creator>
      <dc:date>2019-08-18T01:22:29Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227023#M188883</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/69439/faradasa.html" nodeid="69439"&gt;@Sungwoo Park&lt;/A&gt;, thanks for the input. Could you please elaborate a little on why the symlink could cause problems, and which ones?&lt;/P&gt;&lt;P&gt;I am very interested, since we have this setting in a demo cluster at a customer site. &lt;/P&gt;&lt;P&gt;BR. Paul&lt;/P&gt;</description>
      <pubDate>Tue, 29 May 2018 23:02:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227023#M188883</guid>
      <dc:creator>paulhernandez</dc:creator>
      <dc:date>2018-05-29T23:02:32Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227024#M188884</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11244/paulhernandez.html" nodeid="11244"&gt;@Paul Hernandez&lt;/A&gt; &lt;/P&gt;&lt;P&gt;First of all, my problem was solved by adding the Zeppelin properties that &lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048"&gt;@Felix Albani&lt;/A&gt; showed me.&lt;/P&gt;&lt;P&gt;My cluster is based on CentOS 7. &lt;/P&gt;&lt;P&gt;The OS ships with Python 2.7 as the default, and some packages such as yum depend on that default Python. The symlink '/bin/python' points to this default Python, and if it is changed, yum stops working.&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;&lt;P&gt;SW&lt;/P&gt;</description>
      <pubDate>Wed, 30 May 2018 08:34:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227024#M188884</guid>
      <dc:creator>faradasa</dc:creator>
      <dc:date>2018-05-30T08:34:23Z</dc:date>
    </item>
    <item>
      <title>Re: Version of Python of Pyspark for Spark2 and Zeppelin</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227025#M188885</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048"&gt;@Felix Albani&lt;/A&gt; Hi Felix, you installed Python 3.6.4, but according to the documentation Spark2 only supports Python up to 3.4.x. Could you kindly explain how this works?&lt;/P&gt;</description>
      <pubDate>Thu, 31 May 2018 16:34:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Version-of-Python-of-Pyspark-for-Spark2-and-Zeppelin/m-p/227025#M188885</guid>
      <dc:creator>144675</dc:creator>
      <dc:date>2018-05-31T16:34:35Z</dc:date>
    </item>
  </channel>
</rss>

