Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

PySpark and Python version (<3.6)?

Rising Star

Hi everyone,

Could anyone confirm the information I found in this blog post: How To Locally Install & Configure Apache Spark & Zeppelin?

1) Python 3.6 will break PySpark. Use any version < 3.6

2) PySpark doesn’t play nicely w/Python 3.6; any other version will work fine.

Many thanks in advance!

Paul

1 ACCEPTED SOLUTION


Yes, that's correct for Spark 2.1.0 (among other versions). Please see https://issues.apache.org/jira/browse/SPARK-19019

Per the JIRA, this is resolved in Spark 2.1.1, Spark 2.2.0, etc.
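To make the cutoff concrete, here is a hedged sketch of a pre-flight check; the helper name is my own, and the version tuple comes from the fix versions listed in SPARK-19019:

```python
# SPARK-19019: Python 3.6 broke PySpark on Spark 2.1.0 and earlier
# because of a change to Python's inspect module. The fix shipped in
# Spark 2.1.1 and 2.2.0, so a simple version guard can catch the bad
# combination before a job is launched. (Helper name is illustrative.)
def spark_supports_py36(spark_version: str) -> bool:
    """True if this Spark version carries the SPARK-19019 fix."""
    parts = tuple(int(p) for p in spark_version.split(".")[:3])
    return parts >= (2, 1, 1)


if __name__ == "__main__":
    for v in ("2.1.0", "2.1.1", "2.2.0"):
        print(v, spark_supports_py36(v))
```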


2 REPLIES 2



@slachterman I am facing some issues with my PySpark code, and in some places I see there are compatibility issues, so I wanted to check whether that might be the cause. Even otherwise, it is better to check for these compatibility problems upfront, I guess. So I wanted to ask a few things.

I am on Spark 2.3.1 and Python 3.6.5. Do we know if there is a compatibility issue between these two? Should I upgrade to 3.7.0 (which I am planning) or downgrade to <3.6? Which do you think is more sensible?

Info:

Versions: Spark --> spark-2.3.1-bin-hadoop2.7, all installed according to the instructions in a Python Spark course.

venkatesh@venkatesh-VirtualBox:~$ java -version
openjdk version "10.0.1" 2018-04-17
OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 10.0.1+10-Ubuntu-3ubuntu1, mixed mode)

I work on macOS and Linux.
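In case it helps, this is how I keep the driver and the workers on the same interpreter; a sketch, pointing both at the current Python via sys.executable rather than a hard-coded path:

```python
import os
import sys

# PySpark honours these two environment variables when it launches the
# driver and the worker processes. Pointing both at the same interpreter
# avoids "Python in worker has different version" errors at runtime.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

print(os.environ["PYSPARK_PYTHON"])
```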