PySpark in Zeppelin: Does not have all libraries

Guru

I am able to import a library in the pyspark shell without any problems, but when I try to import the same library in Zeppelin, I get an error:

ImportError: No module named xxxxx

Re: PySpark in Zeppelin: Does not have all libraries

Do you have multiple Python versions installed?

Re: PySpark in Zeppelin: Does not have all libraries

I had 2 versions of Python installed. Zeppelin is still using the older one.
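One way to confirm which interpreter each environment uses is to run the same few lines in the pyspark shell and in a Zeppelin paragraph (in Zeppelin the paragraph would start with `%pyspark`) and compare the output. This is only a minimal sketch; the helper name `interpreter_info` is made up for illustration.

```python
import sys


def interpreter_info():
    """Return details that identify the Python interpreter running this code."""
    return {
        "executable": sys.executable,        # full path of the running interpreter
        "version": sys.version.split()[0],   # e.g. "2.7.5" or "3.6.8"
        "search_path": list(sys.path),       # directories searched on import
    }


info = interpreter_info()
print("executable:", info["executable"])
print("version:", info["version"])
for entry in info["search_path"]:
    print("sys.path entry:", entry)
```

If the executable or the search path differs between the two environments, a library installed for one interpreter will not be visible to the other.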

Re: PySpark in Zeppelin: Does not have all libraries

For anyone else who may encounter this issue and end up here: this is most commonly the result of having multiple Python versions installed. However, if you are using Zeppelin (which is the case here), it is pretty easy to point it to a different version of Python. In the Zeppelin UI, go to Interpreter > Spark, change the 'zeppelin.pyspark.python' property from 'python' to '/path/to/correct/pythondir/python', and click Save.
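Before changing the property, it may help to check that the interpreter you plan to point Zeppelin at can actually import the library. The sketch below assumes Python 3.5 or later (for `subprocess.run`); the helper name `can_import` is hypothetical, and `sys.executable` and `json` are stand-ins for your own interpreter path and module.

```python
import subprocess
import sys


def can_import(python_path, module_name):
    """Return True if the interpreter at python_path can import module_name."""
    result = subprocess.run(
        [python_path, "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # A non-zero exit code means the import raised an error in that interpreter.
    return result.returncode == 0


# Stand-in values: replace with the candidate interpreter path and the
# library that fails to import in Zeppelin.
print(can_import(sys.executable, "json"))
```

If the check fails for the path you intended to use, install the library for that interpreter first, or pick the interpreter that the pyspark shell is already using.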

Re: PySpark in Zeppelin: Does not have all libraries

@Ali Bajwa

Should it be the Python directory or the PySpark directory?

As in /usr/loca/../python or /usr/hdp/2..../spark/python?

Re: PySpark in Zeppelin: Does not have all libraries

@Vedant Jain: I believe it should be the full path to the Python executable you wish to use (assuming you don't want to use the default), not a directory.
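If you only know the command name of the interpreter you want, something like the following can turn it into a full executable path to paste into the property. This is a rough sketch for Python 3.3 or later; `full_executable_path` is a made-up helper, and `sys.executable` is used here only as a stand-in for your command name or path.

```python
import os
import shutil
import sys


def full_executable_path(command):
    """Resolve a command name or path to an absolute executable path, or None."""
    found = shutil.which(command)
    if found is None:
        return None
    # Follow symlinks so the result names the real interpreter binary.
    return os.path.realpath(found)


# Stand-in value: replace with the interpreter command or path you want to use.
print(full_executable_path(sys.executable))
```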