Pyspark Interpreter not working on Zeppelin
Labels: Apache Zeppelin
Created on ‎03-05-2018 09:06 PM - edited ‎08-18-2019 02:01 AM
Hello,
I'm using HDP sandbox 2.6.4, with Zeppelin Notebook installed.
When I try to use PySpark in Zeppelin, it doesn't work.
Example:
%pyspark
print "Test"
Out:
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8142801691187202169.py", line 302, in <module>
    __zeppelin__._setup_matplotlib()
  File "/tmp/zeppelin_pyspark-8142801691187202169.py", line 141, in _setup_matplotlib
    import backend_zinline
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/backend_zinline.py", line 30, in <module>
    import mpl_config
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/mpl_config.py", line 99, in <module>
    _init_config()
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/mpl_config.py", line 83, in _init_config
    fmt = matplotlib.rcParams['savefig.format']
KeyError: 'savefig.format'
I can't cancel the execution either.
In the Resource Manager UI, the job keeps running indefinitely (see attached PNG file).
Thank you for your help
Created ‎03-05-2018 10:03 PM
According to this JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-3094
the issue is the version of the matplotlib package. I have version 0.99.1.1, but the minimum required version is 1.2.x.
I can't upgrade it with pip because the Python version here is 2.6.6, which is deprecated.
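For anyone who wants to confirm the same diagnosis on their own sandbox, here is a minimal sketch of the check, assuming it is run with the same Python that Zeppelin uses for PySpark. The helper names `version_tuple` and `meets_minimum` are made up for illustration; the 1.2 minimum is the one quoted above.

```python
# Sketch only: compare the installed matplotlib version with the 1.2.x
# minimum mentioned in this thread. Helper names are hypothetical.

def version_tuple(version):
    """Turn '0.99.1.1' into (0, 99, 1, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="1.2"):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    import matplotlib
except ImportError:
    print("matplotlib is not installed for this Python interpreter")
else:
    found = matplotlib.__version__
    print("matplotlib %s, new enough: %s" % (found, meets_minimum(found)))
    # The traceback above fails on this exact key lookup.
    print("'savefig.format' in rcParams: %s" % ("savefig.format" in matplotlib.rcParams))
```

If the second line reports False for the rcParams key, the interpreter is hitting the same KeyError as in the traceback.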
Created on ‎03-05-2018 10:41 PM - edited ‎08-18-2019 02:01 AM
Problem solved!
First: install Python 2.7 using this tutorial: https://tecadmin.net/install-python-2-7-on-centos-rhel/
Second: install matplotlib for Python 2.7: python2.7 -m pip install matplotlib
Third: configure the new Python version as the default for Spark in Zeppelin using this tutorial: https://community.hortonworks.com/content/supportkb/146508/how-to-use-alternate-python-version-for-s...
Now it works!
Created ‎03-07-2018 04:03 PM
After installing Python 2.7, don't forget to change the Zeppelin Spark interpreter setting to:
zeppelin.pyspark.python = python2.7
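To confirm the setting took effect after restarting the interpreter, something like the following can be run in a paragraph whose first line is %pyspark. This is only a sketch; the function name `describe_interpreter` is made up, and the reported path depends on where Python 2.7 was installed.

```python
# Sketch only: report which Python the PySpark interpreter is running.
# In Zeppelin, put %pyspark on the first line of the paragraph.
import sys


def describe_interpreter():
    """Return the major.minor version and the path of the running Python."""
    return "Python %d.%d at %s" % (
        sys.version_info[0],
        sys.version_info[1],
        sys.executable,
    )


print(describe_interpreter())
```

If this still reports 2.6, the interpreter has probably not picked up the new zeppelin.pyspark.python value yet.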
