Error running spark2 interpreter in Zeppelin with pyspark due to matplotlib


I installed the sandbox using the Docker for Windows image.

Sandbox information:
Created on: 01_02_2018_10_47_41
Hadoop stack version: Hadoop
Ambari Version:
Ambari Hash: 2989989d67edacff7e9db702b4cf0c080556dddc
Ambari build: Release : 143
Java version: 1.8.0_161

OS Version: CentOS release 6.9 (Final)

In Zeppelin, when I run the following with the spark2 interpreter and pyspark:


x = 1


I get the following error:

/usr/hdp/current/spark2-client/python/pyspark/ UserWarning: Support for Python 2.6 is deprecated as of Spark 2.0.0
  warnings.warn("Support for Python 2.6 is deprecated as of Spark 2.0.0")
Traceback (most recent call last):
  File "/tmp/", line 302, in <module>
    __zeppelin__._setup_matplotlib()
  File "/tmp/", line 141, in _setup_matplotlib
    import backend_zinline
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/", line 30, in <module>
    import mpl_config
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/", line 99, in <module>
    _init_config()
  File "/usr/hdp/current/zeppelin-server/interpreter/lib/python/", line 83, in _init_config
    fmt = matplotlib.rcParams['savefig.format']
KeyError: 'savefig.format'


@Patrick Young Can you check your python version on the cluster nodes? Is it 2.6 by any chance?
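A quick way to check this is to run a small script with the same Python that Zeppelin's pyspark picks up. This is only a sketch: check_env is a made-up helper name, not part of Zeppelin or Spark, and it assumes matplotlib may or may not be importable on the node. It reports the interpreter version and whether the savefig.format key from the traceback exists in matplotlib.rcParams.

```python
import sys


def check_env():
    """Report the Python version and whether matplotlib exposes 'savefig.format'.

    check_env is a hypothetical helper, not part of Zeppelin or Spark.
    """
    info = {
        "python_version": tuple(sys.version_info[:3]),
        "matplotlib_version": None,
        "has_savefig_format": None,
    }
    try:
        import matplotlib
    except ImportError:
        # matplotlib is not installed for this interpreter
        return info
    info["matplotlib_version"] = matplotlib.__version__
    info["has_savefig_format"] = "savefig.format" in matplotlib.rcParams
    return info


if __name__ == "__main__":
    for key, value in sorted(check_env().items()):
        print("%s: %s" % (key, value))
```

If has_savefig_format comes back False, that would be consistent with the KeyError above and would point to the matplotlib installed for that Python rather than to Zeppelin itself.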


In the hdp shell, python --version reports 2.6.6.

I am confused why the sandbox image would use a deprecated version of python. If it needs python3 to work properly, then python3 should be installed by default when the Virtual Machine is first created.

I tried installing python3 on CentOS and restarting the HDP services, but without success.

Could you please give me step-by-step instructions to go from a fresh install on VMware Workstation, with the error detailed above, to a state where I can use Zeppelin with spark2 and pyspark? It would be much appreciated. Thank you.

Hi @Patrick Young

You need to follow several steps to make this work.

About python:

- I installed anaconda3. The critical step is to not let anaconda3 be configured in the environment variables. The HDP platform needs Python 2 for some scripts, so the python path must resolve to a Python 2 installation.
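To confirm that anaconda3 has not ended up ahead of the system Python, something like the following could be used. It is a rough sketch: first_python_on_path is a hypothetical helper, and the /opt/anaconda3 prefix is only assumed from the conda path used elsewhere in this thread.

```python
import os


def first_python_on_path(path_env, name="python"):
    """Return the first executable called `name` found in a PATH-style string."""
    for directory in path_env.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    found = first_python_on_path(os.environ.get("PATH", ""))
    print("python resolves to: %s" % found)
    # Assumed install prefix; adjust to wherever anaconda3 lives on your system.
    if found and found.startswith("/opt/anaconda3"):
        print("warning: anaconda3 is shadowing the system Python 2")
```

If the warning appears, the anaconda3 bin directory is on PATH before the system Python, which is the situation described above as the one to avoid.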

Since I want to have both the spark and spark2 interpreters, I commented out the SPARK_HOME line in the file:


Another configuration I changed in this file:

According to the documentation, the variable ZEPPELIN_JAVA_OPTS changed in spark2 to ZEPPELIN_INTP_JAVA_OPTS. Since both versions are active, both variables are defined:

export ZEPPELIN_JAVA_OPTS="-Dhdp.version=None -Dspark.executor.memory=512m -Dspark.executor.instances=2 -Dspark.yarn.queue=default"

export ZEPPELIN_INTP_JAVA_OPTS="-Dhdp.version=None -Dspark.executor.memory=512m -Dspark.executor.instances=2 -Dspark.yarn.queue=default"
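Because the two variables carry the same value, it is easy for them to drift apart when one is edited by hand. As an illustration only, the two lines could be generated from a single list of options. The names SPARK_OPTS and export_lines are invented for this sketch; the option values are the ones shown above.

```python
# Option names and values copied from the export lines in this post.
SPARK_OPTS = [
    ("hdp.version", "None"),
    ("spark.executor.memory", "512m"),
    ("spark.executor.instances", "2"),
    ("spark.yarn.queue", "default"),
]


def export_lines(options):
    """Build matching export lines for both Zeppelin variables.

    export_lines is a hypothetical helper; it only formats text and
    does not touch any Zeppelin configuration file.
    """
    opts = " ".join("-D%s=%s" % (key, value) for key, value in options)
    return [
        'export %s="%s"' % (name, opts)
        for name in ("ZEPPELIN_JAVA_OPTS", "ZEPPELIN_INTP_JAVA_OPTS")
    ]


if __name__ == "__main__":
    for line in export_lines(SPARK_OPTS):
        print(line)
```

The printed lines can then be pasted into the same file in place of the hand-written ones.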

- You need to configure the spark2 interpreter as follows:



I also created a Python interpreter:


Finally, I created a symbolic link so that conda can be found:

ln -s /opt/anaconda3/bin/conda /bin/conda

Of course, you have to adjust the paths above to match your installation.
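If the symlink step has to be repeated (for example after rebuilding the sandbox), a small idempotent version may be handy. This is a sketch under the assumption that it runs with enough privileges to write the link; ensure_symlink is a hypothetical helper name.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target unless an identical link already exists.

    Returns True if a link was created, False if it was already in place.
    Raises OSError if link_path exists and points somewhere else.
    """
    if os.path.islink(link_path):
        current = os.readlink(link_path)
        if current == target:
            return False
        raise OSError("%s already links to %s" % (link_path, current))
    if os.path.exists(link_path):
        raise OSError("%s exists and is not a symlink" % link_path)
    os.symlink(target, link_path)
    return True


# Example call with the paths from this post (needs root; adjust to your paths):
# ensure_symlink("/opt/anaconda3/bin/conda", "/bin/conda")
```

Unlike a plain ln -s, this does nothing when the link is already correct and refuses to overwrite anything else found at the link path.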

Hope that helps.

Kind regards, Paul