Help with pandas in Zeppelin

Hello guys,

I still have this problem. I installed pandas on the system and restarted the Zeppelin notebook after the installation.

But it still shows me the message: no module named pandas. Someone advised me to check the Spark interpreter setting "zeppelin.pyspark.python". I tried putting anaconda there instead of python, but it didn't work. I also tried the location of Anaconda, /root/yes/bin/anaconda, but that didn't work either. So what should I put there? Does anybody know how to solve this problem? Or what is the easiest way to import the pandas module into Hortonworks?




@enzo EL,

I see two different versions of Python in your screenshots. The error in the Zeppelin notebook says that Python 2.6 is deprecated, while your terminal shows Python 3.6.3, which is part of Anaconda.

If you want to use the Python from Anaconda, export this environment variable in Ambari (Advanced zeppelin-env -> zeppelin_env_content):

export PYSPARK_DRIVER_PYTHON=<path to anaconda python>

Then, in your interpreter settings, set zeppelin.pyspark.python = <path to anaconda python>
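
After changing these settings and restarting the interpreter, it can help to confirm which Python the notebook is really using. The following is a minimal sketch, meant to be pasted into a notebook paragraph (for example one starting with %pyspark). The function name describe_interpreter is made up for illustration, and the code sticks to syntax that should also run on an old Python 2.6 interpreter.

```python
import os
import sys


def describe_interpreter():
    """Collect basic facts about the Python process running this paragraph."""
    try:
        import pandas  # only importable if this interpreter has pandas installed
        pandas_version = pandas.__version__
    except ImportError:
        pandas_version = None
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "pandas_version": pandas_version,
        # These are None when the variable is not set for this process.
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
        "PYSPARK_DRIVER_PYTHON": os.environ.get("PYSPARK_DRIVER_PYTHON"),
    }


info = describe_interpreter()
for key in sorted(info):
    print("%s: %s" % (key, info[key]))
```

If the printed executable is not the Anaconda interpreter, or pandas_version is None, the notebook is still running a different Python from the one in your terminal.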



Hi enzo,

I may have the answer, but I need a few details:

  • Hortonworks version
  • Zeppelin version
  • Anaconda and python version
  • Whether Zeppelin is Ambari-managed or stand-alone

Kind regards,



Hi @Aditya Sirna, I can't find PYSPARK_DRIVER_PYTHON in Advanced zeppelin-env -> zeppelin_env_content.

Check the configuration in the attachment. So should I add a new line to this configuration file?

export PYSPARK_DRIVER_PYTHON= /root/yes/anaconda

and change zeppelin.pyspark.python = /root/yes/anaconda. Like this?




Hi @enzo EL

If you can't find this property, please add it.

These are my settings for a stand-alone Zeppelin 0.7.3 with HDP 2.5 and anaconda3 with Python 3.5.

(I am using Spark 2.0.0, and its PySpark version does not work well with Python 3.6.)

export PYTHONPATH="/var/opt/teradata/anaconda3/envs/py35/bin:/usr/hdp/current/spark-client/python/lib/"
export PYSPARK_DRIVER_PYTHON="/var/opt/teradata/anaconda3/envs/py35/bin/python"
export PYSPARK_PYTHON="/var/opt/teradata/anaconda3/envs/py35/bin/python"
export PYLIB="/var/opt/teradata/anaconda3/envs/py35/lib"

In your case:

export PYTHONPATH="path-to-your-python/bin:/usr/hdp/current/spark-client/python/lib/"
export PYSPARK_DRIVER_PYTHON="path-to-your-python/bin/python"
export PYSPARK_PYTHON="path-to-your-python/bin/python"

According to the documentation, both variables above must point to the same version of Python in order to work properly: "PySpark requires the same minor version of Python in both driver and workers".
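
To check the "same minor version" requirement before restarting anything, the two interpreter paths can be compared directly. This is a rough sketch using only the standard library; the helper names minor_version and same_minor_version are hypothetical. It only inspects interpreters on the machine where it runs, so on a cluster it would have to be repeated on the worker nodes.

```python
import subprocess
import sys


def minor_version(python_path):
    """Return (major, minor) as reported by the interpreter at python_path."""
    out = subprocess.check_output(
        [python_path, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"]
    )
    major, minor = out.decode().strip().split(".")
    return int(major), int(minor)


def same_minor_version(driver_python, worker_python):
    """True if both interpreters report the same major.minor version."""
    return minor_version(driver_python) == minor_version(worker_python)


# Illustrative call: compare the current interpreter with itself.
# In practice, pass the values of PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON.
print(same_minor_version(sys.executable, sys.executable))
```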

I also added PYLIB to the configuration, but I think it is not necessary.

On the Zeppelin Interpreters page I created a new interpreter for Python:


This should be the result:


You also need to adjust the spark interpreter:


The last configuration step, and this one was extremely tricky: do not add python3 to your PATH environment variable!

Instead, create a symlink to conda.

Example:  ln -s /opt/anaconda3/bin/conda /bin/conda

Or link it into another location that is already in your PATH, such as /user/lib or /var/lib.

Additionally you can test the installation as I did:

Python interpreter test


PySpark and pandas interaction: from a Spark DataFrame to a pandas DataFrame
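
A test of this kind could look roughly like the sketch below. It is not the original test from this post. It assumes the paragraph runs under the Spark interpreter (%pyspark) and that Zeppelin provides a ready-made session object: `spark` on Spark 2.x, or `sqlContext` on older setups. The function name spark_to_pandas is made up for illustration.

```python
def spark_to_pandas(spark_session):
    """Build a tiny Spark DataFrame and convert it to a pandas DataFrame."""
    spark_df = spark_session.createDataFrame(
        [(1, "a"), (2, "b"), (3, "c")],  # rows
        ["id", "label"],                 # column names
    )
    # toPandas() collects every row to the driver, so keep the data small.
    # It fails with an import error if the driver Python has no pandas.
    return spark_df.toPandas()


# In a %pyspark paragraph (assuming Zeppelin exposes the session as `spark`):
# pandas_df = spark_to_pandas(spark)
# print(pandas_df.head())
```

If the conversion succeeds, the Python used by the Spark interpreter can see pandas, which is what the original "no module named pandas" error was about.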


I hope this helps you.

Kind regards,



Hi @Paul Hernandez, where can I find "path-to-your-python/bin/python"? When I type which anaconda, it returns /root/yes/bin/anaconda. So should I use this as the path? That means:

export PYTHONPATH="/root/yes/bin/anaconda/bin:/usr/hdp/current/spark-client/python/lib/"
export SPARK_YARN_USER_ENV="PYTHONPATH=${PYTHONPATH}"
export PYSPARK_DRIVER_PYTHON="/root/yes/bin/anaconda"
export PYSPARK_PYTHON="/root/yes/bin/anaconda"
export PYLIB="/var/opt/teradata/anaconda3/envs/py35/lib"

I can't find the PYLIB folder on my system, though.

Thank you for helping me.

Hi @enzo EL

In your case:

export PYTHONPATH="/root/yes/bin/anaconda/bin:/usr/hdp/current/spark-client/python/lib/" 
export PYSPARK_DRIVER_PYTHON="/root/yes/bin/anaconda/bin/python"
export PYSPARK_PYTHON="/root/yes/bin/anaconda/bin/python"

This should work.
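
If these paths turn out not to exist, keep in mind that which anaconda returns the location of the anaconda command itself, so /root/yes/bin/anaconda is probably a file rather than a directory. In a typical Anaconda layout the python executable sits in the same bin directory, but that is an assumption about this installation and worth verifying. A small sketch of that check (the helper names are hypothetical):

```python
import os
import posixpath


def sibling_python(tool_path):
    """Return the path of a 'python' file in the same directory as tool_path."""
    return posixpath.join(posixpath.dirname(tool_path), "python")


def is_executable_file(path):
    """True if path exists, is a regular file, and is executable."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


candidate = sibling_python("/root/yes/bin/anaconda")
print(candidate)
print(is_executable_file(candidate))
```

Whichever path passes the check is the one to use for PYSPARK_PYTHON, PYSPARK_DRIVER_PYTHON and zeppelin.pyspark.python.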

By the way, why did you install Anaconda in this location?

Kind regards, Paul

@Paul Hernandez

The location was the default, I don't know. I have another question: I am trying to create the Python interpreter, but I can't edit the interpreter group. I am following your guide. How could I create it?


@enzo EL

What is your Zeppelin version? It looks like a 0.6.x version.

You may be able to use Pandas without configuring the python interpreter.

My version is (screenshot). I have changed the path and restarted the notebook, but it doesn't work. :/

What should I do? I need pandas for plotting. :/


Hi @enzo EL

1) If you just need pandas with PySpark, test it with the example I provided for the Spark interpreter.

2) It seems that the Python interpreter first became available with Zeppelin 0.7.2. Is an upgrade possible for you?

3) You can add interpreters that are not available by following the official documentation:

I have never done it before, but it should work.