Created 11-24-2016 01:03 PM
Hi everyone!
I am working with the PySpark interpreter in a Zeppelin notebook, and I want to use the functionality of the "pandas" library, but when I run this command:
import pandas as pd
I get the following error message:
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-2633231603377305574.py", line 239, in <module>
    eval(compiledCode)
  File "<string>", line 1, in <module>
ImportError: No module named pandas
I have already installed pandas on the virtual machine where Zeppelin is running, and restarted ambari-server as explained in the following post:
What should I do?
Created 11-24-2016 02:17 PM
You might need to restart the Spark interpreter (or restart the Zeppelin Notebook service in Ambari) so that the Python remote interpreters pick up the freshly installed pandas and can import it.
If you are running on a cluster, Zeppelin runs in YARN client mode and the Python remote interpreters are started on nodes other than the Zeppelin node. In that case, install pandas on all machines of your cluster and restart Zeppelin.
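To see which machines are actually missing pandas, something like the following might help. It is only a rough sketch: `probe_module` is a hypothetical helper name, and the executor check assumes the `%pyspark` paragraph exposes the usual SparkContext as `sc`. The partition count of 8 is arbitrary and does not guarantee that every node gets probed.

```python
import importlib
import socket
import sys


def probe_module(name="pandas"):
    """Report where this Python process runs and whether `name` can be imported."""
    try:
        importlib.import_module(name)
        available = True
    except ImportError:
        available = False
    return {
        "host": socket.gethostname(),
        "python": sys.executable,
        "module": name,
        "available": available,
    }


# Driver side: the Python process started by the Zeppelin interpreter.
print(probe_module("pandas"))

# Executor side: only runs where a SparkContext named `sc` exists
# (assumed to be the case inside a Zeppelin %pyspark paragraph).
if "sc" in globals():
    reports = (
        sc.parallelize(range(8), 8)
        .map(lambda _: probe_module("pandas"))
        .collect()
    )
    for host in sorted({r["host"] for r in reports if not r["available"]}):
        print("pandas missing on: " + host)
```

If the driver reports `available: False`, compare the printed `python` path with the interpreter you ran the pandas installation against; they may not be the same Python.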