
best way to install/integrate numpy scikit to an existing zeppelin install on sandbox


Hello Experts!

I noticed that the default Zeppelin install on the HDP sandbox does not come with the numpy and scikit-learn packages. I can install pip and then install the packages manually, but I want to make sure the Zeppelin installation will pick them up. Has anyone added these packages to their cluster?

This is the error that I am getting:

Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 162, in <module>
    eval(compiledCode)
  File "<string>", line 1, in <module>
ImportError: No module named numpy

The Zeppelin property zeppelin.install_python_packages in Ambari is set to false. I tried switching it to true, but that did nothing, so I assume the property is only read at install time.

Thanks!

1 ACCEPTED SOLUTION


Re: best way to install/integrate numpy scikit to an existing zeppelin install on sandbox

Mentor
@azeltov

PySpark required numpy in my case. I didn't try it in Zeppelin, but it won't work in PySpark either if numpy isn't installed on your machine: https://github.com/dbist/datamunging


Re: best way to install/integrate numpy scikit to an existing zeppelin install on sandbox

@Artem Ervits your suggestion worked. This is what I ran to get it working on my sandbox:

 yum install -y numpy
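
To confirm that Zeppelin actually sees the package after the install, a quick check along these lines can be run in a %pyspark paragraph (or in a plain pyspark shell). This is only a minimal sketch: the helper name check_modules is made up for illustration, and it assumes the Zeppelin interpreter uses the same system Python that yum installed numpy into. sklearn is the import name for scikit-learn, so the output also shows whether that package is still missing.

```python
from __future__ import print_function  # keeps print() working on Python 2 as well

import importlib
import sys


def check_modules(names):
    """Return {module name: version string, or None if the import fails}.

    Hypothetical helper for illustration; not part of Zeppelin or Spark.
    """
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Same failure as the "No module named numpy" traceback above.
            status[name] = None
        else:
            # Not every module exposes __version__, so fall back to a marker.
            status[name] = getattr(module, "__version__", "unknown")
    return status


# Show which Python binary is running, to compare with the one yum installed into.
print("interpreter:", sys.executable)
for name, version in sorted(check_modules(["numpy", "sklearn"]).items()):
    print(name, "->", version or "NOT INSTALLED")
```

If numpy is still reported as NOT INSTALLED here, the printed interpreter path is the first thing to compare against the Python that yum installed the package for.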