Created 03-08-2016 02:25 PM
Hello Experts!
I noticed that the default Zeppelin install on the HDP sandbox does not come with the numpy and scikit-learn packages. I can install pip and then install the packages manually, but I want to make sure the Zeppelin installation will pick up those packages. Has anyone added these packages to their cluster?
This is the error that I am getting:
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 162, in <module>
    eval(compiledCode)
  File "<string>", line 1, in <module>
ImportError: No module named numpy
The Zeppelin property zeppelin.install_python_packages in Ambari is set to false. I tried switching it to true, but that does not do anything, so I am assuming it is a property that is only read at install time.
Thanks!
Created 03-08-2016 03:10 PM
PySpark required numpy in my case. I didn't try it in Zeppelin, but it won't work in PySpark either if numpy isn't installed on your machine: https://github.com/dbist/datamunging
Created 03-08-2016 04:32 PM
@Artem Ervits your suggestion worked. This is what I ran to get it working on my sandbox:
yum install -y numpy
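To confirm that the interpreter actually sees the packages after installing, something like the following could be run (for example in a %pyspark paragraph in Zeppelin). This is only a rough sketch: check_packages is a helper name made up for illustration, not part of Zeppelin or PySpark, and sklearn is the import name of scikit-learn, which would still need to be installed separately since the command above only installs numpy.

```python
import importlib
import sys


def check_packages(names):
    """Map each module name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module exposes __version__, so fall back to a placeholder.
            found[name] = getattr(module, "__version__", "unknown")
    return found


# Show which Python executable is running, since packages must be installed for that one.
print(sys.executable)
for name, version in sorted(check_packages(["numpy", "sklearn"]).items()):
    print("%s: %s" % (name, version if version is not None else "NOT INSTALLED"))
```

If a package shows up as NOT INSTALLED here but imports fine from a shell, the interpreter is probably using a different Python than the one the package was installed into; comparing the printed executable path with the output of `which python` should show that.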