anandi · 06-28-2017

Setting Up a Data Science Platform on HDP using Anaconda

A Data Science Platform built with Anaconda needs to be able to:

Launch PySpark jobs on the cluster
Synchronize Python libraries from vetted public repositories
Isolate environments with specific dependencies, so that production jobs can run on an older version of a package while a newer version of the same package runs alongside them
Launch notebooks and PySpark jobs using different kernels such as Python 2.7, Python 3.x, R, and Scala
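
As a rough illustration of the environment-isolation requirement, the sketch below builds a spark-submit command that points a PySpark job at the Python interpreter of a specific conda environment. It relies on the standard `PYSPARK_PYTHON` environment variable and Spark's `spark.yarn.appMasterEnv.*` and `spark.executorEnv.*` settings. The install path `/opt/anaconda`, the environment names, and the script name are assumptions made for the example and are not taken from the article; the environment is also assumed to exist at the same path on every node.

```python
from pathlib import PurePosixPath

# Hypothetical Anaconda install location (an assumption for this sketch);
# it must be the same path on every node of the cluster.
ANACONDA_ROOT = PurePosixPath("/opt/anaconda")


def env_python(env_name, root=ANACONDA_ROOT):
    """Return the path of the Python interpreter inside a conda environment."""
    return str(root / "envs" / env_name / "bin" / "python")


def spark_submit_command(script, env_name, master="yarn", deploy_mode="cluster"):
    """Build a spark-submit argument list that runs `script` with the
    interpreter of the given conda environment on driver and executors."""
    python = env_python(env_name)
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        # Interpreter used by the YARN application master (the driver in cluster mode).
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python}",
        # Interpreter used by the executors.
        "--conf", f"spark.executorEnv.PYSPARK_PYTHON={python}",
        script,
    ]


if __name__ == "__main__":
    # Hypothetical environment names: one pinned for production, one for newer packages.
    print(" ".join(spark_submit_command("etl_job.py", "prod_py27")))
    print(" ".join(spark_submit_command("etl_job.py", "dev_py36")))
```

Because each job names its own environment, a production job and an experimental job can run side by side with different package versions, provided both environments are present on all nodes.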

Framework of the Data Science Platform

Building blocks of the Data Science Platform

akuardit · 12-15-2017

And how do I implement this? What are the installation steps, and how do I install it on an existing HDP cluster?
