Run Spark MLlib using cloudera-quickstart-vm-5.7.0-0-virtualbox

Hi experts,

I have some tables in Hive and I want to run a clustering analysis on them using Spark MLlib in Python. Is it possible to do this using the cloudera-quickstart-vm-5.7.0-0-virtualbox?

Is there a tutorial that shows how to work with Spark MLlib?

Many thanks!

Re: Run Spark MLlib using cloudera-quickstart-vm-5.7.0-0-virtualbox

The cloudera-quickstart-vm-5.7 comes with Spark and Scala. Python can be installed on the VM, but a word of caution, based on a problem I ran into after installing Scala in Eclipse.

The jar files that the Scala installation used had a version conflict with the VM's jar files.

I am not sure whether the same problem will occur with Python. If it does not, you are fine to install Python on the VM.
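
On the clustering part of the question, here is a minimal sketch of how k-means over a Hive table could look in PySpark. It assumes that PySpark is working on the VM, that the Spark version is 1.6 (which I believe is what CDH 5.7 ships, so it uses HiveContext and pyspark.mllib), and that the columns you pick are numeric. The function names to_features and cluster_hive_table are made up for illustration, and I have not run this on the quickstart VM itself.

```python
def to_features(row):
    """Turn one row of numeric columns into a list of floats.

    Returns None if any value is NULL so the row can be filtered out.
    """
    if any(value is None for value in row):
        return None
    return [float(value) for value in row]


def cluster_hive_table(sc, table, columns, k):
    """Run MLlib k-means on the given numeric columns of a Hive table.

    sc is an existing SparkContext; table, columns and k are supplied
    by the caller (hypothetical names in the usage note below).
    """
    # Imported here so to_features can be tried without Spark installed.
    from pyspark.sql import HiveContext
    from pyspark.mllib.clustering import KMeans

    hive = HiveContext(sc)
    # Read only the columns we want to cluster on, as an RDD of Rows.
    rows = hive.table(table).select(*columns).rdd
    # Convert each Row to a list of floats and drop rows containing NULLs.
    features = rows.map(to_features).filter(lambda f: f is not None).cache()
    model = KMeans.train(features, k, maxIterations=20)
    # predict on an RDD returns an RDD of cluster indices, one per row.
    return model, model.predict(features)
```

In the pyspark shell, where sc already exists, a call could look like model, labels = cluster_hive_table(sc, "customers", ["age", "income"], 3), with "customers", "age" and "income" standing in for your own table and column names. model.clusterCenters then gives the centre of each cluster.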