Created 10-19-2017 04:26 PM
Hi experts,
I'm curious about the differences between Spark MLlib/ML and H2O in terms of algorithm implementation, performance, and usability. Which one is better for which kinds of use cases?
Thanks a lot in advance.
Created 10-19-2017 04:57 PM
You will have to run your algorithms on your cluster with your data to get a reasonable performance analysis.
What language are you looking at?
The Python Spark interface is pretty clean.
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science.html
H2O has a few more algorithms than Spark MLlib.
https://spark.apache.org/docs/latest/ml-classification-regression.html
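If it helps, here is a rough sketch of how you might time training runs on your own data. It only uses the Python standard library. The helper name `time_fit` is made up for this example, and the workload at the bottom is a stand-in so the snippet runs anywhere. On a real cluster you would pass in your own training call, for example a lambda wrapping a Spark estimator's `fit` or an H2O model's `train` (the variable names in the comments are placeholders, not anything from your setup).

```python
import statistics
import time
from typing import Callable, Dict


def time_fit(fit: Callable[[], object], repeats: int = 3) -> Dict[str, float]:
    """Call `fit` `repeats` times and summarize wall-clock seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Placeholder examples of what `fit` could wrap:
        #   lambda: spark_estimator.fit(train_df)
        #   lambda: h2o_model.train(x=features, y=target, training_frame=frame)
        fit()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real training call on your cluster.
    print(time_fit(lambda: sum(i * i for i in range(100_000))))
```

Running each library several times on the same data and comparing the medians gives a fairer picture than a single run, since the first run often includes startup and caching costs.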
Created 10-19-2017 08:40 PM
Thanks @Timothy Spann for your answer. These links are really helpful. I used Python for Spark MLlib, so I will use it for H2O as well.