Difference between Spark MLlib/ML and H2O

Super Collaborator

Hi experts,

I'm curious about the differences between Spark MLlib/ML and H2O in terms of algorithm implementation, performance, and usability. Which one is better suited to which kinds of use cases?

Thanks a lot in advance.

Accepted solution

Re: Difference between Spark MLlib/ML and H2O

Super Guru

To get a meaningful performance comparison, you will have to run your algorithms on your own cluster with your own data.
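One way to set up such a comparison is a small timing harness that runs each framework's training call several times and reports the median wall-clock time. This is only a rough sketch: `time_fit` and `compare` are hypothetical helper names, and the two lambdas are stand-in workloads, not real Spark or H2O calls.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_fit(fit: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run a zero-argument training callable `repeats` times.

    Returns the wall-clock duration of each run in seconds.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fit()
        durations.append(time.perf_counter() - start)
    return durations


def compare(candidates: Dict[str, Callable[[], object]],
            repeats: int = 3) -> Dict[str, float]:
    """Return the median training time in seconds for each named candidate."""
    return {
        name: statistics.median(time_fit(fit, repeats))
        for name, fit in candidates.items()
    }


if __name__ == "__main__":
    # Stand-in workloads (assumption): replace each lambda with a call that
    # trains the same algorithm on the same data in each framework.
    results = compare({
        "framework_a": lambda: sum(i * i for i in range(200_000)),
        "framework_b": lambda: sorted(range(200_000), reverse=True),
    })
    for name, seconds in results.items():
        print(f"{name}: {seconds:.4f} s")
```

In practice each lambda would wrap something like `LogisticRegression().fit(train_df)` from `pyspark.ml.classification` on the Spark side and an H2O estimator's `train(x=..., y=..., training_frame=...)` on the H2O side. Because Spark evaluates DataFrames lazily, it is probably fairer to cache and materialize the training data before timing, so that data loading is not counted as training time.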

What language are you looking at?

The Python Spark interface is pretty clean.

http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science.html

H2O has a few more algorithms than Spark MLlib.

https://spark.apache.org/docs/latest/ml-classification-regression.html

Re: Difference between Spark MLlib/ML and H2O

Super Collaborator

Thanks @Timothy Spann for your answer. These links are really helpful. I used Python for Spark MLlib, so I will use it for H2O as well.
