
Machine learning on large datasets




I am an experienced Java programmer and want to move into data science. I would like to apply machine learning algorithms to very large datasets (a few TB to PB). Which language is preferable: Scala, Python, or R?


Re: Machine learning on large datasets


I recommend Scala or Python.

R by itself only runs standalone, so for data of this size you would have to combine it with something else: Spark + R or Python + R.
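To give a rough idea of what the Python route looks like on Spark, here is a minimal sketch using PySpark's MLlib pipeline API. It assumes the `pyspark` package and an available Spark installation. The function name `train_classifier`, the Parquet input, and the column names are hypothetical and only for illustration; logistic regression is just one possible algorithm.

```python
def train_classifier(spark, data_path, feature_cols, label_col="label"):
    """Fit a logistic regression on a Parquet dataset with Spark MLlib.

    `spark` is an existing SparkSession. The path and column names are
    placeholders supplied by the caller, not values from the thread.
    """
    # Imported inside the function so this file can be loaded on a machine
    # that does not have pyspark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    # Spark reads the files in parallel across the cluster, so the data
    # does not have to fit on a single machine.
    df = spark.read.parquet(data_path)

    # MLlib estimators expect the features packed into one vector column.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol=label_col)

    # Fit both stages as one pipeline and return the fitted model.
    return Pipeline(stages=[assembler, lr]).fit(df)
```

You would call it with a session created through `SparkSession.builder.getOrCreate()`, a path your cluster can reach, and the list of numeric feature columns in your data. The equivalent Scala code uses the same `Pipeline`, `VectorAssembler`, and `LogisticRegression` classes, so the choice between the two is mostly about which language you prefer.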
