Created on 10-10-2014 01:30 AM - edited 09-16-2022 02:09 AM
Hi,
Could anyone tell me which language is best for working with Spark (Scala, Java, or Python), and why?
Ravi
Created 10-10-2014 01:36 AM
Scala is the native language of Spark. All else being equal, Spark is easiest to use from Scala. Of course, not everyone knows Scala or uses it in other projects.
Using it from Java is only slightly less convenient. You will write more code, since before Java 8 every function you pass to Spark has to be written as a verbose anonymous class. All of the Scala APIs can be called from Java too, although some look weird when accessed from Java. Most APIs have a Java-friendlier version where necessary to ease this integration.
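To make the verbosity point concrete, here is a minimal sketch that runs without Spark. The `Fn` interface and the `mapAll` helper are hypothetical stand-ins for the one-method function interfaces and the map-style operations you would use from Spark's Java API; they are not Spark classes. The same function is written once as an anonymous class and once as a Java 8 lambda.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LambdaVerbosity {

    // Hypothetical stand-in for a one-method function interface
    // (not a real Spark class).
    interface Fn<T, R> {
        R call(T input);
    }

    // Pre-Java 8 style: an anonymous class wrapping a single expression.
    static final Fn<String, Integer> LENGTH_ANONYMOUS =
        new Fn<String, Integer>() {
            @Override
            public Integer call(String s) {
                return s.length();
            }
        };

    // Java 8 style: the same function as a lambda.
    static final Fn<String, Integer> LENGTH_LAMBDA = s -> s.length();

    // Hypothetical helper that applies a function to every element,
    // standing in for a map-style operation on a distributed collection.
    static <T, R> List<R> mapAll(List<T> items, Fn<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(f.call(item));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "scala", "java");
        System.out.println(mapAll(words, LENGTH_ANONYMOUS)); // prints [5, 5, 4]
        System.out.println(mapAll(words, LENGTH_LAMBDA));    // prints [5, 5, 4]
    }
}
```

Both versions behave identically. The difference is only the amount of boilerplate, which adds up quickly in a job that chains many small functions.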
Python is probably the least convenient of the three since it is not JVM-based. There is a runtime overhead to translating back and forth between Spark and Python, and not all APIs are 'translated' to Python. Still, it works, and is useful if, well, you know Python and want to use it.
Created 10-10-2014 01:51 AM
Thanks a lot for the reply. Could you please tell me the best material to start learning Scala and Spark?
Created 10-10-2014 01:55 AM
There is a Coursera course on Scala right now -- you can still watch the videos although it started weeks ago: https://www.coursera.org/course/progfun
There are a number of examples and tutorials on the web concerning Spark. Really, take your pick after searching Google. Here's a blog post I wrote with a quick example: http://blog.cloudera.com/blog/2014/03/why-apache-spark-is-a-crossover-hit-for-data-scientists/
Created 10-10-2014 01:57 AM
Thanks a lot...