Created 04-29-2016 08:36 AM
Hello experts,
I already have some data in HDFS and some tables in Hive, and I would like to do some data analysis/data mining. But I don't know whether I should use Spark or another analytics tool.
What type of analysis/analytics work can I do with Spark?
Created 04-29-2016 08:39 AM
Hello Pedro
Spark Core is a general-purpose in-memory analytics engine. By adding components such as Spark SQL or Spark ML on top of Spark Core, you can do many interesting analytics or data science modelling tasks, either programmatically or in SQL. Maybe these tutorials can help you with your first steps.
http://hortonworks.com/hadoop-tutorial/hands-on-tour-of-apache-spark-in-5-minutes/
http://hortonworks.com/blog/data-science-hadoop-spark-scala-part-2/
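Since you already have tables in Hive, here is a rough sketch of what querying one of them from Spark SQL could look like in Python. The table name `sales`, the column `region`, and the helper names `top_n_query` and `show_top_groups` are made up for illustration, so replace them with your own. It uses the `SparkSession` entry point from Spark 2.0 and later; on Spark 1.6 the equivalent entry point is `HiveContext`.

```python
def top_n_query(table, group_col, n):
    """Build a HiveQL query that counts rows per group, largest groups first.

    The table and column names are passed in by the caller; nothing here is
    specific to a real schema.
    """
    return (
        "SELECT {col}, COUNT(*) AS n_rows FROM {table} "
        "GROUP BY {col} ORDER BY n_rows DESC LIMIT {n}"
    ).format(col=group_col, table=table, n=int(n))


def show_top_groups(table="sales", group_col="region", n=10):
    """Run the query against the Hive metastore and print the result.

    `sales` and `region` are hypothetical names, not taken from the thread.
    """
    # Imported here so the query helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hive-exploration")
        .enableHiveSupport()  # lets spark.sql() see existing Hive tables
        .getOrCreate()
    )
    try:
        df = spark.sql(top_n_query(table, group_col, n))
        df.show()  # prints the first rows of the result as a text table
    finally:
        spark.stop()
```

You would call `show_top_groups()` from a script submitted with `spark-submit`, or paste the body into the `pyspark` shell. The result is a DataFrame, so you can keep working on it with the DataFrame API or pass it to the machine learning library instead of just printing it.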
Created 04-30-2016 01:25 PM
For more information, search Google for Spark 1.6.1 or see http://spark-1.6.1.org
It goes into more detail about DataFrames, SQL, HiveQL, GraphX, machine learning, R, etc., with examples.