Spark SQL?
Labels: Apache Hadoop, Apache Spark
Created on 01-26-2016 02:36 PM - edited 09-16-2022 03:00 AM
Do we need to know Spark SQL for the CCA Spark and Hadoop certification?
Created 01-26-2016 02:41 PM
Support for Spark SQL is being added in CDH 5.5. As of today, the exam runs on CDH 5.3.2.
So the answer is "not yet", but that will almost certainly change in the near future.
Watch the Cloudera website: http://www.cloudera.com/training/certification/cca-spark.html
The list of required skills there shows which technologies you will need to know.
Created 02-09-2016 05:52 PM
CDH 5.3.0 ships with Spark 1.2.0, which in turn includes Spark SQL. So I guess every CDH release >= 5.3.0 must support Spark SQL, unless CDH explicitly ships without Spark SQL support...
See http://spark.apache.org/docs/1.2.1/sql-programming-guide.html
Created 02-09-2016 06:12 PM
http://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_rn_spark_ki.html
SparkSQL just exited alpha and is far from stable. As such, SparkSQL is currently considered a “preview” in CDH. We love it and we’re dedicating a lot of engineering resources to bringing it up to our standards, but as I’m sure you’re aware, it’s mainly Scala (PySpark lags), it’s very buggy, and it causes all kinds of havoc (esp. with Hive)... the list goes on.
Once we get it running at scale, we’ll support it fully in our distribution and we’ll test it. But today, it’s just not ready.
