
Hi, can we use Spark SQL with CDH 5.4.1?


New Contributor

Hi,

We have recently started a project on a big data platform. We are using CDH 5.4.1 as the environment to run it.

 

My understanding is that Spark SQL is not supported yet, so I would like to understand why it is not supported. Is there any risk in using it, considering that it is not supported?

 

We do not want to reinvent the wheel. If we know of specific use cases that won't work, we can look for alternatives.

 

We intend to use Spark, Hive, Parquet, and Avro, along with HiveContext from Spark SQL. As Spark SQL is not supported, we need to understand the risks around it, as I believe we are planning to support it from the end of 2015 or early December.

 


Re: Hi , Can we use Spark SQL with CDH 5.4.1

Super Collaborator

Spark SQL is not supported because it is still in flux: a lot changes from release to release, and it is not stable enough for us to consider it supportable.

We cannot say when it will be supported, as that would depend on the progress being made in the Spark project. I think it has been discussed before on this mailing list, but here it is again: Spark uses an older version of Hive than CDH, which has an impact on the Spark SQL side. See the documentation for more information on that.

 

Hive on Spark is in a beta release at the moment and will become part of the supported products soon.

 

Wilfred

Re: Hi , Can we use Spark SQL with CDH 5.4.1

New Contributor

As Spark users, what if we need SQL features on top of our data sitting on HDFS and our Spark infrastructure?

What is the recommendation for that? Spark SQL fits well with existing technology stacks by exposing Spark DataFrames through SQL.

 

When could we see Cloudera supporting Spark SQL as well? 

Re: Hi , Can we use Spark SQL with CDH 5.4.1

Master Collaborator

CDH has always included Spark SQL with Spark, and it works as well as Spark SQL can be made to work with the rest of the distribution, but there are no announced plans to support Spark SQL that I know of.

Re: Hi , Can we use Spark SQL with CDH 5.4.1

Is Cloudera still not supporting Spark SQL in CDH 5.5? Are there any plans to support it?

Re: Hi , Can we use Spark SQL with CDH 5.4.1

Master Collaborator
Spark SQL was supported as of CDH 5.5

Re: Hi , Can we use Spark SQL with CDH 5.4.1

New Contributor

Does that mean the Hive Thrift service running Spark SQL is also supported?

And when can we see Hive on Spark with full features, authorization, etc.?

Re: Hi , Can we use Spark SQL with CDH 5.4.1

Super Collaborator

No, we do not support the Thrift server, as per the documentation: CDH 5.5 Spark release note

Hive on Spark is also still in beta, and we are still finishing features. As per the Hive CDH 5.5 release note, it is thus experimental and things might not work. We cannot provide guidance on the road map for features that are not yet complete.

 

Wilfred