Spark 2.x and Hive

Is there a way to get Hive queries to run on Spark 2.x with CDH 5.10.x or higher?

 

This post makes me think there is:

 

http://community.cloudera.com/t5/Community-News-Release/ANNOUNCE-Spark-2-0-Release-2/m-p/51464/highl...

 

But then this doc link makes me think there is not (yet):

 

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#hive_on_spark

 

If not, is there any info on when we will be able to run Hive queries on CDH using Spark 2.x?

Re: Spark 2.x and Hive

Hello Medloh,

 

At the time of writing (the latest versions are CDH 5.13.1 and Spark 2.2.x), Hive on Spark2 is not supported. See our documentation:

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_hive_ki.html#hive_on_s...

 

"Hive-on-Spark is a CDH component that has a dependency on Spark 1.6. Because CDH components do not have any dependencies on Spark 2, Hive-on-Spark does not work with the Cloudera Distribution of Apache Spark 2."

 

The Spark 2 announcement you referenced was about a Hive-related issue inside Spark2 itself, that is, one that appeared when using HiveContext within a Spark2 application. It was not about the Hive-on-Spark execution engine.

 

We currently have no plans or schedule for supporting Hive on Spark2. Until then, you can of course use Spark2 (spark2-shell, spark2-submit) and execute Hive queries using HiveContext.
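
For illustration, here is a minimal PySpark sketch of that workaround. It assumes the script is launched with spark2-submit on a cluster where Spark2 can reach the Hive metastore. In Spark 2.x, HiveContext still exists but is deprecated in favor of a SparkSession built with enableHiveSupport(), so the sketch uses the latter. The file name hive_query.py, the helper names, and the table default.sample_07 are hypothetical placeholders, not taken from the documentation above. Note that the query is executed by Spark SQL against Hive's table metadata; this is not the same as Hive using Spark as its execution engine.

```python
def build_session(app_name="hive-from-spark2"):
    """Create a SparkSession with Hive support (the Spark 2 successor to HiveContext)."""
    # Imported lazily so the helpers in this file can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # use the Hive metastore for table metadata
        .getOrCreate()
    )


def run_hive_query(spark, query):
    """Run a HiveQL statement through Spark SQL and return the result rows as a list."""
    return spark.sql(query).collect()


def main():
    spark = build_session()
    try:
        # default.sample_07 is a hypothetical table name; substitute one of your own.
        for row in run_hive_query(spark, "SELECT * FROM default.sample_07 LIMIT 10"):
            print(row)
    finally:
        spark.stop()
```

To try it, add a call to main() at the bottom of the file and submit it with something like: spark2-submit hive_query.py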

 

I hope this answers your question.

 

Best regards

 Miklos Szurap

Customer Operations Engineer