Explorer
Posts: 7
Registered: ‎12-01-2015

Spark 2.x and Hive

Is there a way to get Hive queries to run on Spark 2.x with CDH 5.10.x or higher?

 

This post makes me think there is:

 

http://community.cloudera.com/t5/Community-News-Release/ANNOUNCE-Spark-2-0-Release-2/m-p/51464/highl...

 

But then this doc link makes me think there is not (yet):

 

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#hive_on_spark

 

If not, is there any info on when we will be able to run Hive queries on CDH using Spark 2.x?

Cloudera Employee
Posts: 29
Registered: ‎11-04-2015

Re: Spark 2.x and Hive

Hello Medloh,

 

At the time of writing (the latest versions are CDH 5.13.1 and Spark 2.2.x), Hive on Spark2 is not supported. See our documentation:

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_hive_ki.html#hive_on_s...

 

"Hive-on-Spark is a CDH component that has a dependency on Spark 1.6. Because CDH components do not have any dependencies on Spark 2, Hive-on-Spark does not work with the Cloudera Distribution of Apache Spark 2."

 

The Spark 2 announcement you referenced was about a Hive issue inside Spark2 itself, meaning one that appeared when using HiveContext from a Spark2 application. It was not about running Hive on Spark2.

 

We currently have no plans or schedule for supporting Hive on Spark2. Until then, you can of course use Spark2 (spark2-shell, spark2-submit) and execute Hive queries using HiveContext.
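
As a rough illustration of that workaround, here is a minimal PySpark sketch. It assumes the Spark2 gateway on the host can reach the Hive metastore, and it uses SparkSession with enableHiveSupport(), which in Spark 2.x is the preferred entry point (HiveContext still exists there but is deprecated). The table name default.sample_07, the application name, and both helper functions are placeholders for illustration, not anything from the documentation linked above.

```python
def hive_enabled_session(app_name="hive-from-spark2"):
    """Create a SparkSession that can see Hive metastore tables (Spark 2.x)."""
    # Imported lazily so this file can be loaded on a host without PySpark.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)        # hypothetical application name
        .enableHiveSupport()      # gives spark.sql() access to Hive tables
        .getOrCreate()
    )


def count_rows_sql(table):
    """Build a simple HiveQL statement for the given table name."""
    return "SELECT COUNT(*) AS n FROM {}".format(table)


if __name__ == "__main__":
    spark = hive_enabled_session()
    # List the databases known to the Hive metastore.
    spark.sql("SHOW DATABASES").show()
    # 'default.sample_07' is a placeholder; substitute one of your own tables.
    spark.sql(count_rows_sql("default.sample_07")).show()
    spark.stop()
```

Saved as, say, hive_query.py, this would be launched with spark2-submit hive_query.py. Note that the query is executed by Spark SQL reading the Hive metastore, not by the Hive-on-Spark execution engine.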

 

I hope this answers your question.

 

Best regards

 Miklos Szurap

Customer Operations Engineer
