Created on 05-03-2017 09:36 AM - edited 05-03-2017 09:47 AM
Is there a way to get Hive queries to run on Spark 2.x with CDH 5.10.x or higher?
This post makes me think there is:
But then this doc link makes me think there is not (yet):
https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#hive_on_spark
If not, is there any info on when we will be able to run Hive queries on CDH using Spark 2.x?
Created 01-04-2018 02:02 AM
Hello Medloh,
At the time of this writing (the latest versions are CDH 5.13.1 / Spark 2.2.x), Hive on Spark2 is not supported. See our documentation:
"Hive-on-Spark is a CDH component that has a dependency on Spark 1.6. Because CDH components do not have any dependencies on Spark 2, Hive-on-Spark does not work with the Cloudera Distribution of Apache Spark 2."
The referenced Spark 2 announcement was about a Hive issue inside Spark2 (that is, an issue that appeared when using HiveContext inside Spark2), not about Hive on Spark.
Currently we have no plans or schedule for when Hive on Spark2 will be supported. Until then, you can of course use Spark2 (spark2-shell, spark2-submit) and execute Hive queries using HiveContext.
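To illustrate that workaround, here is a minimal PySpark sketch, not an official Cloudera example. It assumes Spark2 is installed and can see the cluster's Hive metastore configuration. In Spark 2.x, HiveContext is deprecated in favor of a SparkSession built with enableHiveSupport(), so the sketch uses that entry point. The function name run_hive_query and the application name are hypothetical, chosen only for illustration.

```python
def run_hive_query(query, app_name="hive-query-from-spark2"):
    """Run one HiveQL statement through Spark 2 and return the result rows."""
    # Imported lazily so this module can be loaded on hosts without Spark.
    from pyspark.sql import SparkSession

    # enableHiveSupport() connects the session to the Hive metastore;
    # it is the Spark 2.x replacement for the older HiveContext.
    spark = (
        SparkSession.builder
        .appName(app_name)          # hypothetical application name
        .enableHiveSupport()
        .getOrCreate()
    )
    try:
        # The query is executed by Spark SQL, not by HiveServer2.
        return spark.sql(query).collect()
    finally:
        spark.stop()
```

Saved as a script with a call such as run_hive_query("SHOW DATABASES") appended, this could be launched with spark2-submit. Note that the query runs in Spark SQL against Hive tables, which is different from Hive itself using Spark2 as its execution engine.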
I hope this answers your question.
Best regards
Miklos Szurap
Customer Operations Engineer
Created 08-26-2019 10:41 AM
Hi @mszurap
I am trying to use Hive on Spark2, but I am not able to activate it.
I am using CDH 5.16.2 and installed Spark2 (SPARK2_ON_YARN-2.4.0) with the help of the CSD.
May I know whether Hive on Spark2 is still not supported?
Thanks in advance.
Created 08-26-2019 01:08 PM
Hi,
With CDH 5.x, Hive on Spark2 is still not supported.
In CDH 6.x, only Spark2 is available (there is no longer a separate parcel for it), so there Hive on Spark runs on Spark2 and is supported.
Created 08-26-2019 01:15 PM
Thanks @mszurap for the quick reply.