Is there a way to get Hive queries to run on Spark 2.x with CDH 5.10.x or higher?
This post makes me think there is:
But then this doc link makes me think there is not (yet):
If not, is there any info on when we will be able to run Hive queries on CDH using Spark 2.x?
At the time of this writing (the latest versions are CDH 5.13.1 / Spark 2.2.x), Hive on Spark2 is not supported. See our documentation:
"Hive-on-Spark is a CDH component that has a dependency on Spark 1.6. Because CDH components do not have any dependencies on Spark 2, Hive-on-Spark does not work with the Cloudera Distribution of Apache Spark 2."
The Spark 2 announcement you referenced concerned a Hive issue inside Spark2, that is, one that appeared when using HiveContext inside Spark2.
Currently we have no plans or schedule for supporting Hive on Spark2. Until then, you can of course use Spark2 (spark2-shell, spark2-submit) and execute Hive queries using HiveContext.
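As a rough illustration of running Hive queries from Spark2, here is a minimal PySpark sketch. It assumes the Spark2 gateway on the host can see the Hive metastore configuration. It uses SparkSession with enableHiveSupport(), which is the Spark 2.x entry point (HiveContext still exists in 2.x but is deprecated). The application name and the example query are placeholders, not anything from your cluster.

```python
def build_session(app_name="hive-from-spark2"):
    """Create a SparkSession that can read the Hive metastore.

    app_name is a placeholder chosen for this sketch.
    """
    # Imported here so the helper below can be loaded without pyspark present.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # lets spark.sql() resolve Hive tables
        .getOrCreate()
    )


def run_hive_query(spark, query):
    """Run one HiveQL statement through Spark SQL and return the rows."""
    return spark.sql(query).collect()


def main():
    spark = build_session()
    try:
        # Placeholder query; replace it with your own HiveQL.
        for row in run_hive_query(spark, "SHOW DATABASES"):
            print(row)
    finally:
        spark.stop()
```

To try it, you could save this to a file with a final main() call added at the bottom and pass that file to spark2-submit. Note that this runs the query with Spark SQL's own engine against the Hive metastore; it is not the same thing as Hive on Spark.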
I hope this answers your question.
Customer Operations Engineer
I am trying to use Hive on Spark2, but I am not able to activate it.
I am using CDH 5.16.2 and have installed Spark2 (SPARK2_ON_YARN-2.4.0, with the help of the CSD).
May I know whether Hive on Spark2 is still not supported?
Thanks in advance.
With CDH 5.x, Hive on Spark2 is still not supported.
In CDH 6.x, only Spark2 is available (there is no longer a separate parcel for it), so Hive on Spark there runs on Spark2 and is supported.