05-03-2017 09:36 AM - edited 05-03-2017 09:47 AM
Is there a way to get Hive queries to run on Spark 2.x with CDH 5.10.x or higher?
This post makes me think there is:
But then this doc link makes me think there is not (yet):
If not, is there any info on when we will be able to run Hive queries on CDH using Spark 2.x?
01-04-2018 02:02 AM
At the time of writing (the latest versions are CDH 5.13.1 and Spark 2.2.x), Hive on Spark2 is not supported. See our documentation:
"Hive-on-Spark is a CDH component that has a dependency on Spark 1.6. Because CDH components do not have any dependencies on Spark 2, Hive-on-Spark does not work with the Cloudera Distribution of Apache Spark 2."
The Spark 2 announcement you referenced was about a Hive-related issue inside Spark2 itself, meaning it appeared when using HiveContext within a Spark2 application. It was not about running Hive with Spark2 as its execution engine.
We currently have no plans or schedule for supporting Hive on Spark2. Until then, you can of course use Spark2 (spark2-shell, spark2-submit) and execute Hive queries from within it using HiveContext.
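For illustration, here is a minimal PySpark sketch of running a Hive query from a Spark2 application. It is only a sketch under a few assumptions: the function name run_hive_query and the default application name are hypothetical, the cluster's Hive configuration (hive-site.xml) is assumed to be visible to Spark2, and it uses SparkSession with enableHiveSupport(), which Spark 2.x provides as the replacement for the now-deprecated HiveContext.

```python
def run_hive_query(sql, app_name="hive-query-from-spark2"):
    """Run one HiveQL statement through Spark2 and return the result rows.

    Hypothetical helper for illustration only; not part of CDH or Spark.
    """
    # Imported lazily so the module can be loaded on a machine without
    # pyspark; when launched through spark2-submit the import is available.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # use the Hive metastore for table lookups
        .getOrCreate()
    )
    try:
        # The statement is executed by Spark SQL against Hive tables,
        # not by Hive's own execution engine.
        return spark.sql(sql).collect()
    finally:
        spark.stop()
```

Saved in a script that calls, for example, run_hive_query("SHOW DATABASES"), this could be launched with spark2-submit. Note that the query is executed by Spark SQL reading Hive metadata, so it is not the same thing as Hive on Spark.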
I hope this answers your question.
Customer Operations Engineer