Hive on Spark Issue

Cluster Config:

=========

CDH 5.5

Hive 1.1

Spark 1.5

 

I'm following the guidelines at
http://www.cloudera.com/documentation/enterprise/latest/topics/admin_hos_oview.html to set up Hive to execute on the Spark engine.

Simple SELECT statements and GROUP BY queries work fine. When I run a multi-join Hive statement on the MR engine, it completes in a minute, but the same statement on the Spark engine runs for hours and then fails with "ExecutorLostFailure (executor 2 lost)".

Any help is much appreciated.

Re: Hive on Spark Issue

Super Collaborator

Hive on Spark is not officially supported, and what you see is one of those cases: certain queries are slower, take more memory, or fail. That is why it is not supported yet. We are working hard to fix and tune these use cases. Until that is done, the only workaround is to fall back on the MR execution engine.

 

Wilfred
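
For readers who want to apply the fallback above per query rather than cluster-wide, here is a minimal sketch. It assumes the session-level Hive property hive.execution.engine is used to choose the engine; the helper name with_engine and the sample query are hypothetical and only build the script text, they do not talk to Hive.

```python
# Hypothetical helper (not part of Hive or Spark): builds the text of a Hive
# session script that pins the execution engine before running one query.
ALLOWED_ENGINES = {"mr", "spark"}  # only the two engines discussed in this thread


def with_engine(query, engine="mr"):
    """Return HiveQL that sets hive.execution.engine, then runs `query`."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError("unsupported engine: %r" % engine)
    # Normalize the query so the result ends with exactly one semicolon.
    statement = query.strip().rstrip(";").rstrip()
    if not statement:
        raise ValueError("query is empty")
    return "SET hive.execution.engine=%s;\n%s;" % (engine, statement)


if __name__ == "__main__":
    # Placeholder query; the table and column names are made up for illustration.
    print(with_engine("SELECT a.id FROM a JOIN b ON a.id = b.id"))
```

The generated text can be pasted into a Hive session or saved to a script file, so the slow multi-join statement runs on MR while other queries stay on Spark.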

Re: Hive on Spark Issue

Explorer

I am not able to run Hive SELECT statements from Spark.

 

I am able to run SHOW DATABASES and SHOW TABLES using the Hive SQL context.

 

Syntax I tried: sqlContext.sql("FROM table SELECT state ")

 

Error:

DataTypeException: Unsupported dataType: char(1). If you have a struct and a field name of it has any special characters , please use backticks (`) to quote that field name, e.g. `x+y`. Please note that backtick itself is not supported in a field name.

 

 

Can you please help?

 

Re: Hive on Spark Issue

Explorer

How about using Spark SQL as that supports tables in Hive metastore?
