Created 02-04-2016 10:27 AM
Cluster Config:
=========
CDH 5.5
Hive 1.1
Spark 1.5
I'm following the guidelines at
http://www.cloudera.com/documentation/enterprise/latest/topics/admin_hos_oview.html to set up Hive to execute on the Spark engine.
Simple select statements and group by queries work fine. When I run a multi-join Hive statement on the MR engine it completes in a minute, but the same statement on the Spark engine runs for hours and fails with "ExecutorLostFailure (executor 2 lost)".
Any help is much appreciated.
Created 02-25-2016 07:46 PM
Hive on Spark is not officially supported, and what you are seeing is one of those cases: certain queries are slower, take more memory, or fail. That is why it is not supported yet. We are working hard to fix and tune these use cases. Until that is done, the only workaround is to fall back on the MR execution engine.
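For reference, the fallback can be done per session rather than cluster-wide. This is a minimal sketch, assuming the statements are run from a Hive session (Beeline or the Hive CLI); the comment marks where the failing query would go, and nothing here is specific to the query in the original post.

```sql
-- Switch this session back to MapReduce before running the problem query.
set hive.execution.engine=mr;

-- ... run the multi-join statement here ...

-- Optionally switch back to Spark for queries that are known to work there.
set hive.execution.engine=spark;
```

Setting the property at the session level leaves the cluster default untouched, so other queries can keep using whichever engine is configured.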
Wilfred
Created on 03-03-2016 04:52 AM - edited 03-03-2016 04:53 AM
I am not able to run Hive select statements from Spark.
I am able to run show databases and show tables using the Hive SQL context.
Syntax I tried: sqlContext.sql("FROM table SELECT state ")
Error:
DataTypeException: Unsupported dataType: char(1). If you have a struct and a field name of it has any special characters , please use backticks (`) to quote that field name, e.g. `x+y`. Please note that backtick itself is not supported in a field name.
Can you please help?
Created 03-07-2016 01:49 PM
How about using Spark SQL, since it supports tables in the Hive metastore?