Created on 07-29-2017 11:11 PM - edited 09-16-2022 05:01 AM
Hi there,
I have compiled Hue and configured HiveServer2 and Spark, but all of my jobs run in MapReduce mode instead of Hive on Spark mode, even though I set 'hive.execution.engine=spark' and 'spark.master=yarn'. How can I make them run on Spark? I think this is handled by Oozie, but I cannot find anything in it related to running Hive on Spark on YARN.
Thanks in advance,
Mobin
Created 07-30-2017 07:05 PM
Setting the Hive execution engine to Spark is sufficient to run a Hive query on Spark:
set hive.execution.engine=spark;
But where did you set this, and from where did you try to execute your query? There are 3 options:
a. In the CLI: log in with the Hive/Beeline shell and run the above set command. This is effective only for that session, so you cannot control Oozie with it, because Oozie will start a new session.
b. In Hue: log in to Hue, go to the Hive query editor and run the above command. This is also session-specific, so it does not affect Oozie either.
c. CM -> Hive -> Configuration -> set hive.execution.engine to spark. This is a permanent setting and it applies to all sessions, including Oozie (for non-CM setups, see the hive-site.xml sketch just below).
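If you are not running CM, the equivalent permanent change would normally be made by hand in hive-site.xml on the HiveServer2 host, followed by a HiveServer2 restart. A minimal sketch (the file path depends on your install; spark.master is only needed if it is not already set elsewhere):

<!-- hive-site.xml on the HiveServer2 host -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>yarn</value>
</property>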
In your case, if you want to try it temporarily for a specific query, run the 'set' command in Oozie itself along with your query. For example:
set hive.execution.engine=spark;
select * from test_table;
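If the query is submitted through an Oozie Hive action, one way to apply this is to put both lines above into the script file that the action runs, so the engine is switched inside the same session Oozie starts. A rough workflow sketch, assuming a script named query.hql that contains the set command plus your query (names and paths here are illustrative, not from your setup):

<workflow-app name="hive-on-spark-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-node"/>
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- query.hql begins with: set hive.execution.engine=spark; -->
            <script>query.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>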
Created 07-30-2017 10:00 PM
Thanks for your answer.
I have tested both a and b successfully, but the point is that I am not using CM (CDH); I compiled Hue myself and run it standalone. I can run my Hive on Spark query via Beeswax successfully, but when I save my document and design a job, the job runs in MR mode (the hive.execution.engine property is set both in my document and in job.properties).
How can I do that?
Thanks,
Mobin