Which option is better to use, spark as an execution engine or spark application with spark SQL

We have a project that currently uses shell scripts and Hive, with Tez as the execution engine. For a POC we tried replacing the shell scripts with Spark and executed the HQLs through Spark. One of the clients came back asking why we would need a Spark application at all, since we can set Spark as the execution engine and keep running our regular shell scripts and Oozie workflows. Which is the better option to choose:

  1. Set hive.execution.engine=spark; or
  2. Write a Spark application and execute the HQLs through the Spark APIs.

If performance is the same for both, why do we need to write code in Spark? What is the advantage of writing a Spark application using Spark SQL?

Re: Which option is better to use, spark as an execution engine or spark application with spark SQL

Hi @HDave

When Spark SQL uses Hive

Spark SQL can use the Hive metastore to get the metadata of the data stored in HDFS. This metadata lets Spark SQL better optimize the queries it executes. Here Spark is the query processor.
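As a rough illustration of this first case, here is a minimal PySpark sketch, assuming PySpark is installed and a Hive metastore is reachable through the cluster configuration. The helper names (`split_hql_statements`, `run_hql_script`, `build_session`) and the application name are made up for this example; only `SparkSession.builder`, `enableHiveSupport()` and `spark.sql()` are actual Spark APIs. The splitter is deliberately naive and would break on semicolons inside string literals.

```python
from typing import List


def split_hql_statements(script: str) -> List[str]:
    """Naively split an HQL script on ';' and drop empty statements."""
    return [stmt.strip() for stmt in script.split(";") if stmt.strip()]


def run_hql_script(spark, script: str) -> list:
    """Run each statement through spark.sql() and collect the returned DataFrames."""
    return [spark.sql(stmt) for stmt in split_hql_statements(script)]


def build_session(app_name: str = "hql-via-spark-sql"):
    """Create a SparkSession that reads table metadata from the Hive metastore."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # use the Hive metastore as the catalog
        .getOrCreate()
    )
```

In a job submitted with spark-submit one would call `spark = build_session()` and then `run_hql_script(spark, open("daily_load.hql").read())`, where the file name is hypothetical. Each result is a DataFrame, so it can be processed further with the DataFrame API in the same program.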

When Hive uses Spark (see the JIRA entry HIVE-7292)

Here the data is accessed via Spark, and Hive is the query processor, so we have all the design features of Spark Core to take advantage of. This is a major improvement for Hive, but there are version dependencies between Spark and Hive. Link: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
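In this second case the existing HQL scripts stay as they are and only the engine setting changes. Below is a sketch of how a wrapper might build the Hive CLI call, assuming the `hive` command is on the PATH and the cluster already has Hive on Spark configured. `hive_command` and the script name are hypothetical; `--hiveconf` and `-f` are standard Hive CLI options.

```python
from typing import List


def hive_command(script_path: str, engine: str = "spark") -> List[str]:
    """Build a Hive CLI invocation that runs an HQL file on the given engine."""
    if engine not in {"mr", "tez", "spark"}:
        raise ValueError(f"unsupported execution engine: {engine}")
    return [
        "hive",
        "--hiveconf", f"hive.execution.engine={engine}",  # same effect as SET in the script
        "-f", script_path,
    ]
```

`subprocess.run(hive_command("daily_load.hql"), check=True)` would then run the unchanged HQL file with Spark as the engine, and passing `engine="tez"` gives the current setup.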

There is already a related thread on HCC that you can look at: https://community.hortonworks.com/questions/54740/hive-on-tez-or-hive-query-using-spark-sql.html

Re: Which option is better to use, spark as an execution engine or spark application with spark SQL

Hi @HDave

Hope you are doing well. Did you get the answer you were looking for?

If yes, can you please provide feedback and mark the thread as closed?

Thanks

Vikas Srivastava
