Hive on Tez or Hive query using Spark SQL

Chandra — Sat, 03 Sep 2016 02:44:01 GMT

Hi,

Can you please let me know which one is faster: Hive on Tez, or accessing Hive using Spark SQL?

Thanks,

Chandra

Re: Hive on Tez or Hive query using Spark SQL

cstanca — Sat, 03 Sep 2016 03:07:34 GMT

@chandramouli muthukumaran

Just to clarify, SparkSQL does not access or use the Hive engine. It only consumes the metadata for Hive data structures.

Assuming both can execute the query functionally (SparkSQL is quite limited functionally compared with Hive) and the query needs to churn through 40 TB of data, I would say Hive on Tez is likely your best choice. That is also driven by the cost of the RAM for your Spark cluster, which comes on top of Hive's requirements, because I assume you will still have some cases where running Hive is needed. I have noticed that when the amount of data is less than 1 TB, SparkSQL outperforms Hive on Tez.

Anyhow, be aware that with HDP 2.5, LLAP is in Tech Preview and will soon be GA. If you were asking about Hive on LLAP vs. SparkSQL, I would say without hesitation: for most queries, Hive on LLAP. Again, for some sophisticated queries with a limited amount of data and limited function, SparkSQL may be the winner, but in the big picture it is too expensive to maintain both approaches, and I would still choose Hive on Tez and LLAP over SparkSQL for most cases that deal with big data. Otherwise, 1 TB does not need Hadoop for fast queries.
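
The rule of thumb in this reply can be written down as a small sketch. This is only a paraphrase of the opinion above, not a benchmark: the function name `suggest_engine`, the `llap_available` flag, and the hard 1 TB cutoff are assumptions made for illustration, and the sketch ignores the caveat that SparkSQL may still win on some sophisticated queries over small data even when LLAP is available.

```python
# Rule of thumb paraphrased from the reply above; not a benchmark result.
# The names and the hard cutoff are illustrative assumptions.
ONE_TB = 1.0  # threshold in terabytes mentioned in the reply


def suggest_engine(data_tb: float, llap_available: bool = False) -> str:
    """Return the engine the reply would likely favor for a given scan size."""
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if llap_available:
        # The reply prefers Hive on LLAP for most queries once LLAP is an option.
        return "Hive on LLAP"
    if data_tb < ONE_TB:
        # The reply's author observed SparkSQL outperforming Tez below about 1 TB.
        return "SparkSQL"
    # For large scans (the reply uses 40 TB as its example), Hive on Tez is favored.
    return "Hive on Tez"


if __name__ == "__main__":
    for size in (0.5, 40):
        print(size, "TB:", suggest_engine(size))
```

Real workloads should be measured on the actual cluster; query shape, file format, and available memory can matter more than raw data size.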

Re: Hive on Tez or Hive query using Spark SQL

Chandra — Sat, 03 Sep 2016 03:16:55 GMT

Thanks for your valuable information. So your recommendation is to go for Hive on LLAP rather than SparkSQL. Please correct me if I am wrong.

Re: Hive on Tez or Hive query using Spark SQL

Chandra — Sat, 03 Sep 2016 04:19:45 GMT

Also, what is the need to run Hive queries on SparkSQL when Hive on Tez can run much faster?
