Choosing Between Spark and Tez: Use Cases for Large Datasets and Hive Integration

abajwa — Tue, 12 May 2026 20:55:39 GMT

What's the difference between the two?

Re: Spark vs Tez?

vshukla — Tue, 08 Dec 2015 13:23:12 GMT

There are many differences between the two.

Spark:

Spark provides an API, an execution engine, and packages (SQL, ML, Graph) on top of the core Spark API
Spark is application-developer facing
Spark's abstractions are RDD, DataFrame, and now Dataset (as of Spark 1.6)

Tez:

Tez is the execution engine for Hive and Pig
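
As a minimal sketch of what "execution engine for Hive" means in practice: Hive selects its engine through the `hive.execution.engine` property, so switching a session to Tez is a configuration change rather than a code change. The table name `web_logs` below is hypothetical and only there to show a query that would run on the chosen engine.

```sql
-- Show the engine currently configured for this Hive session
SET hive.execution.engine;

-- Run subsequent queries in this session on Tez
SET hive.execution.engine=tez;

-- Hypothetical table, used only to illustrate a query that would now run on Tez
SELECT COUNT(*) FROM web_logs;
```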

Bottom line: if you are asking for the difference between Spark and Tez, consider using Spark.

Re: Spark vs Tez?

nshawa — Tue, 08 Dec 2015 13:26:27 GMT

From what we have seen in the field and during some customer testing, Spark SQL (1.4.x) was, at the time of testing, generally 50% to 200% faster when querying small datasets. By small we mean anything under 100 GB, which is usually great for data discovery, data wrangling, testing things out, or even running a production use case where the datasets are numerous but relatively small.

The bigger the table, the better Tez does, especially when joins are not used effectively or when we are scanning a single large table. If you are in the BI space, SLAs are required, and you can't afford to have a query break and start over, Tez is able to shine: it is rigid and stable, and the bigger the datasets, the better its performance gets compared to Spark. At around 250 GB you will see very similar execution times. Of course, this depends on how big the cluster is, how much memory is allocated, and so on.
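
The size-based rule of thumb above can be sketched as a small helper. This is only an illustration of the reasoning, not a benchmark result: the function name `suggest_engine` and its return labels are made up for this example, and the 100 GB and 250 GB cutoffs are simply the figures quoted above for Spark SQL 1.4.x, which will shift with cluster size and memory.

```python
# Cutoffs quoted in the post above (Spark SQL 1.4.x tests). They are
# illustrative assumptions, not tuned values; cluster size and memory move them.
SMALL_DATASET_GB = 100    # below this, Spark SQL was reported to be faster
PARITY_DATASET_GB = 250   # around this size, execution times looked similar


def suggest_engine(dataset_gb: float) -> str:
    """Return 'spark', 'comparable', or 'tez' for a dataset size in GB.

    Hypothetical helper that encodes the rule of thumb from the post;
    it does not measure anything.
    """
    if dataset_gb < 0:
        raise ValueError("dataset_gb must be non-negative")
    if dataset_gb < SMALL_DATASET_GB:
        return "spark"        # small data: discovery, wrangling, quick tests
    if dataset_gb <= PARITY_DATASET_GB:
        return "comparable"   # the post reports similar run times here
    return "tez"              # large scans and BI workloads with SLAs
```

For example, `suggest_engine(20)` falls in the small-dataset range and `suggest_engine(1000)` in the range where Tez was reported to pull ahead. Anything in between is worth testing on your own cluster.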

In general, my personal opinion is that we shouldn't compare the two at this time, as each shines in a separate context. At some stage Tez might be needed, but Spark would probably be the better fit for smaller datasets. As I mentioned, this was based on Spark 1.4.x; I would love to re-run the tests, especially after the new cube functionality in Spark 1.5.

Hope this was helpful.

Re: Spark vs Tez?

anandmurari — Wed, 09 Dec 2015 22:44:55 GMT

Spark is a framework written in Scala, with rich API support for Python and Java as well. Scala is based on functional programming, and Spark is easy to use from applications written in Scala.

Re: Spark vs Tez?

dkumar1 — Mon, 14 Dec 2015 01:49:15 GMT

Spark is meant for application development. Tez is a library which is used by tools such as Hive to speed things up. Tez isn't suitable for end-user programming.