Support Questions

sebastien_frack · 12-27-2017

Hello,

While the concept of MapReduce is pretty clear in my mind, I can't say the same for Tez.

MapReduce performs its work through Map > Partition, Sort, Shuffle > Reduce, and I know each of these phases well...
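For reference, the MR pipeline described above can be sketched in plain Python. This is only a toy, single-process illustration and not Hadoop code; the function names (`map_phase`, `partition`, `shuffle_and_sort`, `reduce_phase`, `word_count`) are made up for this sketch, and the CRC32-based partitioner is an assumption standing in for a hash partitioner.

```python
from collections import defaultdict
from zlib import crc32


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def partition(key, num_reducers):
    """Partition: pick a reducer for a key with a deterministic hash."""
    return crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(pairs, num_reducers):
    """Shuffle + sort: route pairs to reducers, group by key, sort keys."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer receives its keys in sorted order, values grouped per key.
    return [sorted(bucket.items()) for bucket in buckets]


def reduce_phase(partitions):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for part in partitions for key, values in part}


def word_count(lines, num_reducers=2):
    return reduce_phase(shuffle_and_sort(map_phase(lines), num_reducers))


counts = word_count(["a b a", "b c"])
```

The point of the sketch is that in MR the middle step (partition, sort, shuffle) is supplied by the framework, and only the map and reduce functions are user code.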

But for Tez, and more precisely between two vertices (say a Map vertex and a Reduce vertex), how does it work?

Is there a built-in "partition, sort, shuffle" like in MR? Or is it up to us to manage this internal logic (I read a word count example and it seems it is, but I'd prefer to be sure)?

Thanks !

balavignesh_nag · 12-28-2017

@Sebastien F

The background execution of Tez and MR has many similarities. The difference lies in where the data is placed while it is being transformed. Tez uses a DAG to process the data, whereas MR doesn't.
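To illustrate the DAG point, here is a toy sketch in plain Python. It is not the Tez API: `run_dag`, the vertex names, and the way data is handed between vertices are all hypothetical and only show the shape of the idea. An MR job is always a fixed map-then-reduce pair, while a DAG can chain any number of stages in a single job.

```python
from collections import Counter
from graphlib import TopologicalSorter  # Python 3.9+


def run_dag(vertices, edges, source_data):
    """Run each vertex once all of its upstream vertices have finished.

    vertices: vertex name -> function taking a list of upstream outputs
    edges:    vertex name -> set of upstream vertex names
    """
    results = {}
    for name in TopologicalSorter(edges).static_order():
        upstream = edges.get(name, ())
        if upstream:
            inputs = [results[u] for u in sorted(upstream)]
        else:
            inputs = [source_data]  # root vertices read the job input
        results[name] = vertices[name](inputs)
    return results


def tokenize(inputs):
    return [word for line in inputs[0] for word in line.split()]


def count(inputs):
    return Counter(inputs[0])


def top_word(inputs):
    return inputs[0].most_common(1)


# Three chained stages in one "job": tokenize -> count -> top_word.
vertices = {"tokenize": tokenize, "count": count, "top_word": top_word}
edges = {"count": {"tokenize"}, "top_word": {"count"}}
results = run_dag(vertices, edges, ["a b a", "b a"])
```

In this sketch the third stage consumes the second stage's output directly, which with plain MR would typically mean a second job.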

This link would answer your question. Hope it helps!!

sebastien_frack · 01-01-2018

Thanks @Bala Vignesh N V, it helps 🙂
