[TEZ] Are partition, sort and shuffle built in?

Hello,

While the concept of MapReduce is pretty clear in my mind, I can't say the same for Tez.

MapReduce performs its work through Map > Partition, Sort, Shuffle > Reduce, and I know each of these phases well.
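To be concrete about the phases meant here, this is a minimal sketch in plain Python, not Hadoop or Tez API code. All function names are hypothetical, and the sort and shuffle steps are collapsed into one in-process function, whereas a real cluster moves the data over the network between them.

```python
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def partition(pairs, num_reducers):
    """Partition: route each key to exactly one reducer by hashing the key."""
    buckets = defaultdict(list)
    for key, value in pairs:
        # crc32 is stable across runs, unlike the built-in hash() on strings.
        reducer_id = zlib.crc32(key.encode("utf-8")) % num_reducers
        buckets[reducer_id].append((key, value))
    return buckets


def sort_and_shuffle(buckets):
    """Sort each partition by key, then group the values sharing a key."""
    shuffled = {}
    for reducer_id, pairs in buckets.items():
        ordered = sorted(pairs, key=itemgetter(0))
        shuffled[reducer_id] = [
            (key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))
        ]
    return shuffled


def reduce_phase(shuffled):
    """Reduce: sum the grouped values for each key."""
    counts = {}
    for grouped in shuffled.values():
        for key, values in grouped:
            counts[key] = sum(values)
    return counts


def word_count(lines, num_reducers=2):
    """Chain the phases: Map > Partition, Sort, Shuffle > Reduce."""
    pairs = map_phase(lines)
    return reduce_phase(sort_and_shuffle(partition(pairs, num_reducers)))
```

Calling `word_count(["to be or not to be"])` counts "to" and "be" twice and "or" and "not" once, whatever the number of reducers, because each key always lands in a single partition.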

But for Tez, and more precisely between two vertices (say a Map vertex and a Reduce vertex), how does it work?

Is there a built-in "partition, sort, shuffle" like in MR? Or is it up to us to manage this internal logic? (I read a word count example and it seems so, but I would rather be sure.)

Thanks!

1 ACCEPTED SOLUTION

@Sebastien F

The background execution of Tez and MR has many similarities. The difference lies in where the data is placed while it is being transformed. Tez uses a DAG to process the data, whereas MR does not.

This link would answer your question. Hope it helps!!

Thanks @Bala Vignesh N V, it helps 🙂