[TEZ] are partition, sort and shuffle built-in ?

Hello,

While the concept of MapReduce is pretty clear in my mind, I can't say the same for Tez.

MapReduce performs its work through Map > Partition, Sort, Shuffle > Reduce, and I know each of these phases well.

But for Tez, and more precisely between two vertices (say a Map vertex and a Reduce vertex), how does it work?

Is there a built-in "partition, sort, shuffle" like in MR? Or is it up to us to manage this internal logic (I read a word count example and it seems so, but I'd prefer to be sure)?

Thanks!

Accepted Solutions

Re: [TEZ] are partition, sort and shuffle built-in ?

@Sebastien F

The background execution of Tez and MR has many similarities. The difference lies in where the data is placed while it is being transformed. Tez models a job as a DAG of vertices, whereas MR is limited to a fixed map-then-reduce pipeline.

This link would answer your question. Hope it helps!!
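
To make the phases named in the question concrete, here is a minimal sketch in plain Python of what "partition, sort, shuffle" does between a map stage and a reduce stage. It is not Tez or Hadoop code: `map_phase`, `partition`, `run_job` and the CRC-based partitioner are all hypothetical names chosen for illustration. As far as I know, the Tez WordCount example gets this behavior by configuring the edge between its two vertices with `OrderedPartitionedKVEdgeConfig` and a `HashPartitioner` from the Tez runtime library rather than hand-writing it, but treat that as a pointer to check against the Tez documentation.

```python
import heapq
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    """Emit (word, 1) pairs for one input line."""
    return [(word.lower(), 1) for word in line.split()]


def partition(key, num_reducers):
    """Pick a reducer for a key. A deterministic hash (CRC32 here, an
    arbitrary choice) guarantees equal keys go to the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_job(splits, num_reducers=2):
    """Simulate map -> partition/sort -> shuffle/merge -> reduce in memory."""
    # Map side: each "mapper" partitions its output, then sorts each
    # partition by key, producing one sorted run per target reducer.
    mapper_outputs = []
    for split in splits:
        buckets = defaultdict(list)
        for line in split:
            for key, value in map_phase(line):
                buckets[partition(key, num_reducers)].append((key, value))
        mapper_outputs.append(
            {p: sorted(kvs, key=itemgetter(0)) for p, kvs in buckets.items()}
        )

    # Reduce side: each "reducer" fetches its partition from every mapper
    # (the shuffle), merges the sorted runs, and reduces each key group.
    results = {}
    for reducer in range(num_reducers):
        runs = [out.get(reducer, []) for out in mapper_outputs]
        merged = heapq.merge(*runs, key=itemgetter(0))
        for key, group in groupby(merged, key=itemgetter(0)):
            results[key] = sum(value for _, value in group)
    return results
```

For example, `run_job([["a b a"], ["b c"]])` treats each inner list as one mapper's input split and returns the counts a=2, b=2, c=1. The point of the sketch is only that sorting each partition on the map side lets the reduce side do a cheap merge instead of a full sort.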

Re: [TEZ] are partition, sort and shuffle built-in ?

Thanks @Bala Vignesh N V, it helps :)