09-24-2015 04:10 AM
In Hadoop: The Definitive Guide, it is mentioned that
"When all the map outputs have been copied, the reduce task moves into the sort phase (which should properly be called the merge phase, as the sorting was carried out on the map side)".
Does this mean that no sorting is done during the sort phase? The reducer receives map output partitions from different mappers, and taken together these are not fully sorted; each one is sorted only within its own partition.
However, the statement above makes it sound as though no sorting happens in the merge (sort) phase of the reduce-side shuffle and sort.
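To make the question concrete, here is a minimal Python sketch of how I read the quoted statement. It is not Hadoop code: the sample data and the function name `merge_sorted_runs` are made up for illustration. Each inner list stands for the already-sorted output that one mapper produced for a single reduce partition, and the merge combines them into one globally sorted stream without re-sorting from scratch.

```python
import heapq

# Hypothetical data: one sorted run per mapper, all for the same reduce partition.
# Each run is sorted by key on its own, but the runs are not sorted relative
# to one another.
map_outputs = [
    [("apple", 1), ("cat", 1), ("dog", 1)],
    [("apple", 1), ("bat", 1), ("dog", 1)],
    [("bat", 1), ("cat", 1)],
]

def merge_sorted_runs(runs):
    """K-way merge of already-sorted runs into a single sorted list.

    Only the current head of each run is compared, so this is a merge,
    not a full sort of all records.
    """
    return list(heapq.merge(*runs, key=lambda kv: kv[0]))

merged = merge_sorted_runs(map_outputs)
for key, value in merged:
    print(key, value)
```

If this reading is right, the result is fully ordered by key even though nothing on the reduce side sorts the records from scratch; the ordering comes from merging runs that were sorted on the map side.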
09-24-2015 05:35 AM