Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Hadoop MapReduce sorting order

New Member

I have developed a MapReduce job with the following setup:

- One input split, and thus one map task
- Map input key is Path and value is BytesWritable (one min log file)
- Map output key is Text and value is also Text (record from the min file)
- The number of reducers is configured to 1

I know map outputs are sorted by the natural order of the key, but what is the order of records that have the same key? Is it first-in-first-out? Say the map emits output for the same key on a FIFO basis: is that order preserved when the records reach the reducer? Also, can you let me know whether this behavior is the same in Hadoop 1.X and 2.X?

Note: I have not implemented secondary sort.

1 ACCEPTED SOLUTION

Super Guru

For records with an equal key, the order is effectively random (it is not guaranteed); otherwise, records are sorted by key as a String. The behavior is the same in Hadoop 1.X and 2.X.
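If the emit order for equal keys matters, one common workaround is to make it part of the sort itself, which is the idea behind the secondary sort mentioned in the question. The sketch below is plain Java with no Hadoop classes, only to illustrate the ordering logic. The `Rec` class, the `seq` emit counter, and the `valuesInOrder` helper are hypothetical names invented for this illustration; they are not part of any Hadoop API, and this is not necessarily how the original poster's job should be structured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: simulates map output records outside Hadoop.
final class EmitOrderSketch {

    // One map output record. `seq` is an assumed counter that the mapper
    // would increment on every emit; it does not exist in Hadoop itself.
    static final class Rec {
        final String key;
        final long seq;
        final String value;

        Rec(String key, long seq, String value) {
            this.key = key;
            this.seq = seq;
            this.value = value;
        }
    }

    // Compare by key first, then by emit sequence. Because ties on the key
    // are broken explicitly, the result no longer depends on whether the
    // underlying sort happens to preserve insertion order.
    static final Comparator<Rec> BY_KEY_THEN_SEQ =
            Comparator.comparing((Rec r) -> r.key).thenComparingLong(r -> r.seq);

    // Returns "key=value" strings in the order a reducer would see them
    // if records were sorted with the composite comparator above.
    static List<String> valuesInOrder(List<Rec> records) {
        List<Rec> sorted = new ArrayList<>(records);
        sorted.sort(BY_KEY_THEN_SEQ);
        List<String> out = new ArrayList<>();
        for (Rec r : sorted) {
            out.add(r.key + "=" + r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> emitted = Arrays.asList(
                new Rec("b", 0, "first-b"),
                new Rec("a", 1, "first-a"),
                new Rec("b", 2, "second-b"),
                new Rec("a", 3, "second-a"));
        // Prints [a=first-a, a=second-a, b=first-b, b=second-b]
        System.out.println(valuesInOrder(emitted));
    }
}
```

In an actual job, the same idea would likely mean a composite map output key plus a partitioner and grouping comparator that look only at the natural key, so that all values for one key still reach a single reduce call.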
