Hadoop MapReduce sorting order

New Contributor

I have developed a MapReduce job with the following setup:
- One input split, and thus one map task
- Map input key is Path and value is BytesWritable (one min log file)
- Map output key is Text and value is also Text (a record from the min file)
- The number of reducers is configured to 1

I know map outputs get sorted based on the natural order of the key, but what will the order be for records that have the same key? Will it be First-In-First-Out? Let's say the map emits output for the same key on a FIFO basis: will that same order be preserved when the records reach the reducer? Also, can you let me know whether this behavior is the same in Hadoop 1.X and 2.X?

Note: I have not implemented Secondary sort.

1 ACCEPTED SOLUTION

Super Guru

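For records with an equal key, the order in which values reach the reducer is effectively random, so FIFO order is not guaranteed. Records with different keys are sorted by key as a String. The behavior is the same in Hadoop 1.X and 2.X.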
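
If emit order matters, one common workaround is to make the order part of the key, which is the idea behind secondary sort. Below is a minimal sketch in plain Java with no Hadoop dependency. The names `EmitOrderSketch`, `Rec`, `seq`, and `valuesInEmitOrder` are hypothetical and only for illustration; in a real job the composite key would be a custom WritableComparable, paired with a grouping comparator so that all records for one original key still reach the same reduce call.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class EmitOrderSketch {

    // Hypothetical composite key: the original map output key plus a counter
    // that the mapper would increment on every emit.
    static final class Rec {
        final String key;
        final long seq;
        final String value;

        Rec(String key, long seq, String value) {
            this.key = key;
            this.seq = seq;
            this.value = value;
        }
    }

    // Compare by key first, then by emit sequence. Note: String.compareTo is
    // used here for simplicity; Hadoop's Text compares raw UTF-8 bytes, which
    // can order some non-ASCII keys differently.
    static final Comparator<Rec> KEY_THEN_SEQ =
            Comparator.comparing((Rec r) -> r.key).thenComparingLong(r -> r.seq);

    // Simulates the sort phase: whatever order the records arrive in, sorting
    // on the composite key restores emit order within each key.
    static List<String> valuesInEmitOrder(List<Rec> shuffled) {
        List<Rec> sorted = new ArrayList<>(shuffled);
        sorted.sort(KEY_THEN_SEQ);
        List<String> out = new ArrayList<>();
        for (Rec r : sorted) {
            out.add(r.key + ":" + r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Records deliberately listed out of emit order.
        List<Rec> shuffled = List.of(
                new Rec("b", 2, "third"),
                new Rec("a", 1, "second"),
                new Rec("a", 0, "first"));
        System.out.println(valuesInEmitOrder(shuffled));
        // prints [a:first, a:second, b:third]
    }
}
```

With a single map task, as in the question, a simple per-mapper counter is enough to define the order. With several map tasks the counter alone would not be unique across mappers, so something like a task identifier would also have to go into the composite key.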
