I have developed a MapReduce job with the following setup:
- One input split, and thus one map task
- Map input key is Path and value is BytesWritable (one min log file)
- Map output key is Text and value is also Text (a record from the min file)
- The number of reducers is configured to 1
I know map outputs get sorted by the natural order of the key, but
what is the order of records that have the same key? Is it
first-in, first-out? Let's say the map emits output for the same key on a
FIFO basis; will that same order be preserved when the records reach the reducer?
Also, could you let me know whether this behavior is the same in Hadoop 1.x and
2.x?
Note: I have not implemented a secondary sort.
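
To make the question concrete, here is a minimal sketch in plain Java (not Hadoop code) of the behavior I am asking about. The class name, the `Record` type, and the sample keys and values are all hypothetical. It sorts records by key only, using `List.sort`, which is documented as a stable sort, so records with equal keys keep the order in which they were emitted. Whether Hadoop's sort and shuffle give the same guarantee is exactly what I do not know.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SameKeyOrderSketch {

    // Hypothetical stand-in for one map output record (Text key, Text value).
    static final class Record {
        final String key;
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    // Sorts by key only. List.sort is a stable sort, so records that share
    // a key keep their relative (emit) order. This models the behavior being
    // asked about; it is NOT a claim about what Hadoop actually does.
    static List<Record> sortByKeyOnly(List<Record> emitted) {
        List<Record> sorted = new ArrayList<>(emitted);
        sorted.sort(Comparator.comparing((Record r) -> r.key));
        return sorted;
    }

    public static void main(String[] args) {
        // Records in the order the (hypothetical) mapper emitted them.
        List<Record> emitted = new ArrayList<>();
        emitted.add(new Record("b", "line-1"));
        emitted.add(new Record("a", "line-2"));
        emitted.add(new Record("b", "line-3"));
        emitted.add(new Record("a", "line-4"));

        // Prints [a=line-2, a=line-4, b=line-1, b=line-3]
        System.out.println(sortByKeyOnly(emitted));
    }
}
```

In other words, for key "a" I would like the reducer to see line-2 before line-4, and for key "b" line-1 before line-3, without having to add the emit position to the key.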