Which performs better, a MapReduce job or a map-only job, and why?
A MapReduce job normally has two phases: the map phase and the reduce phase. As the name suggests, a map-only job has just one phase, the map phase. With no reducers, the intermediate key-value pairs are not shuffled or sorted, no partitioner or combiner is needed, and no aggregation or summation of key-value pairs takes place, so the mapper output is written directly to HDFS. Because all of that work is skipped, a map-only job performs better than a full MapReduce job. It is not a general replacement, though: only jobs that need no aggregation across records, such as data parsing, can run as map-only jobs.
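
The difference can be illustrated with a small in-memory sketch. This is a simulation in plain Python, not Hadoop code: `parse_line`, `run_map_only`, `run_map_reduce` and the sample records are hypothetical names and data made up for illustration. In Hadoop itself, a job becomes map-only when the number of reduce tasks is set to zero, for example with `job.setNumReduceTasks(0)` in the driver.

```python
from collections import defaultdict


def parse_line(line):
    """Hypothetical mapper: parse a 'user,bytes' line into one (key, value) pair."""
    user, size = line.split(",")
    yield user.strip(), int(size)


def run_map_only(records, mapper):
    # Map-only: the mapper output is the final output.
    # Nothing is grouped, sorted, or aggregated.
    return [pair for record in records for pair in mapper(record)]


def run_map_reduce(records, mapper, reducer):
    # Map phase: same work as the map-only job.
    intermediate = [pair for record in records for pair in mapper(record)]

    # Shuffle: group every intermediate value under its key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Sort by key, then run the reducer once per key.
    return [(key, reducer(key, groups[key])) for key in sorted(groups)]


# Made-up sample input for the sketch.
records = ["alice, 10", "bob, 5", "alice, 7"]

map_only_output = run_map_only(records, parse_line)
map_reduce_output = run_map_reduce(records, parse_line, lambda key, values: sum(values))

print(map_only_output)
print(map_reduce_output)
```

The map-only run returns one parsed pair per input record, which is all a parsing job needs. The full run has to hold, group and sort every intermediate pair before it can produce a per-user total, and that extra step is the cost a map-only job avoids.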