
How to do partitioning in MapReduce?

 
2 REPLIES

@Sakina MIrza Do you mean you want to know how to write a custom partitioner in a MapReduce program?

You can follow this link: https://hadooptutorial.wikispaces.com/Custom+partitioner

Also, kindly post your questions with at least some description of what you are looking for. This will help ensure you get the right answers.

The Partitioner controls how the keys of the intermediate map output are partitioned. The partition is derived from the key (or a subset of the key), typically by a hash function. Within each mapper, output records are partitioned by key, so records with the same key always go into the same partition, and each partition is then sent to one reducer. The number of partitions is therefore equal to the number of reduce tasks. The Partitioner class determines which partition a given (key, value) pair goes to. The partition phase takes place after the map phase and before the reduce phase.
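To make the hash-based rule concrete, here is a minimal plain-Java sketch with no Hadoop dependency. The `partitionFor` method mirrors the formula used by Hadoop's default `HashPartitioner`. The `firstLetterPartition` method is only an illustration of deriving the partition from a subset of the key. The class and method names are hypothetical, chosen for this example.

```java
// Plain-Java sketch of partition selection; names here are hypothetical.
class PartitionSketch {

    // Same formula as Hadoop's default HashPartitioner: clear the sign bit
    // so the value is non-negative, then take the remainder by the number
    // of reduce tasks. Equal keys always yield the same partition.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Illustrative custom rule (an assumption, not from the thread):
    // use only a subset of the key, here its first letter.
    static int firstLetterPartition(String key, int numReduceTasks) {
        if (key.isEmpty()) {
            return 0; // send empty keys to the first partition
        }
        char first = Character.toLowerCase(key.charAt(0));
        return first % numReduceTasks; // char is unsigned, so never negative
    }

    public static void main(String[] args) {
        int reducers = 3;
        String[] keys = {"apple", "banana", "apple", "cherry", "banana"};
        for (String key : keys) {
            System.out.println(key
                    + " -> hash partition " + partitionFor(key, reducers)
                    + ", first-letter partition " + firstLetterPartition(key, reducers));
        }
    }
}
```

In a real job, the same logic would normally live in a class that extends `org.apache.hadoop.mapreduce.Partitioner` and overrides `getPartition(key, value, numPartitions)`, registered with `job.setPartitionerClass(...)`.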

A MapReduce job takes an input data set and splits it. In the map phase, each map task processes one split and outputs a list of key-value pairs. The output of the map phase is then sent to the reduce tasks, which apply the user-defined reduce function to it. Before the reduce phase, however, the map output is partitioned on the basis of the key and sorted.
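The flow described here (map, then partition and sort, then reduce) can be sketched as a small in-memory word count in plain Java. This is only a simulation under simplifying assumptions: everything runs in one process, a `TreeMap` stands in for the sort step, and the names `MiniMapReduce`, `map`, `partitionFor` and `run` are hypothetical rather than Hadoop APIs.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory simulation of map -> partition/sort -> reduce; not Hadoop code.
class MiniMapReduce {

    // Map phase: emit (word, 1) for every word in one input split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Partition step: hash of the key modulo the number of reducers.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Runs the whole flow and returns one sorted (word -> count) map per reducer.
    static List<TreeMap<String, Integer>> run(List<String> splits, int numReducers) {
        // One bucket per reducer; TreeMap keeps the keys sorted.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }

        // Each split is mapped, and every output pair is routed by its key.
        for (String split : splits) {
            for (Map.Entry<String, Integer> kv : map(split)) {
                int p = partitionFor(kv.getKey(), numReducers);
                partitions.get(p)
                        .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                        .add(kv.getValue());
            }
        }

        // Reduce phase: each reducer sums the values of the keys it received.
        List<TreeMap<String, Integer>> results = new ArrayList<>();
        for (TreeMap<String, List<Integer>> partition : partitions) {
            TreeMap<String, Integer> reduced = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : partition.entrySet()) {
                int sum = 0;
                for (int v : e.getValue()) {
                    sum += v;
                }
                reduced.put(e.getKey(), sum);
            }
            results.add(reduced);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("a b a", "b c");
        List<TreeMap<String, Integer>> results = run(splits, 2);
        for (int i = 0; i < results.size(); i++) {
            System.out.println("reducer " + i + ": " + results.get(i));
        }
    }
}
```

Because the partition depends only on the key, all values for a given word end up at the same reducer even though they were produced by different splits.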

For more detail about partitioning, see: Partition in MapReduce