How to do partitioning in MapReduce??


Re: How to do partitioning in MapReduce??

@Sakina MIrza Do you mean you want to know how to write a custom partitioner in a MapReduce program?

You can follow this link. https://hadooptutorial.wikispaces.com/Custom+partitioner

Also, kindly post your questions with at least some description of what you are looking for; this will help ensure you get the right answers.
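
For the custom partitioner mentioned above, here is a minimal sketch of what one might look like. To keep it compilable without Hadoop on the classpath, it declares a local stand-in for Hadoop's `org.apache.hadoop.mapreduce.Partitioner<KEY, VALUE>` with the same `getPartition(key, value, numPartitions)` signature. The class name `FirstLetterPartitioner` and the a-m / everything-else routing rule are hypothetical, chosen only for illustration.

```java
// Local stand-in for org.apache.hadoop.mapreduce.Partitioner<KEY, VALUE>,
// declared here only so the sketch compiles without Hadoop (assumption).
abstract class Partitioner<K, V> {
    public abstract int getPartition(K key, V value, int numPartitions);
}

// Hypothetical custom partitioner: keys starting with a-m go to partition 0,
// all other keys go to partition 1.
class FirstLetterPartitioner extends Partitioner<String, Integer> {
    @Override
    public int getPartition(String key, Integer value, int numPartitions) {
        if (key.isEmpty()) {
            return 0; // arbitrary choice for empty keys
        }
        char first = Character.toLowerCase(key.charAt(0));
        int bucket = (first >= 'a' && first <= 'm') ? 0 : 1;
        // Fold into the valid range in case the job runs with a single reducer.
        return bucket % numPartitions;
    }
}
```

In a real job the class would extend Hadoop's own `Partitioner` (with types such as `Text` and `IntWritable` instead of `String` and `Integer`) and be registered on the job with `job.setPartitionerClass(FirstLetterPartitioner.class)` together with `job.setNumReduceTasks(2)`.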

Re: How to do partitioning in MapReduce??


The Partitioner controls how the keys of the intermediate map output are partitioned. The key (or a subset of the key) is used to derive the partition, typically by a hash function. Each mapper's output is partitioned according to the key, so records with the same key go into the same partition (within each mapper), and each partition is then sent to a reducer. The Partitioner class determines which partition a given (key, value) pair goes to. The partition phase takes place after the map phase and before the reduce phase.
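
As a rough illustration of the hash-based rule described above, the sketch below applies the same formula that Hadoop's default `HashPartitioner` uses: mask off the sign bit of the key's hash code, then take the remainder modulo the number of reduce tasks. The class name `HashPartitionSketch` and the sample keys are made up for this example; it uses plain Java objects rather than Hadoop `Writable` types.

```java
class HashPartitionSketch {
    // Same formula as Hadoop's HashPartitioner: the bit mask keeps the result
    // non-negative even when hashCode() is negative.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "apple", "cherry"};
        for (String key : keys) {
            // Both "apple" records print the same partition number.
            System.out.println(key + " -> partition " + partitionFor(key, 3));
        }
    }
}
```

Because the partition depends only on the key and the number of reducers, every mapper sends a given key to the same reducer without any coordination between mappers.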

A MapReduce job takes an input data set and splits it; each map task processes one split and outputs a list of key-value pairs. The output of the map phase is then sent to the reduce tasks, which apply the user-defined reduce function to the map outputs. Before the reduce phase, however, the map output is partitioned on the basis of the key and sorted.
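
The flow described above (map, then partition and sort, then reduce) can be imitated in a few lines of plain Java. This is a toy, single-process word count and not how Hadoop itself is implemented; the class `MiniMapReduce` and its method names are hypothetical, and a `TreeMap` per reducer stands in for the sort and grouping step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MiniMapReduce {
    // Hash partitioning: the same key always maps to the same reducer index.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Returns one sorted word-count map per reducer.
    static List<Map<String, Integer>> wordCount(List<String> splits, int numReducers) {
        // One bucket per reducer; TreeMap keeps keys sorted and groups values by key.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }

        // Map phase: each split emits (word, 1) pairs, routed by the partitioner.
        for (String split : splits) {
            for (String word : split.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                partitions.get(partitionFor(word, numReducers))
                          .computeIfAbsent(word, k -> new ArrayList<>())
                          .add(1);
            }
        }

        // Reduce phase: each reducer sums the values for every key in its partition.
        List<Map<String, Integer>> outputs = new ArrayList<>();
        for (TreeMap<String, List<Integer>> partition : partitions) {
            Map<String, Integer> reduced = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> entry : partition.entrySet()) {
                int sum = 0;
                for (int one : entry.getValue()) {
                    sum += one;
                }
                reduced.put(entry.getKey(), sum);
            }
            outputs.add(reduced);
        }
        return outputs;
    }
}
```

Calling `MiniMapReduce.wordCount(List.of("a b a", "b c a"), 2)` returns two maps, one per reducer; each word appears in exactly one of them, with its total count across both splits.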

To learn more about partitioning: Partition in MapReduce
