Explorer
Posts: 6
Registered: ‎06-13-2015

Determining optimal number of reducers in Yarn.

Hi,

 

In MRv1, we had the following two configurable parameters to set the number of map and reduce slots per node:

 

mapred.tasktracker.map.tasks.maximum       
mapred.tasktracker.reduce.tasks.maximum

 

It was also advisable to have slightly more map slots than reduce slots. The ideal number of reducers for a MapReduce job would be equal to or greater than the number of reduce slots available in the cluster.

 

Please correct me if my understanding of MRv1 above is wrong.

 

In MRv2, we don't have the concept of slots anymore; instead, containers provide the memory and CPU required for map/reduce task execution.

 

So here is my question: how do I decide on the number of reducers for a MapReduce job in MRv2?

 

 

Thanks

 

 

Explorer
Posts: 15
Registered: ‎05-07-2015

Re: Determining optimal number of reducers in Yarn.

It depends on the configuration of your servers.

 

Please share the configuration, and I will help you work out the optimal settings.

Explorer
Posts: 6
Registered: ‎06-13-2015

Re: Determining optimal number of reducers in Yarn.

Hi,

 

Here are the details:

 

43 DataNodes

12 cores per node

96 GB memory per node

12 x 1 TB drives

 

Also, please help me understand how you arrive at the numbers.

 

Thanks

 

Explorer
Posts: 15
Registered: ‎05-07-2015

Re: Determining optimal number of reducers in Yarn.

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_yarn_tuning.html

 

This link should answer your questions.
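
To give a feel for the arithmetic, here is a rough Python sketch, not a definitive recipe. It uses the rule of thumb from the Apache Hadoop MapReduce tutorial that the number of reduces is about 0.95 or 1.75 times (nodes * maximum containers per node). The reserved memory (16 GB), reserved cores (2), and reduce container size (4 GB, 1 vcore) are illustrative assumptions on my part, not values from your cluster or from the linked guide; the function names are hypothetical.

```python
def containers_per_node(node_mem_gb, node_cores,
                        reserved_mem_gb, reserved_cores,
                        container_mem_gb, container_vcores):
    """Containers one node can run, limited by whichever of memory or CPU runs out first."""
    usable_mem_gb = node_mem_gb - reserved_mem_gb
    usable_cores = node_cores - reserved_cores
    by_memory = usable_mem_gb // container_mem_gb
    by_cpu = usable_cores // container_vcores
    return min(by_memory, by_cpu)


def reducer_estimates(nodes, per_node):
    """Apply the 0.95 and 1.75 factors to the cluster-wide container count."""
    total_containers = nodes * per_node
    single_wave = int(0.95 * total_containers)   # all reduces start at once
    multi_wave = int(1.75 * total_containers)    # faster nodes run a second wave
    return single_wave, multi_wave


if __name__ == "__main__":
    # Cluster figures from this thread: 43 nodes, 12 cores, 96 GB per node.
    # Reserves and container size below are ASSUMED for illustration only.
    per_node = containers_per_node(
        node_mem_gb=96, node_cores=12,
        reserved_mem_gb=16, reserved_cores=2,
        container_mem_gb=4, container_vcores=1,
    )
    low, high = reducer_estimates(nodes=43, per_node=per_node)
    print(f"containers per node: {per_node}")
    print(f"reducers: about {low} (single wave) or {high} (two waves)")
```

With these assumed values CPU is the limiting resource (10 containers per node), which gives roughly 408 or 752 reducers. The number is set per job through `mapreduce.job.reduces`, and the right value also depends on data volume and skew, so treat the result as a starting point for testing.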

 

Let me know if you are looking for something else.

 
