New Contributor
Posts: 6
Registered: ‎10-21-2013

Reduce Slot Usage

Reduce slot usage never goes above 1. Is there a way to change this? We see map slot usage reach the maximum, but reduce slot usage always stays at 1.


Posts: 1,903
Kudos: 435
Solutions: 305
Registered: ‎07-31-2013

Re: Reduce Slot Usage

While the number of maps is driven by the amount of input you provide to a job, the number of reduces is something the job developer or script writer has to set themselves, according to the parallelism they need.

In MR jobs, the config property is typically "mapred.reduce.tasks", and the APIs for JobConf and Job both carry a setNumReduceTasks method. The default for this config is 1, which explains the behaviour you are seeing.
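To see why a reduce count of 1 keeps only one reduce slot busy, here is a minimal, standalone sketch. It does not use the Hadoop libraries. It assumes the job relies on hash partitioning (non-negative key hash modulo the number of reduce tasks), and the class and method names are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Hadoop.
class ReducePartitionSketch {

    // Assumed hash-partitioning rule: non-negative hash modulo the reduce count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Collects the distinct partitions (reduce tasks) that receive at least one key.
    static Set<Integer> busyReducers(List<String> keys, int numReduceTasks) {
        Set<Integer> busy = new TreeSet<>();
        for (String key : keys) {
            busy.add(partitionFor(key, numReduceTasks));
        }
        return busy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        // With one reduce task, every key maps to partition 0.
        System.out.println("1 reduce task  -> " + busyReducers(keys, 1)); // prints [0]
        // With four reduce tasks, these keys spread over partitions 0..3.
        System.out.println("4 reduce tasks -> " + busyReducers(keys, 4)); // prints [0, 1, 2, 3]
    }
}
```

In a real job the count would come from the "mapred.reduce.tasks" property or a setNumReduceTasks call in the driver, not from a helper like this one. The sketch only shows that the number of reduce tasks caps how many reducers can ever receive data.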

Script-wise, Hive respects the above-mentioned mapred.reduce.tasks for certain queries where it is legal to do so, and to a certain extent it may also determine the optimal number of reduces for a query by itself. Pig, on the other hand, offers a PARALLEL keyword that lets users manually override the number of reducers used in its job chain runs.