
Reduce Slot Usage

New Contributor

Reduce slot usage never goes above 1. Is there a way to change this? We see map slot usage reach the maximum, but reduce slot usage always stays at 1.

Reduce Slot Usage: 1/256
1 REPLY

Re: Reduce Slot Usage

Master Guru
While the number of maps is driven by the amount of input you provide to a job, the number of reduces is something the job developer or script writer has to set themselves, according to the parallelism they need.

In MR jobs, the config property is typically "mapred.reduce.tasks", and the JobConf and Job APIs both provide a setNumReduceTasks method. The default value of this config is 1, which explains the behaviour you are seeing.
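To illustrate why a reduce count of 1 keeps usage at a single slot, here is a minimal standalone sketch in plain Java (no Hadoop dependency). It assumes the job uses the default hash partitioning, which assigns a key to partition (hashCode & Integer.MAX_VALUE) % numReduceTasks. The class name, method names and sample keys are made up for the illustration and are not part of any Hadoop API.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class: models which reduce partitions receive data
// for a given reduce count. It does not run a real MapReduce job.
class ReducePartitionSketch {

    // Assumed to mirror the default hash partitioner: clear the sign bit,
    // then take the remainder modulo the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Collects the distinct partitions (one per reduce task) that get keys.
    static Set<Integer> partitionsUsed(List<String> keys, int numReduceTasks) {
        Set<Integer> used = new TreeSet<>();
        for (String key : keys) {
            used.add(partitionFor(key, numReduceTasks));
        }
        return used;
    }

    public static void main(String[] args) {
        // Sample keys, invented for the illustration.
        List<String> keys = List.of("alpha", "bravo", "charlie", "delta", "echo", "foxtrot");

        // With one reduce task every key lands in partition 0,
        // so at most one reduce slot can ever be busy.
        System.out.println("reduces=1 -> partitions " + partitionsUsed(keys, 1));

        // With more reduce tasks the keys can spread over several partitions.
        System.out.println("reduces=4 -> partitions " + partitionsUsed(keys, 4));
    }
}
```

In a real job the reduce count would come from mapred.reduce.tasks or a setNumReduceTasks call rather than from a method argument as above. Raising it past 1 is what allows more than one reduce slot to be occupied, although how evenly the work spreads still depends on the keys.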

Script-wise, Hive respects the above-mentioned mapred.reduce.tasks for certain queries where it is legal for it to do so, and to a certain extent it may also determine the optimal number of reduces for a query by itself. Pig, on the other hand, offers a PARALLEL keyword that lets users manually override the number of reducers used in its job chain runs.